Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | Build with core netsurf buildsystem [release/2.2.0-1] | Vincent Sanders | 2018-08-23 | 1 | -771/+0 |
| | |||||
* | doc fixes, don't export stdint and limits.h values UINT16_MAX and SSIZE_MAX | Steven G. Johnson | 2018-07-24 | 1 | -0/+8 |
| | |||||
* | Merge branch 'master' of https://github.com/JuliaLang/utf8proc | Steven G. Johnson | 2018-07-24 | 1 | -12/+11 |
|\ | |||||
| * | update data and algorithms for Unicode 11 (#140) | Steven G. Johnson | 2018-07-24 | 1 | -12/+11 |
| | | |||||
* | | copyright year updates | Steven G. Johnson | 2018-07-24 | 1 | -1/+1 |
|/ | |||||
* | Case folding fixes (#133) | Steven G. Johnson | 2018-05-02 | 1 | -0/+10 |
| | * Fixes allowing for “Full” folding and NFKC_CaseFold compliance. | ||||
| | * Only include C (Common) and F (Full) foldings from CaseFolding.txt. Removed S (Simple) since F & S are specified to be exclusive. | ||||
| | * Extend UTF8PROC_IGNORE to also ignore unassigned codepoints (such as \u2065) which are specified as being discarded by NFKC_CF. | ||||
| | * Document the changes to UTF8PROC_IGNORE in header. | ||||
| | * Add NFKC_CF helper function with documentation. | ||||
| | * restore old IGNORE behavior, add UTF8PROC_STRIPNA, rename to utf8proc_NFKC_Casefold, add a test | ||||
| | * success message | ||||
| | * test that IGNORE does not strip NA | ||||
| | * data update | ||||
| | * NFKC_Casefold shouldn't strip NA | ||||
* | possible fix for #128 (#129) | Benito van der Zander | 2018-04-27 | 1 | -3/+3 |
| | | | Does this help? I do not really remember what I wrote back then | ||||
* | Fix declaration-after-statement warning when compiling in strict C90 mode. (#113) | Branko Čibej | 2017-09-21 | 1 | -3/+2 |
| | |||||
* | new utf8proc_map_custom for hooking in user-defined custom mappings (#89) | Steven G. Johnson | 2016-11-30 | 1 | -4/+20 |
| | * new utf8proc_map_custom for hooking in user-defined custom mappings | ||||
| | * whoops, add test program | ||||
| | * NEWS, version bump for 2.1 | ||||
| | * change test functions to static so that gcc doesn't complain about missing prototypes | ||||
* | silence MSVC warning about conversion to uint8 (fix #86) | Steven G. Johnson | 2016-11-30 | 1 | -23/+22 |
| | |||||
* | Tlsa/ucs4 normalize (#88) | Michael Drake | 2016-11-21 | 1 | -3/+10 |
| | * Split codepoint sequence normalisation out into separate function. | ||||
| | This creates utf8proc_normalize_utf32() which takes and returns a UTF-32 string, applying the following options: | ||||
| | - UTF8PROC_NLF2LS | ||||
| | - UTF8PROC_NLF2PS | ||||
| | - UTF8PROC_NLF2LF | ||||
| | - UTF8PROC_STRIPCC | ||||
| | - UTF8PROC_COMPOSE | ||||
| | - UTF8PROC_STABLE | ||||
| | The utf8proc_reencode() function has been updated to call the new utf8proc_normalize_utf32(). | ||||
| | * Update code documentation: utf8proc_reencode handles UTF8PROC_CHARBOUND. | ||||
* | Fix incorrect use of `lbc` instead of `lbc_override` (#77) | Keno Fischer | 2016-07-13 | 1 | -1/+1 |
| | |||||
* | Walk back ABI breaking changes (#76) | Keno Fischer | 2016-07-13 | 1 | -1/+7 |
| | |||||
* | Smaller tables (#68) | Benito van der Zander | 2016-07-12 | 1 | -37/+70 |
| | * convert sequences to utf-16 (saves 25kb) | ||||
| | * store sequence length in properties instead using -1 termination (saves 10kb) | ||||
| | * cache index for slightly faster data creation | ||||
| | * store lower/upper/title mapping in sequence array (saves 25kb). Add utf8proc_totitle, as title_mapping cannot be used to get the title codepoint anymore. Rename xxx_mapping to xxx_seqindex, so programs assuming a value with the old meaning fail at compile time | ||||
| | * change combination array data type to uint16 (saves 40kb) | ||||
| | * merge 1st and 2nd comb index (saves 50kb) | ||||
| | * kill empty prefix/suffix in combination array (saves 50kb) | ||||
| | * there was no need to have a separate combination start array, it can be merged in a single array | ||||
| | * some fixes | ||||
| | * mark the table as const again | ||||
| | * and regen | ||||
* | Unicode 9 updates (#70) | Keno Fischer | 2016-06-28 | 1 | -30/+80 |
| | * Updates for Unicode 9.0.0 TR29 Changes | ||||
| | - New rules GB10/(12/13) are used to combine emoji-zwj sequences/ (force grapheme breaks every two RI codepoints). Unfortunately this breaks statelessness of grapheme-boundary determination. Deal with this by ignoring the problem in utf8proc_grapheme_break, and by hacking in a special case in decompose | ||||
| | - ZWJ moved to its own boundclass, update what is now GB9 accordingly. | ||||
| | - Add comments to indicate which rule a given case implements | ||||
| | - The Number of bound classes Now exceeds 4 bits, expand to 8 and reorganize fields | ||||
| | * Import Unicode 9 data | ||||
| | * Update Grapheme break API to expose state override | ||||
| | * Bump MAJOR version | ||||
* | Fix overrun | Michaël Meyer | 2016-02-04 | 1 | -1/+1 |
| | |||||
* | Reduce the size of the binary. | Michaël Meyer | 2015-12-09 | 1 | -4/+4 |
| | | | | | Use integers instead of pointers in Unicode tables. Saves 226 kb / 716 kb in the compiled library. | ||||
* | Silence warning with -Wextra | Federico G. Schwindt | 2015-11-24 | 1 | -1/+2 |
| | | | | Fixes #60. | ||||
* | update copyright statements to list recent contributors and year | Steven G. Johnson | 2015-11-01 | 1 | -0/+1 |
| | |||||
* | Do not export internal unsafe_encode_char() | Peter Colberg | 2015-10-29 | 1 | -1/+1 |
| | |||||
* | add toupper/tolower functions (for JuliaLang/julia#11471) | Steven G. Johnson | 2015-05-29 | 1 | -0/+12 |
| | |||||
* | Fix #34 handle 66 Unicode non-characters, also improve performance and surrogate handling | Scott Paul Jones | 2015-05-29 | 1 | -52/+92 |
| | |||||
* | Prefix other C99 typedefs with utf8proc_ | Tony Kelman | 2015-04-06 | 1 | -49/+49 |
| | |||||
* | Use a new typedef utf8proc_ssize_t to avoid define collisions | Tony Kelman | 2015-04-05 | 1 | -23/+23 |
| | | | | with MSVC | ||||
* | rename DLLEXPORT to UTF8PROC_DLLEXPORT to prevent conflicts with other header files that define DLLEXPORT | Steven G. Johnson | 2015-03-30 | 1 | -19/+19 |
| | |||||
* | updated NEWS etc. for 1.2 release [v1.2] | Steven G. Johnson | 2015-03-28 | 1 | -1/+1 |
| | |||||
* | indentation consistency | Steven G. Johnson | 2015-03-27 | 1 | -30/+31 |
| | |||||
* | put the API version as #defines in the header file (as discussed in #30) | Steven G. Johnson | 2015-03-27 | 1 | -1/+3 |
| | |||||
* | Fix #26: use doxygen for generating API docs | Jonas Fonseca | 2015-03-21 | 1 | -5/+5 |
| | |||||
* | remove requirement that get_property and decompose_char argument be in range 0x0 to 0x10ffff | Steven G. Johnson | 2015-03-12 | 1 | -10/+14 |
| | |||||
* | fix #2: add charwidth function | Steven G. Johnson | 2015-03-12 | 1 | -0/+15 |
| | |||||
* | rename back to utf8proc now that we are taking over maintenance | Steven G. Johnson | 2015-03-06 | 1 | -7/+5 |
| | |||||
* | update graphemes for Unicode 7, add utf8proc_grapheme_break function | Steven G. Johnson | 2014-12-12 | 1 | -55/+34 |
| | |||||
* | utf8proc_version should return a different version string than utf8proc | Steven G. Johnson | 2014-12-12 | 1 | -2/+6 |
| | |||||
* | Merge pull request #17 from JuliaLang/tk/dllexport | Steven G. Johnson | 2014-09-24 | 1 | -15/+15 |
|\ | | | | | RFC: add DLLEXPORT to utf8proc_get_property | ||||
| * | add DLLEXPORT to all functions in mojibake.h | Tony Kelman | 2014-09-22 | 1 | -15/+15 |
| | | |||||
* | | a few typofixes | Veres Lajos | 2014-08-12 | 1 | -2/+2 |
|/ | |||||
* | utf8proc.h -> mojibake.h (closes #10) | Steven G. Johnson | 2014-07-18 | 1 | -1/+1 |
| | |||||
* | C++/MSVC compatibility, indenting, for #4 | Steven G. Johnson | 2014-07-18 | 1 | -2/+2 |
| | |||||
* | import of utf8proc-v1.1.6 [v1.1.6] | Steven G. Johnson | 2014-07-15 | 1 | -0/+587 |