libutf8proc.git - UTF8 Processing library (import)

	Commit message	Author	Age	Files	Lines
*	Merge branch 'master' of https://github.com/JuliaLang/utf8proc	Steven G. Johnson	2018-07-24	2	-4/+20
\|\
\| *	update data and algorithms for Unicode 11 (#140)	Steven G. Johnson	2018-07-24	2	-4/+20
\| \|
* \|	update copyright statement for data_generator	Steven G. Johnson	2018-07-24	1	-0/+2
\|/
*	charwidth=1 for soft hyphen and unassigned codepoints (#135)	Steven G. Johnson	2018-07-24	2	-13/+18
\|	* use width=1 for soft hyphen and for unassigned/PUA codepoints
\|	* don't count unassigned codepoints when comparing with system wcwidth
\|	* more tests
\|	* indentation fixes
\|	* NEWS for 135
\|	* remove special-casing for arabic control characters affecting a span of numbers, which are sometimes zero-width and sometimes not
\|	* regenerate
*	uppercase mapping ß (U+00df) to ẞ (U+1E9E) (#134)	Steven G. Johnson	2018-05-02	1	-13/+13
\|	* uppercase(0x00df) = 0x1e9e
\|	* tests for titlecase and u+00df uppercase
\|	* NEWS, another test
*	Case folding fixes (#133)	Steven G. Johnson	2018-05-02	1	-1/+1
\|	* Fixes allowing for “Full” folding and NFKC_CaseFold compliance.
\|	* Only include C (Common) and F (Full) foldings from CaseFolding.txt. Removed S (Simple) since F & S are specified to be exclusive.
\|	* Extend UTF8PROC_IGNORE to also ignore unassigned codepoints (such as \u2065) which are specified as being discarded by NFKC_CF.
\|	* Document the changes to UTF8PROC_IGNORE in header.
\|	* Add NFKC_CF helper function with documentation.
\|	* restore old IGNORE behavior, add UTF8PROC_STRIPNA, rename to utf8proc_NFKC_Casefold, add a test
\|	* success message
\|	* test that IGNORE does not strip NA
\|	* data update
\|	* NFKC_Casefold shouldn't strip NA
*	update to unicode 10 (#132)	Steven G. Johnson	2018-04-27	2	-14/+3
\|
*	Ensure generated const data tables are hidden via "static" (#100)	Paul Smith	2017-02-19	1	-5/+5
\|
*	Don't use cached version of UnicodeData.txt (#92)	Michael Hatherly	2017-01-03	1	-1/+1
\|	Ref: https://github.com/JuliaLang/julia/pull/19725, UnicodeData.txt is now being cached in JuliaLang/julia's build.
*	update to unifont 9.0.04	Steven G. Johnson	2016-12-11	1	-1/+1
\|
*	Use versioned Unicode data URLs (#78)	petercolberg	2016-07-13	1	-8/+11
\|	This ensures the tests keep working when a new Unicode version is released.
*	update to Unifont 9 (for Unicode 9 charwidths) (#75)	Steven G. Johnson	2016-07-12	1	-1/+1
\|
*	Smaller tables (#68)	Benito van der Zander	2016-07-12	1	-40/+123
\|	* convert sequences to utf-16 (saves 25kb)
\|	* store sequence length in properties instead of using -1 termination (saves 10kb)
\|	* cache index for slightly faster data creation
\|	* store lower/upper/title mapping in sequence array (saves 25kb). Add utf8proc_totitle, as title_mapping cannot be used to get the title codepoint anymore. Rename xxx_mapping to xxx_seqindex, so programs assuming a value with the old meaning fail at compile time
\|	* change combination array data type to uint16 (saves 40kb)
\|	* merge 1st and 2nd comb index (saves 50kb)
\|	* kill empty prefix/suffix in combination array (saves 50kb)
\|	* there was no need to have a separate combination start array, it can be merged into a single array
\|	* some fixes
\|	* mark the table as const again
\|	* and regen
*	Unicode 9 updates (#70)	Keno Fischer	2016-06-28	1	-3/+3
\|	* Updates for Unicode 9.0.0 TR29 Changes
\|	  - New rules GB10/(12/13) are used to combine emoji-zwj sequences/ (force grapheme breaks every two RI codepoints). Unfortunately this breaks statelessness of grapheme-boundary determination. Deal with this by ignoring the problem in utf8proc_grapheme_break, and by hacking in a special case in decompose
\|	  - ZWJ moved to its own boundclass, update what is now GB9 accordingly.
\|	  - Add comments to indicate which rule a given case implements
\|	  - The number of bound classes now exceeds 4 bits, expand to 8 and reorganize fields
\|	* Import Unicode 9 data
\|	* Update Grapheme break API to expose state override
\|	* Bump MAJOR version
*	Reduce the size of the binary.	Michaël Meyer	2015-12-09	1	-3/+4
\|	Use integers instead of pointers in Unicode tables. Saves 226 kb / 716 kb in the compiled library.
*	Fix deprecated warnings with Julia 0.4	Peter Colberg	2015-10-31	1	-6/+12
\|
*	Add missing files to `make clean`	Peter Colberg	2015-10-30	1	-1/+2
\|
*	Set URLCACHE to JuliaLang cache server for Travis builds	Peter Colberg	2015-10-29	1	-6/+2
\|	Download Unicode data from upstream server by default. Download GNU Unifont from reliable GNU mirror by default.
*	Generate fontforge font files in makefile	Peter Colberg	2015-10-29	2	-13/+10
\|	Revise the script to directly read fontforge font files, which are generated in the makefile. This permits overriding the fontforge path during the build, and executing fontforge in parallel with make -j.
\|	Avoid duplicating download URLs in the script, which ensures that the script itself works without network access, e.g., when downloading the data files on a developer machine with network access and executing the script on a build machine without network access.
*	Update Unifont to 8.0.01	Jiahao Chen	2015-06-29	1	-2/+2
\|
*	sort keys to try to eliminate data dependence on Ruby version	Steven G. Johnson	2015-06-25	1	-2/+2
\|
*	fix #46 (make sure symbol-like codepoints have nonzero width even if they aren't in Unifont)	Steven G. Johnson	2015-06-24	1	-12/+41
*	Minor fixes to work with Unicode 8.0.0 data	Jiahao Chen	2015-06-20	1	-2/+3
\|
*	Prefix other C99 typedefs with utf8proc_	Tony Kelman	2015-04-06	1	-4/+4
\|
*	fix #2: add charwidth function	Steven G. Johnson	2015-03-12	3	-4/+237
\|
*	directory cleanup: move tests and data into subdirectories	Steven G. Johnson	2015-03-06	1	-0/+317