author: Keno Fischer <kfischer+github@college.harvard.edu> 2016-06-28 16:04:25 -0400
committer: Steven G. Johnson <stevenj@mit.edu> 2016-06-28 16:04:25 -0400
commit: 41c6b23aab330d019789bf1fbb870c7e74e703bf
tree: 15c109853d5d5dba78b0002897368501b94d2fc0 /lump.md
parent: 3d0576a9b9669a6e9fd170ffba3d3838d46986df
Unicode 9 updates (#70)
* Updates for Unicode 9.0.0 TR29 changes
  - New rules GB10 and GB12/GB13 are used, respectively, to combine
    emoji-ZWJ sequences and to force grapheme breaks every two RI
    (regional indicator) codepoints. Unfortunately, this breaks the
    statelessness of grapheme-boundary determination. Deal with this by
    ignoring the problem in utf8proc_grapheme_break, and by hacking in
    a special case in decompose.
  - ZWJ moved to its own boundclass; update what is now GB9 accordingly.
  - Add comments to indicate which rule a given case implements.
  - The number of bound classes now exceeds what fits in 4 bits; expand
    the field to 8 bits and reorganize the fields.
* Import Unicode 9 data
* Update grapheme break API to expose state override
* Bump MAJOR version
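
To illustrate why the regional-indicator rules cannot be decided from two adjacent codepoints alone, here is a minimal sketch in C. It is not the utf8proc implementation: `is_regional_indicator` and `count_clusters_ri_only` are hypothetical helpers written for this note, and the sketch deliberately ignores every grapheme rule except "break after every second RI", so any other pair of codepoints is treated as a break.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Regional Indicator symbols occupy U+1F1E6..U+1F1FF. */
static bool is_regional_indicator(int32_t cp) {
    return cp >= 0x1F1E6 && cp <= 0x1F1FF;
}

/*
 * Hypothetical helper (not part of utf8proc): counts grapheme clusters
 * while applying ONLY the RI pairing rule. Every other boundary is
 * treated as a break, which is a simplification for illustration.
 *
 * The `unpaired_ri` flag is the extra state: for the pair (RI, RI) the
 * answer depends on whether the first RI already closed an earlier pair,
 * which the two codepoints by themselves cannot tell us.
 */
static size_t count_clusters_ri_only(const int32_t *cps, size_t n) {
    if (n == 0) {
        return 0;
    }
    size_t clusters = 1;
    /* True when the previous codepoint is an RI still waiting for a partner. */
    bool unpaired_ri = is_regional_indicator(cps[0]);
    for (size_t i = 1; i < n; i++) {
        bool ri = is_regional_indicator(cps[i]);
        if (ri && unpaired_ri) {
            /* Second half of a pair: no break, and the pair is now complete. */
            unpaired_ri = false;
        } else {
            /* Break here; a fresh RI starts waiting for its partner. */
            clusters++;
            unpaired_ri = ri;
        }
    }
    return clusters;
}
```

With this sketch, four consecutive RI codepoints form two clusters and three form two (a pair plus a lone RI). A stateless check that only sees `(RI, RI)` would have to answer the same way at every position, so it would either glue all four together or split them all apart.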
Diffstat (limited to 'lump.md')
0 files changed, 0 insertions, 0 deletions