|author||Steven G. Johnson <firstname.lastname@example.org>||2014-07-15 16:04:36 -0400|
|committer||Steven G. Johnson <email@example.com>||2014-07-15 16:04:36 -0400|
markdown and other cosmetic updates
Diffstat (limited to 'README.md')
1 files changed, 68 insertions, 0 deletions
diff --git a/README.md b/README.md
new file mode 100644
@@ -0,0 +1,68 @@
+== libutf8proc ==
+The [libutf8proc package](https://github.com/JuliaLang/libutf8proc) is
+a lightly updated fork of the [utf8proc
+library](http://www.public-software-group.org/utf8proc) from Jan
+Behrens and the rest of the [Public Software
+Group](http://www.public-software-group.org/), who deserve *nearly all
+of the credit* for this package: a small, clean C library that
+provides Unicode normalization, case-folding, and other operations for
+data in the [UTF-8 encoding](http://en.wikipedia.org/wiki/UTF-8).
+The reason for this fork is that utf8proc is used for basic Unicode
+support in the [Julia language](http://julialang.org/) and the Julia
+developers wanted Unicode 7 support and other features, but the
+Public Software Group currently does not seem to have the resources
+necessary to update utf8proc. We hope that the fork can be merged
+back into the mainline utf8proc package before too long.
+(The original utf8proc package also includes Ruby and PostgreSQL plug-ins.
+We removed those from libutf8proc in order to focus exclusively on the C
+library for the time being. We will strive to keep API changes to a minimum,
+so libutf8proc should still be usable with the old plug-in code.)
+Like utf8proc, the libutf8proc package is licensed under the
+free/open-source [MIT "expat"
+license](http://opensource.org/licenses/MIT) (plus certain Unicode
+data governed by the similarly permissive [Unicode data
+license](http://www.unicode.org/copyright.html#Exhibit1)); please see
+the included `LICENSE.md` file for more detailed information.
+=== Quick Start ===
+To compile the C library, run `make`.
+=== General Information ===
+The C library is found in this directory after successful compilation
+and is named `libutf8proc.a` (for the static library) and
+`libutf8proc.so` (for the dynamic library).
+The supported Unicode version is 5.0.0.
+*Note:* Version 4.1.0 of Unicode Standard Annex #29 was used, as
+version 5.0.0 was not yet available at the time of implementation.
+For Unicode normalizations, the following options are used:
+* Normalization Form C: `STABLE`, `COMPOSE`
+* Normalization Form D: `STABLE`, `DECOMPOSE`
+* Normalization Form KC: `STABLE`, `COMPOSE`, `COMPAT`
+* Normalization Form KD: `STABLE`, `DECOMPOSE`, `COMPAT`
+=== C Library ===
+The documentation for the C library is found in the `utf8proc.h` header file.
+`utf8proc_map` is the function you will most likely be using for mapping UTF-8
+strings, unless you want to allocate memory yourself.
+=== To Do ===
+* detect stable code points and process segments independently in order to save memory
+* do a quick check before normalizing strings to optimize speed
+* support stream processing
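+One possible form of the quick check in the second item, offered only as an
+illustration and not as the project's planned design: a string consisting
+solely of ASCII bytes is left unchanged by all four normalization forms, so
+it could skip normalization entirely. The function name `is_ascii_only` is
+hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fast-path test (not part of the library): returns 1 if
 * every byte of the buffer is below 0x80 (pure ASCII), 0 otherwise. */
int is_ascii_only(const uint8_t *str, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (str[i] & 0x80) {
            return 0;   /* a non-ASCII byte: full normalization is needed */
        }
    }
    return 1;
}
```

+A caller could run this test first and invoke the normalization routine only
+when it returns 0.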
+=== Contact ===
+Bug reports, feature requests, and other queries can be filed at
+the [libutf8proc page on GitHub](https://github.com/JuliaLang/libutf8proc).