Diffstat (limited to 'README')
-rw-r--r-- README 123
1 files changed, 10 insertions, 113 deletions
diff --git a/README b/README
index e72ffff..ac68401 100644
--- a/README
+++ b/README
@@ -1,116 +1,13 @@
+libutf8proc
+===========
-Please read the LICENSE file, which is shipping with this software.
-
-
-*** QUICK START ***
-
-For compilation of the C library call "make c-library", for compilation of
-the ruby library call "make ruby-library" and for compilation of the
-PostgreSQL extension call "make pgsql-library".
-
-For ruby you can also create a gem-file by calling "make ruby-gem".
-
-"make all" can be used to build everything, but both ruby and PostgreSQL
-installations are required in this case.
-
-
-*** GENERAL INFORMATION ***
-
-The C library is found in this directory after successful compilation and
-is named "libutf8proc.a" and "libutf8proc.so". The ruby library consists of
-the files "utf8proc.rb" and "utf8proc_native.so", which are found in the
-subdirectory "ruby/". If you chose to create a gem-file it is placed in the
-"ruby/gem" directory. The PostgreSQL extension is named "utf8proc_pgsql.so"
-and resides in the "pgsql/" directory.
-
-Both the ruby library and the PostgreSQL extension are built as stand-alone
-libraries and are therefore not dependent the dynamic version of the
-C library files, but this behaviour might change in future releases.
-
-The Unicode version being supported is 5.0.0.
-Note: Version 4.1.0 of Unicode Standard Annex #29 was used, as
- version 5.0.0 had not been available at the time of implementation.
-
-For Unicode normalizations, the following options have to be used:
-Normalization Form C: STABLE, COMPOSE
-Normalization Form D: STABLE, DECOMPOSE
-Normalization Form KC: STABLE, COMPOSE, COMPAT
-Normalization Form KD: STABLE, DECOMPOSE, COMPAT
-
-
-*** C LIBRARY ***
-
-The documentation for the C library is found in the utf8proc.h header file.
-"utf8proc_map" is most likely function you will be using for mapping UTF-8
-strings, unless you want to allocate memory yourself.
-
-
-*** RUBY API ***
-
-The ruby library adds the methods "utf8map" and "utf8map!" to the String
-class, and the method "utf8" to the Integer class.
-
-The String#utf8map method does the same as the "utf8proc_map" C function.
-Options for the mapping procedure are passed as symbols, i.e:
-"Hello".utf8map(:casefold) => "hello"
-
-The descriptions of all options are found in the C header file
-"utf8proc.h". Please notice that the according symbols in ruby are all
-lowercase.
-
-String#utf8map! is the destructive function in the meaning that the string
-is replaced by the result.
-
-There are shortcuts for the 4 normalization forms specified by Unicode:
-String#utf8nfd, String#utf8nfd!,
-String#utf8nfc, String#utf8nfc!,
-String#utf8nfkd, String#utf8nfkd!,
-String#utf8nfkc, String#utf8nfkc!
-
-The method Integer#utf8 returns a UTF-8 string, which is containing the
-unicode char given by the code point.
-0x000A.utf8 => "\n"
-0x2028.utf8 => "\342\200\250"
-
-
-*** POSTGRESQL API ***
-
-For PostgreSQL there are two SQL functions supplied named "unifold" and
-"unistrip". These functions function can be used to prepare index fields in
-order to be folded in a way where string-comparisons make more sense, e.g.
-where "bathtub" == "bath<soft hyphen>tub"
-or "Hello World" == "hello world".
-
-CREATE TABLE people (
- id serial8 primary key,
- name text,
- CHECK (unifold(name) NOTNULL)
-);
-CREATE INDEX name_idx ON people (unifold(name));
-SELECT * FROM people WHERE unifold(name) = unifold('John Doe');
-
-The function "unistrip" removes character marks like accents or diaeresis,
-while "unifold" keeps then.
-
-NOTICE: The outputs of the function can change between releases, as
- utf8proc does not follow a versioning stability policy. You have to
- rebuild your database indicies, if you upgrade to a newer version
- of utf8proc.
-
-
-*** TODO ***
-
-- detect stable code points and process segments independently in order to
- save memory
-- do a quick check before normalizing strings to optimize speed
-- support stream processing
-
-
-*** CONTACT ***
-
-If you find any bugs or experience difficulties in compiling this software,
-please contact us:
-
-Project page: http://www.public-software-group.org/utf8proc
+This is the Public software group utf8proc library [1] repackaged as a
+conveniance library for NetSurf. Previously this library was simply
+copied into the NetSurf sources.
+This takes the unicode 5 capable version 1.1.6 of the library and
+converts it to the NetSurf build system. No C source code has been
+changed from upstream and all the Makefiles are licenced as per the
+utf8proc source.
+[1] http://www.public-software-group.org/utf8proc
\ No newline at end of file