-rw-r--r--  .gitignore  10
-rw-r--r--  LICENSE.md (renamed from LICENSE)  41
-rw-r--r--  Makefile  41
-rw-r--r--  README  63
-rw-r--r--  README.md  68
5 files changed, 116 insertions, 107 deletions
diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..99c1931
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,10 @@
+*.tar.gz
+*.exe
+*.dll
+*.do
+*.o
+*.so
+*.a
+*.dll
+*.dylib
+*.dSYM
diff --git a/LICENSE b/LICENSE.md
index 8b61bd6..c45f425 100644
--- a/LICENSE
+++ b/LICENSE.md
@@ -1,5 +1,13 @@
+== libutf8proc license ==
-Copyright (c) 2009, 2013 Public Software Group e. V., Berlin, Germany
+**libutf8proc** is a lightly updated version of the **utf8proc**
+library by Jan Behrens and the rest of the Public Software Group, who
+deserve nearly all of the credit for this library. Like utf8proc,
+whose copyright and license statements are reproduced below, all new
+work on the libutf8proc library is licensed under the [MIT "expat"
+license](http://opensource.org/licenses/MIT):
+
+*Copyright © 2014 by Steven G. Johnson.*
Permission is hereby granted, free of charge, to any person obtaining a
copy of this software and associated documentation files (the "Software"),
@@ -19,14 +27,37 @@ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.
+== Original utf8proc license ==
+
+*Copyright (c) 2009, 2013 Public Software Group e. V., Berlin, Germany*
+
+Permission is hereby granted, free of charge, to any person obtaining a
+copy of this software and associated documentation files (the "Software"),
+to deal in the Software without restriction, including without limitation
+the rights to use, copy, modify, merge, publish, distribute, sublicense,
+and/or sell copies of the Software, and to permit persons to whom the
+Software is furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in
+all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+DEALINGS IN THE SOFTWARE.
+
+== Unicode data license ==
This software distribution contains derived data from a modified version of
the Unicode data files. The following license applies to that data:
-COPYRIGHT AND PERMISSION NOTICE
+**COPYRIGHT AND PERMISSION NOTICE**
-Copyright (c) 1991-2007 Unicode, Inc. All rights reserved. Distributed
-under the Terms of Use in http://www.unicode.org/copyright.html.
+*Copyright (c) 1991-2007 Unicode, Inc. All rights reserved. Distributed
+under the Terms of Use in http://www.unicode.org/copyright.html.*
Permission is hereby granted, free of charge, to any person obtaining a
copy of the Unicode data files and any associated documentation (the "Data
@@ -57,8 +88,6 @@ not be used in advertising or otherwise to promote the sale, use or other
dealings in these Data Files or Software without prior written
authorization of the copyright holder.
-
Unicode and the Unicode logo are trademarks of Unicode, Inc., and may be
registered in some jurisdictions. All other trademarks and registered
trademarks mentioned herein are the property of their respective owners.
-
diff --git a/Makefile b/Makefile
index b697d0b..735dc2f 100644
--- a/Makefile
+++ b/Makefile
@@ -9,20 +9,12 @@ cc = $(CC) $(cflags)
# meta targets
-c-library: libutf8proc.a libutf8proc.so
-
-ruby-library: ruby/utf8proc_native.so
-
-pgsql-library: pgsql/utf8proc_pgsql.so
+all: c-library
-all: c-library ruby-library ruby-gem pgsql-library
+c-library: libutf8proc.a libutf8proc.so
-clean::
+clean:
rm -f utf8proc.o libutf8proc.a libutf8proc.so
- cd ruby/ && test -e Makefile && (make clean && rm -f Makefile) || true
- rm -Rf ruby/gem/lib ruby/gem/ext
- rm -f ruby/gem/utf8proc-*.gem
- cd pgsql/ && make clean
# real targets
@@ -39,30 +31,3 @@ libutf8proc.so: utf8proc.o
libutf8proc.dylib: utf8proc.o
$(cc) -dynamiclib -o $@ $^ -install_name $(libdir)/$@
-
-ruby/Makefile: ruby/extconf.rb
- cd ruby && ruby extconf.rb
-
-ruby/utf8proc_native.so: utf8proc.h utf8proc.c utf8proc_data.c \
- ruby/utf8proc_native.c ruby/Makefile
- cd ruby && make
-
-ruby/gem/lib/utf8proc.rb: ruby/utf8proc.rb
- test -e ruby/gem/lib || mkdir ruby/gem/lib
- cp ruby/utf8proc.rb ruby/gem/lib/
-
-ruby/gem/ext/extconf.rb: ruby/extconf.rb
- test -e ruby/gem/ext || mkdir ruby/gem/ext
- cp ruby/extconf.rb ruby/gem/ext/
-
-ruby/gem/ext/utf8proc_native.c: utf8proc.h utf8proc_data.c utf8proc.c ruby/utf8proc_native.c
- test -e ruby/gem/ext || mkdir ruby/gem/ext
- cat utf8proc.h utf8proc_data.c utf8proc.c ruby/utf8proc_native.c | grep -v '#include "utf8proc.h"' | grep -v '#include "utf8proc_data.c"' | grep -v '#include "../utf8proc.c"' > ruby/gem/ext/utf8proc_native.c
-
-ruby-gem:: ruby/gem/lib/utf8proc.rb ruby/gem/ext/extconf.rb ruby/gem/ext/utf8proc_native.c
- cd ruby/gem && gem build utf8proc.gemspec
-
-pgsql/utf8proc_pgsql.so: utf8proc.h utf8proc.c utf8proc_data.c \
- pgsql/utf8proc_pgsql.c
- cd pgsql && make
-
diff --git a/README b/README
deleted file mode 100644
index 692b61b..0000000
--- a/README
+++ /dev/null
@@ -1,63 +0,0 @@
-
-Please read the LICENSE file, which is shipping with this software.
-
-
-*** QUICK START ***
-
-For compilation of the C library call "make c-library", for compilation of
-the ruby library call "make ruby-library" and for compilation of the
-PostgreSQL extension call "make pgsql-library".
-
-For ruby you can also create a gem-file by calling "make ruby-gem".
-
-"make all" can be used to build everything, but both ruby and PostgreSQL
-installations are required in this case.
-
-
-*** GENERAL INFORMATION ***
-
-The C library is found in this directory after successful compilation and
-is named "libutf8proc.a" and "libutf8proc.so". The ruby library consists of
-the files "utf8proc.rb" and "utf8proc_native.so", which are found in the
-subdirectory "ruby/". If you chose to create a gem-file it is placed in the
-"ruby/gem" directory. The PostgreSQL extension is named "utf8proc_pgsql.so"
-and resides in the "pgsql/" directory.
-
-Both the ruby library and the PostgreSQL extension are built as stand-alone
-libraries and are therefore not dependent the dynamic version of the
-C library files, but this behaviour might change in future releases.
-
-The Unicode version being supported is 5.0.0.
-Note: Version 4.1.0 of Unicode Standard Annex #29 was used, as
- version 5.0.0 had not been available at the time of implementation.
-
-For Unicode normalizations, the following options have to be used:
-Normalization Form C: STABLE, COMPOSE
-Normalization Form D: STABLE, DECOMPOSE
-Normalization Form KC: STABLE, COMPOSE, COMPAT
-Normalization Form KD: STABLE, DECOMPOSE, COMPAT
-
-
-*** C LIBRARY ***
-
-The documentation for the C library is found in the utf8proc.h header file.
-"utf8proc_map" is most likely function you will be using for mapping UTF-8
-strings, unless you want to allocate memory yourself.
-
-
-*** TODO ***
-
-- detect stable code points and process segments independently in order to
- save memory
-- do a quick check before normalizing strings to optimize speed
-- support stream processing
-
-
-*** CONTACT ***
-
-If you find any bugs or experience difficulties in compiling this software,
-please contact us:
-
-Project page: http://www.public-software-group.org/utf8proc
-
-
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..e0efefc
--- /dev/null
+++ b/README.md
@@ -0,0 +1,68 @@
+== libutf8proc ==
+
+The [libutf8proc package](https://github.com/JuliaLang/libutf8proc) is
+a lightly updated fork of the [utf8proc
+library](http://www.public-software-group.org/utf8proc) from Jan
+Behrens and the rest of the [Public Software
+Group](http://www.public-software-group.org/), who deserve *nearly all
+of the credit* for this package: a small, clean C library that
+provides Unicode normalization, case-folding, and other operations for
+data in the [UTF-8 encoding](http://en.wikipedia.org/wiki/UTF-8).
+
+The reason for this fork is that utf8proc is used for basic Unicode
+support in the [Julia language](http://julialang.org/) and the Julia
+developers wanted Unicode 7 support and other features, but the
+Public Software Group currently does not seem to have the resources
+necessary to update utf8proc. We hope that the fork can be merged
+back into the mainline utf8proc package before too long.
+
+(The original utf8proc package also includes Ruby and PostgreSQL plug-ins.
+We removed those from libutf8proc in order to focus exclusively on the C
+library for the time being. We will strive to keep API changes to a minimum,
+so libutf8proc should still be usable with the old plug-in code.)
+
+Like utf8proc, the libutf8proc package is licensed under the
+free/open-source [MIT "expat"
+license](http://opensource.org/licenses/MIT) (plus certain Unicode
+data governed by the similarly permissive [Unicode data
+license](http://www.unicode.org/copyright.html#Exhibit1)); please see
+the included `LICENSE.md` file for more detailed information.
+
+=== Quick Start ===
+
+For compilation of the C library run `make`.
+
+=== General Information ===
+
+The C library is found in this directory after successful compilation
+and is named `libutf8proc.a` (for the static library) and
+`libutf8proc.so` (for the dynamic library).
+
+The Unicode version being supported is 5.0.0.
+*Note:* Version 4.1.0 of Unicode Standard Annex #29 was used, as
+version 5.0.0 had not been available at the time of implementation.
+
+For Unicode normalizations, the following options are used:
+
+* Normalization Form C: `STABLE`, `COMPOSE`
+* Normalization Form D: `STABLE`, `DECOMPOSE`
+* Normalization Form KC: `STABLE`, `COMPOSE`, `COMPAT`
+* Normalization Form KD: `STABLE`, `DECOMPOSE`, `COMPAT`
+
+=== C Library ===
+
+The documentation for the C library is found in the `utf8proc.h` header file.
+`utf8proc_map` is the function you will most likely be using for mapping UTF-8
+strings, unless you want to allocate memory yourself.
+
+=== To Do ===
+
+* detect stable code points and process segments independently in order to save memory
+* do a quick check before normalizing strings to optimize speed
+* support stream processing
+
+=== Contact ===
+
+Bug reports, feature requests, and other queries can be filed at
+the [libutf8proc page on Github](https://github.com/JuliaLang/libutf8proc).
+
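The normalization-form table in the new README.md pairs each Unicode normalization form with a set of option names. A minimal sketch of that mapping as a bitmask lookup follows. It deliberately does not include `utf8proc.h`: the `SKETCH_*` flags, their bit values, and the function `sketch_options_for_form` are hypothetical stand-ins introduced only for illustration. Real code would presumably combine the option constants that the header declares and pass the result to `utf8proc_map`.

```c
#include <assert.h>
#include <string.h>

/* Stand-in flags for this sketch only. The real option constants are
 * declared in utf8proc.h; the names and bit values below are assumptions. */
enum sketch_option {
    SKETCH_STABLE    = 1 << 0,
    SKETCH_COMPAT    = 1 << 1,
    SKETCH_COMPOSE   = 1 << 2,
    SKETCH_DECOMPOSE = 1 << 3
};

/* Return the option mask listed in the README for the form named
 * "NFC", "NFD", "NFKC" or "NFKD", or -1 if the name is not recognized. */
int sketch_options_for_form(const char *form)
{
    assert(form != NULL);

    if (strcmp(form, "NFC") == 0)
        return SKETCH_STABLE | SKETCH_COMPOSE;
    if (strcmp(form, "NFD") == 0)
        return SKETCH_STABLE | SKETCH_DECOMPOSE;
    if (strcmp(form, "NFKC") == 0)
        return SKETCH_STABLE | SKETCH_COMPOSE | SKETCH_COMPAT;
    if (strcmp(form, "NFKD") == 0)
        return SKETCH_STABLE | SKETCH_DECOMPOSE | SKETCH_COMPAT;

    return -1; /* unknown form name */
}
```

A caller would look up the mask once, for example `int opts = sketch_options_for_form("NFKC");`, check it against -1, and then hand the equivalent library options to the mapping function. The compatibility forms differ from the canonical ones only by the extra `COMPAT` flag, which is why the table has two pairs of otherwise identical rows.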