From f8d8287cdbd7da9cd9392bcddf04860a10fa598e Mon Sep 17 00:00:00 2001
From: John Mark Bell <jmb@netsurf-browser.org>
Date: Mon, 10 Nov 2008 18:43:09 +0000
Subject: Import Iconv sources

svn path=/trunk/iconv/; revision=5677
---
 doc/API        | 132 +++++++++++++++++++++++++++++++++++++
 doc/ChangeLog  |  71 ++++++++++++++++++++
 doc/Uni->iconv | 205 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 408 insertions(+)
 create mode 100644 doc/API
 create mode 100644 doc/ChangeLog
 create mode 100644 doc/Uni->iconv

(limited to 'doc')
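
Note: doc/API, added below, advises C callers to use the libiconv stubs
and iconv.h rather than the SWIs directly. The following is a minimal
sketch of that route, assuming the stubs expose the standard
iconv_open/iconv/iconv_close interface. The helper name convert_string
and its signature are hypothetical and are not part of the module. The
sequence mirrors Iconv_Open, Iconv_Convert and Iconv_Close, and the
failure cases correspond to the errno values listed in the Errors section.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the Iconv module): convert the
 * NUL-terminated string 'in' from encoding 'from' to encoding 'to',
 * writing a NUL-terminated result into 'out' (capacity 'outlen' bytes).
 * Returns the number of bytes written, excluding the terminator, or -1
 * with errno set (EINVAL, E2BIG, EILSEQ, ...) on failure. */
long convert_string(const char *to, const char *from,
                    const char *in, char *out, size_t outlen)
{
    assert(to != NULL && from != NULL && in != NULL && out != NULL);

    if (outlen == 0) {
        errno = E2BIG;
        return -1;
    }

    /* Destination first, then source: the same order as r0 and r1
     * on entry to Iconv_Open. */
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return -1;

    char *inp = (char *)in;        /* iconv() takes char **, not const */
    size_t inleft = strlen(in);
    char *outp = out;
    size_t outleft = outlen - 1;   /* keep room for the terminator */

    /* On return the pointers and remaining counts have been advanced,
     * much like r1-r4 on exit from Iconv_Convert. */
    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    int saved = errno;             /* iconv_close() may change errno */
    iconv_close(cd);

    if (rc == (size_t)-1) {
        errno = saved;
        return -1;
    }

    *outp = '\0';
    return (long)(outp - out);
}
```

A caller would pass the encoding names in the same form as the examples
in doc/API, for instance convert_string("UTF-8", "CP1252", text, buf,
sizeof buf), and check for a return value of -1 before using buf.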

diff --git a/doc/API b/doc/API
new file mode 100644
index 0000000..13fa22f
--- /dev/null
+++ b/doc/API
@@ -0,0 +1,132 @@
+Iconv Module API
+================
+
+C programs should use the supplied libiconv stubs (or UnixLib, if
+appropriate) rather than calling these SWIs directly. See the iconv.h
+header file for further documentation of these calls.
+
+Iconv_Open (&57540)
+-------------------
+
+  Create a conversion descriptor
+
+  On Entry: r0 -> string containing name of destination encoding (eg "UTF-8")
+            r1 -> string containing name of source encoding (eg "CP1252")
+
+  On Exit:  r0 = conversion descriptor
+            All others preserved
+
+  Either encoding name may have a number of parameters appended to it.
+  Parameters are separated by a pair of forward-slashes ("//").
+  Currently defined parameters are:
+  
+	Parameter:	Destination:			Source:
+
+	TRANSLIT	Transliterate unrepresentable	None
+			output.
+
+  The conversion descriptor is an opaque value. The user must not
+  assume anything about its meaning, nor modify it in any way.
+  Doing so results in undefined behaviour.
+
+
+Iconv_Iconv (&57541)
+--------------------
+
+  This SWI is deprecated and Iconv_Convert should be used instead.
+
+
+Iconv_Close (&57542)
+--------------------
+
+  Destroy a conversion descriptor
+
+  On Entry: r0 = conversion descriptor to destroy
+  
+  On Exit:  r0 = 0
+            All others preserved
+
+
+Iconv_Convert (&57543)
+----------------------
+
+  Convert a byte sequence to another encoding
+  
+  On Entry: r0 = conversion descriptor returned by Iconv_Open
+            r1 -> input buffer (or NULL to reset encoding context)
+            r2 = length of buffer pointed to by r1
+            r3 -> output buffer
+            r4 = length of buffer pointed to by r3
+
+  On Exit:  r0 = number of non-reversible conversions performed (always 0)
+            r1 -> updated input buffer pointer (after last input read)
+            r2 = number of bytes remaining in input buffer
+            r3 -> updated output buffer pointer (i.e. end of output)
+            r4 = number of free bytes in the output buffer
+            All others preserved
+
+  Note that all strings must be NUL-terminated, so callers from BASIC
+  may need to convert the string terminator before making the call.
+
+
+Errors:
+
+Should an error occur, the SWI will return with V set and r0 -> error buffer.
+Only the error number in the buffer is filled in; it will be one of:
+
+  ICONV_NOMEM (&81b900)
+  ICONV_INVAL (&81b901)
+  ICONV_2BIG  (&81b902)
+  ICONV_ILSEQ (&81b903)
+
+These map directly to the C errno values ENOMEM, EINVAL, E2BIG and EILSEQ.
+
+
+Iconv_CreateMenu (&57544)
+-------------------------
+
+  Create a menu data structure containing all available encodings.
+
+  On Entry: r0 = flags. All bits reserved, must be 0
+            r1 -> buffer, or 0 to read required length
+            r2 = length of buffer in r1
+            r3 -> currently selected encoding name, or 0 if none selected
+            r4 -> buffer for indirected data, or 0 to read length
+            r5 = length of buffer in r4
+  
+  On Exit:  r2 = required size of buffer in r1 if r1 = 0 on entry,
+                 or length of data placed in buffer
+            r5 = required size of buffer in r4 if r4 = 0 on entry,
+                 or length of data placed in buffer
+
+  Menu titles are direct form text buffers. Menu entries are indirect text.
+  Entry text is stored in the buffer pointed to by r4 on entry to this call.
+
+
+Iconv_DecodeMenu (&57545)
+-------------------------
+
+  Decode a selection in a menu generated by Iconv_CreateMenu.
+  Places the corresponding encoding name in the result buffer.
+  
+  On Entry: r0 = flags. All bits reserved, must be 0
+            r1 -> menu definition
+            r2 -> menu selections, as per Wimp_Poll
+            r3 -> buffer for result or 0 to read required length
+            r4 = buffer length
+  
+  On Exit:  r4 = required size of buffer if r3 = 0 on entry,
+                 or length of data placed in buffer (0 if no selected 
+                 encoding)
+
+  The menu selections block pointed to by r2 on entry should be based at
+  the root of the encodings menu structure (i.e. index 0 in the block
+  should correspond to the selection in the main encoding menu).
+  
+  This call will update the selection status of the menu(s) appropriately.
+
+
+Example Code:
+=============
+
+Example code may be found in the IconvEg BASIC file.
diff --git a/doc/ChangeLog b/doc/ChangeLog
new file mode 100644
index 0000000..96f5924
--- /dev/null
+++ b/doc/ChangeLog
@@ -0,0 +1,71 @@
+Iconv Changelog
+===============
+
+0.01 10-Sep-2004
+----------------
+
+  - Initial version - unreleased.
+
+0.02 27-Sep-2004
+----------------
+
+  - Use allocated SWI & error chunks.
+  - Fix issues in 8bit encoding handling.
+  - First public release.
+
+0.03 22-Jan-2005
+----------------
+
+  - Add Iconv_Convert SWI with improved interface.
+  - Deprecate Iconv_Iconv SWI.
+  - Add encoding name alias handling.
+  - Bundle !Unicode resource.
+
+0.04 08-Apr-2005
+----------------
+
+  - Improve parameter checking.
+  - Fix potential memory leaks.
+  - Add encoding menu creation and selection handling.
+
+0.05 27-Jun-2005
+----------------
+
+  - Improve encoding alias support, using external data file.
+  - Add StubsG build for A9home users.
+
+0.06 05-Nov-2005
+----------------
+
+  - Modified menu creation API to store indirected text in a
+    user-provided buffer. This change is backwards incompatible.
+
+0.07 11-Feb-2006
+----------------
+
+  - Corrected output values for E2BIG errors.
+  - Fixed input pointer update after successful conversion.
+
+0.08 11-Mar-2007
+----------------
+
+  - Tightened up parameter checking in various places.
+  - Improve aliases hash function.
+  - Make 8bit write function's return values match encoding_write
+    with encoding_WRITE_STRICT set.
+  - Fix bug in 8bit writing which resulted in the remaining buffer
+    size being reduced even if nothing was written.
+  - Improve support for endian-specific Unicode variants.
+  - Work around issue in UnicodeLib where remaining buffer size is
+    reduced if an attempt is made to write an unrepresentable character.
+  - Add rudimentary //TRANSLIT support - simply replaces with '?' for now.
+  - Make UnicodeLib handle raw ISO-8859-{1,2,9,10,15} and not attempt
+    ISO-6937-2-25 shift sequences.
+  - Remove StubsG build as A9home now has a C99 capable C library.
+  - Overhaul documentation.
+
+0.09 XX-XX-2008
+---------------
+
+  - Restructured source tree into cross-platform and RO-specific parts.
+  - 
diff --git a/doc/Uni->iconv b/doc/Uni->iconv
new file mode 100644
index 0000000..f10b6c7
--- /dev/null
+++ b/doc/Uni->iconv
@@ -0,0 +1,205 @@
+Introduction:
+=============
+
+This file documents an approximate correlation between the data files
+provided in the !Unicode distribution and the encoding headers in GNU
+libiconv 1.9.1.
+
+Those with '?' in the iconv column either are not represented in iconv
+or I've missed the relevant header file ;)
+
+A number of encodings are present in the iconv distribution but not
+in !Unicode. These are documented at the end of this file.
+
+Changelog:
+==========
+
+v 0.01 (09-Sep-2004)
+~~~~~~~~~~~~~~~~~~~~
+Initial Incarnation
+
+v 0.02 (11-Sep-2004)
+~~~~~~~~~~~~~~~~~~~~
+Documented additional encodings supported by the Iconv module.
+Corrected list of !Unicode deficiencies.
+
+
+!Unicode->iconv:
+================
+
+Unicode:			iconv:			notes:
+
+Acorn.Latin1			riscos1.h
+
+Apple.CentEuro			mac_centraleurope.h
+Apple.Cyrillic			mac_cyrillic.h
+Apple.Roman			mac_roman.h
+Apple.Ukrainian			mac_ukraine.h
+
+BigFive				big5.h
+
+ISO2022.C0.40[ISO646]		?
+
+ISO2022.C1.43[IS6429]		?
+
+ISO2022.G94.40[646old]		iso646_cn.h
+ISO2022.G94.41[646-GB]		?
+ISO2022.G94.42[646IRV]		?
+ISO2022.G94.43[FinSwe]		?
+ISO2022.G94.47[646-SE]		?
+ISO2022.G94.48[646-SE]		?
+ISO2022.G94.49[JS201K]		jisx0201.h		top of JIS range 
+ISO2022.G94.4A[JS201R]		jisx0201.h iso646_jp.h	bottom of JIS range
+ISO2022.G94.4B[646-DE]		?
+ISO2022.G94.4C[646-PT]		?
+ISO2022.G94.54[GB1988]		iso646_cn.h
+ISO2022.G94.56[Teltxt]		?
+ISO2022.G94.59[646-IT]		?
+ISO2022.G94.5A[646-ES]		?
+ISO2022.G94.60[646-NO]		?
+ISO2022.G94.66[646-FR]		?
+ISO2022.G94.69[646-HU]		?
+ISO2022.G94.6B[Arabic]		?
+ISO2022.G94.6C[IS6397]		?
+ISO2022.G94.7A[SerbCr]		?
+
+ISO2022.G94x94.40[JS6226]	?
+ISO2022.G94x94.41[GB2312]	gb2312.h
+ISO2022.G94x94.42[JIS208]	jisx0208.h
+ISO2022.G94x94.43[KS1001]	ksc5601.h
+ISO2022.G94x94.44[JIS212]	jisx0212.h
+ISO2022.G94x94.47[CNS1]		cns11643_1.h		the tables differ
+ISO2022.G94x94.48[CNS2]		cns11643_2.h
+ISO2022.G94x94.49[CNS3]		cns11643_3.h
+ISO2022.G94x94.4A[CNS4]		cns11643_4.h
+ISO2022.G94x94.4B[CNS5]		cns11643_5.h
+ISO2022.G94x94.4C[CNS6]		cns11643_6.h
+ISO2022.G94x94.4D[CNS7]		cns11643_7.h
+
+ISO2022.G96.41[Lat1]		iso8859_1.h
+ISO2022.G96.42[Lat2]		iso8859_2.h
+ISO2022.G96.43[Lat3]		iso8859_3.h
+ISO2022.G96.44[Lat4]		iso8859_4.h
+ISO2022.G96.46[Greek]		iso8859_7.h
+ISO2022.G96.47[Arabic]		iso8859_6.h		ISO-8859-6 ignored
+ISO2022.G96.48[Hebrew]		iso8859_8.h
+ISO2022.G96.4C[Cyrill]		iso8859_5.h
+ISO2022.G96.4D[Lat5]		iso8859_9.h
+ISO2022.G96.50[LatSup]		?
+ISO2022.G96.52[IS6397]		?
+ISO2022.G96.54[Thai]		tis620.h
+ISO2022.G96.56[Lat6]		iso8859_10.h
+ISO2022.G96.58[L6Sami]		?
+ISO2022.G96.59[Lat7]		iso8859_13.h
+ISO2022.G96.5C[Welsh]		?
+ISO2022.G96.5D[Sami]		?
+ISO2022.G96.5E[Hebrew]		?
+ISO2022.G96.5F[Lat8]		iso8859_14.h
+ISO2022.G96.62[Lat9]		iso8859_15.h
+
+KOI8-R				koi8_r.h
+
+Microsoft.CP1250		cp1250.h
+Microsoft.CP1251		cp1251.h
+Microsoft.CP1252		cp1252.h
+Microsoft.CP1254		cp1254.h
+Microsoft.CP866			cp866.h
+Microsoft.CP932			cp932.h cp932ext.h
+
+iconv->!Unicode:
+================
+
+Iconv has the following encodings, which are not present in !Unicode. 
+Providing a suitable data file for !Unicode is trivial. Whether UnicodeLib
+will then act upon the addition of these is unknown.
+This list is ordered as per libiconv's NOTES file.
+
+European & Semitic languages:
+
+	ISO-8859-16 (iso8859_16.h)
+	KOI8-{U,RU,T} (koi8_xx.h)
+	CP125{3,5,6,7} (cp125n.h)
+	CP850 (cp850.h)
+	CP862 (cp862.h)
+	Mac{Croatian,Romania,Greek,Turkish,Hebrew,Arabic} (mac_foo.h)
+
+Japanese:
+
+	None, as far as I can tell.
+
+Simplified Chinese:
+
+	GB18030 (gb18030.h, gb18030ext.h)
+	HZ-GB-2312 (hz.h)
+
+Traditional Chinese:
+
+	CP950 (cp950.h)
+	BIG5-HKSCS (big5hkscs.h)
+
+Korean:
+
+	CP949 (cp949.h)
+
+Armenian:
+
+	ARMSCII-8 (armscii_8.h)
+
+Georgian:
+
+	Georgian-Academy, Georgian-PS (georgian_academy.h, georgian_ps.h)
+
+Thai:
+
+	CP874 (cp874.h)
+	MacThai (mac_thai.h)
+
+Laotian:
+
+	MuleLao-1, CP1133 (mulelao.h, cp1133.h)
+
+Vietnamese:
+
+	VISCII, TCVN (viscii.h, tcvn.h)
+	CP1258 (cp1258.h)
+
+Unicode:
+
+	BE/LE variants of normal encodings. I assume UnicodeLib handles
+	these, but can't be sure.
+	C99 / JAVA - well, yes.
+
+
+Iconv Module:
+=============
+
+The iconv module is effectively a thin veneer around UnicodeLib. However,
+8bit encodings are implemented within the module rather than using the
+support in UnicodeLib. The rationale for this is simply that, although
+UnicodeLib will understand (and act upon - reportedly...) additions to
+the ISO2022 Unicode resource, other encodings are ignored. As the vast
+majority of outstanding encodings fall into this category, and the code
+is fairly simple, it made sense to implement it within the module.
+
+With use of the iconv module, the list of outstanding encodings is
+reduced to:
+
+	ISO-8859-16 (easily implemented, if required)
+	CP1255 (requires state-based transcoding)
+
+	GB18030 (not 8bit - reportedly a requirement of PRC)
+	HZ-GB-2312 (not 8bit - supported by IE4)
+
+	CP950 (not 8bit - a (MS) variant of Big5)
+	BIG5-HKSCS (not 8bit - again, a Big5 variant)
+
+	CP949 (not 8bit)
+
+	ARMSCII-8 (easily implemented, if required)
+
+	VISCII (easily implemented, if required)
+	CP1258, TCVN (requires state-based transcoding)
+
+Additionally, the rest of the CodePage encodings implemented in iconv
+but not listed above (due to omissions from the iconv documentation)
+are implemented by the iconv module.
\ No newline at end of file
-- 
cgit v1.2.3