HOW TO ADD A NEW CHARACTER SET MAPPING.

 * Create a struct unicode_info structure.  This structure defines the
 official character set name, as well as pointers to conversion functions.

 * Add the name of the character set, and the name of your structure, to
 unicode/charsetlist.txt.  Multiple entries in unicode/charsetlist.txt can
 be used to define aliases for the same character set.  For example,
 "IBM869" and "CP869" specify the same character set: both point to the
 unicode_IBM_869 object, which is defined in ibm869.c.

 The source file charsetlist.c is generated automatically by a script from
 charsetlist.txt.  That's how character sets end up being linked into the
 code, and how individual character sets can be selectively included or
 excluded.
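
 As a rough illustration of the idea, a name table with aliases might look
 like the sketch below.  The types demo_info and demo_entry, the table
 demo_charsets and the function demo_find() are hypothetical stand-ins, not
 the actual contents of charsetlist.c; only the names "IBM869" and "CP869"
 come from the text above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct unicode_info: only the name is kept. */
struct demo_info {
    const char *chset;    /* official character set name */
};

/* Stand-in for the object that the text says is defined in ibm869.c. */
static const struct demo_info demo_IBM_869 = { "IBM869" };

/* One row per list entry: a name and the structure it maps to. */
struct demo_entry {
    const char *name;
    const struct demo_info *info;
};

/* Two rows pointing at the same object make "CP869" an alias of "IBM869". */
static const struct demo_entry demo_charsets[] = {
    { "IBM869", &demo_IBM_869 },
    { "CP869",  &demo_IBM_869 },
};

/* Exact-match lookup by name; returns NULL for an unknown character set. */
static const struct demo_info *demo_find(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(demo_charsets) / sizeof(demo_charsets[0]); i++)
        if (strcmp(demo_charsets[i].name, name) == 0)
            return demo_charsets[i].info;
    return NULL;
}
```

 Leaving a row out of such a table is what would keep a character set from
 being referenced, and therefore from being linked in.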

 The struct unicode_info structure contains pointers to the following
 functions:

 + Convert text in this character set to unicode.

 + Convert unicode to text in this character set.

 + Convert text in this character set to uppercase.

 + Convert text in this character set to lowercase.

 + Convert text in this character set to titlecase.
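
 The five entry points could be laid out roughly as in the sketch below.
 This is not the library's actual declaration: struct demo_unicode_info,
 the demo_char type and every function signature here are assumptions made
 for illustration, using a 7-bit ASCII-only character set.  Titlecase is
 taken here to mean "uppercase the first byte, lowercase the rest", which is
 also an assumption.  Consult the library's own header for the real
 definition.

```c
#include <stdlib.h>
#include <string.h>

typedef unsigned long demo_char;    /* hypothetical code point type */

/* Hypothetical layout: a name plus the five conversion entry points.
   Every function returns a malloc'ed buffer, or NULL on failure. */
struct demo_unicode_info {
    const char *chset;
    demo_char *(*to_unicode)(const char *text);  /* 0-terminated result */
    char *(*from_unicode)(const demo_char *u);
    char *(*to_upper)(const char *text);
    char *(*to_lower)(const char *text);
    char *(*to_title)(const char *text);
};

static demo_char *ascii_to_unicode(const char *text)
{
    size_t i, n = strlen(text);
    demo_char *u = malloc((n + 1) * sizeof(*u));

    if (!u)
        return NULL;
    for (i = 0; i < n; i++) {
        if ((unsigned char)text[i] > 0x7F) {    /* not 7-bit ASCII */
            free(u);
            return NULL;
        }
        u[i] = (unsigned char)text[i];
    }
    u[n] = 0;
    return u;
}

static char *ascii_from_unicode(const demo_char *u)
{
    size_t i, n = 0;
    char *text;

    while (u[n])
        n++;
    text = malloc(n + 1);
    if (!text)
        return NULL;
    for (i = 0; i < n; i++) {
        if (u[i] > 0x7F) {    /* no ASCII equivalent */
            free(text);
            return NULL;
        }
        text[i] = (char)u[i];
    }
    text[n] = 0;
    return text;
}

/* mode: 'u' = upper, 'l' = lower, 't' = upper first byte, lower the rest.
   Case is handled directly, without any Unicode tables. */
static char *ascii_recase(const char *text, int mode)
{
    size_t i, n = strlen(text);
    char *out = malloc(n + 1);

    if (!out)
        return NULL;
    for (i = 0; i < n; i++) {
        char c = text[i];
        int up = mode == 'u' || (mode == 't' && i == 0);

        if (up && c >= 'a' && c <= 'z')
            c = (char)(c - 'a' + 'A');
        else if (!up && c >= 'A' && c <= 'Z')
            c = (char)(c - 'A' + 'a');
        out[i] = c;
    }
    out[n] = 0;
    return out;
}

static char *ascii_upper(const char *t) { return ascii_recase(t, 'u'); }
static char *ascii_lower(const char *t) { return ascii_recase(t, 'l'); }
static char *ascii_title(const char *t) { return ascii_recase(t, 't'); }

/* The object that a list entry would point to. */
static const struct demo_unicode_info demo_ASCII = {
    "X-DEMO-ASCII",
    ascii_to_unicode, ascii_from_unicode,
    ascii_upper, ascii_lower, ascii_title
};
```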

 If the character set allows for convenient conversion to
 upper/lower/titlecase, the conversion should be coded directly.
 Otherwise, the library has a set of convenience functions that work from
 the unicode master table.  Text in any character set can be
 upper/lower/titlecased by converting it to unicode, running it through
 unicode_uc/unicode_lc/unicode_tc, then converting unicode back to the
 original character set.  See utf8_chset.c for an example.
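
 The round trip described above can be sketched as follows, using
 ISO-8859-1 as the example character set.  The helpers latin1_to_unicode(),
 latin1_from_unicode() and demo_uc() are hypothetical stand-ins written for
 this example.  In particular, demo_uc() covers only ASCII and the Latin-1
 letters and merely plays the role that unicode_uc plays in the library; the
 library's actual signatures are not shown here.

```c
#include <stdlib.h>
#include <string.h>

typedef unsigned long demo_char;    /* hypothetical code point type */

/* Stand-in for unicode_uc: uppercases ASCII and Latin-1 letters only. */
static demo_char demo_uc(demo_char c)
{
    if (c >= 'a' && c <= 'z')
        return c - 'a' + 'A';
    if (c >= 0xE0 && c <= 0xFE && c != 0xF7)    /* U+00F7 is the division sign */
        return c - 0x20;
    return c;
}

/* ISO-8859-1 bytes are numerically equal to their Unicode code points. */
static demo_char *latin1_to_unicode(const char *text)
{
    size_t i, n = strlen(text);
    demo_char *u = malloc((n + 1) * sizeof(*u));

    if (!u)
        return NULL;
    for (i = 0; i < n; i++)
        u[i] = (unsigned char)text[i];
    u[n] = 0;
    return u;
}

/* Returns NULL if a code point does not fit in ISO-8859-1. */
static char *latin1_from_unicode(const demo_char *u)
{
    size_t i, n = 0;
    char *text;

    while (u[n])
        n++;
    text = malloc(n + 1);
    if (!text)
        return NULL;
    for (i = 0; i < n; i++) {
        if (u[i] > 0xFF) {
            free(text);
            return NULL;
        }
        text[i] = (char)(unsigned char)u[i];
    }
    text[n] = 0;
    return text;
}

/* Uppercase via the round trip: charset -> unicode -> uc -> charset. */
static char *latin1_upper_via_unicode(const char *text)
{
    demo_char *u = latin1_to_unicode(text);
    char *out;
    size_t i;

    if (!u)
        return NULL;
    for (i = 0; u[i]; i++)
        u[i] = demo_uc(u[i]);
    out = latin1_from_unicode(u);
    free(u);
    return out;
}
```

 The lowercase and titlecase variants would follow the same pattern with a
 different per-character mapping.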

 Note that unicode_uc/unicode_lc/unicode_tc carry a heavy penalty, and
 should be avoided where possible: unicode_[ult]c() adds about 26Kb of data
 tables.

 Finally, all this code has to be built into libunicode.a.  To do that,
 simply add the new source files to libunicode_a_SOURCES.

 After doing all that, run make to build libunicode.a and the
 unicode-info program.  Run unicode-info.  If the character set is listed by
 unicode-info, you should be all set, provided that the conversion functions
 actually work as advertised.