HOW TO ADD A NEW CHARACTER SET MAPPING.

 * Create a struct unicode_info structure.  This structure defines the
 official character set name, as well as pointers to conversion functions.

 * Add the name of the character set, and the name of your structure to
 unicode/charsetlist.txt.  Multiple entries in unicode/charsetlist.txt can
 be used to define aliases for the same character set.  For example,
 "IBM869" and "CP869" specify the same character set: both point to the
 unicode_IBM_869 object, which is defined in ibm869.c.

 A script generates the source file charsetlist.c from charsetlist.txt.
 That's how character sets end up being linked into the code, and how
 individual character sets can be selectively included or excluded.
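
 As a rough illustration of the aliasing idea only, a name-to-object table
 might look like the sketch below.  Everything in it is hypothetical: the
 demo_* names, the table layout, and the exact-match lookup are assumptions
 made for this example, not the contents of the real generated
 charsetlist.c or the real struct unicode_info.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the library's character set object;
   only the official name matters for this sketch. */
struct demo_charset {
    const char *official_name;
};

static const struct demo_charset demo_IBM_869   = { "IBM869" };
static const struct demo_charset demo_ISO8859_1 = { "ISO-8859-1" };

/* One entry per list line: a character set name and the object it maps to. */
struct demo_charset_entry {
    const char *name;
    const struct demo_charset *info;
};

static const struct demo_charset_entry demo_charset_list[] = {
    { "IBM869",     &demo_IBM_869 },
    { "CP869",      &demo_IBM_869 },   /* alias: same object as IBM869 */
    { "ISO-8859-1", &demo_ISO8859_1 },
    { NULL, NULL }                      /* end of table */
};

/* Exact-match lookup (an assumption; a real lookup may ignore case).
   Returns NULL when the name is not in the table. */
const struct demo_charset *demo_find_charset(const char *name)
{
    const struct demo_charset_entry *e;

    for (e = demo_charset_list; e->name != NULL; e++)
        if (strcmp(e->name, name) == 0)
            return e->info;
    return NULL;
}
```

 In this sketch, leaving a line out of the table is all it takes to exclude
 a character set, and two names that share an object behave identically.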

 The struct unicode_info structure contains pointers to the following
 functions:

 + Convert text in this character set to unicode.

 + Convert unicode to text in this character set.

 + Convert text in this character set to uppercase.

 + Convert text in this character set to lowercase.

 + Convert text in this character set to titlecase.
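
 The five function pointers listed above can be pictured with a minimal
 sketch for US-ASCII.  The structure below is NOT the real struct
 unicode_info: the type demo_unichar, the field names, the function
 signatures, and the memory convention (each function returns a malloc'd
 string, or NULL on failure) are all assumptions made for illustration.
 The titlecase rule used here (first character upper, the rest lower) is
 likewise an assumption.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned long demo_unichar;   /* hypothetical code point type */

/* Hypothetical layout: a name plus five conversion functions. */
struct demo_charset_info {
    const char *name;
    demo_unichar *(*to_unicode)(const char *text);
    char *(*from_unicode)(const demo_unichar *u);
    char *(*to_upper)(const char *text);
    char *(*to_lower)(const char *text);
    char *(*to_title)(const char *text);
};

/* Each byte becomes one code point; bytes above 0x7F are rejected. */
static demo_unichar *ascii_to_unicode(const char *text)
{
    size_t n = strlen(text), i;
    demo_unichar *u = malloc((n + 1) * sizeof(*u));

    if (!u)
        return NULL;
    for (i = 0; i < n; i++) {
        if ((unsigned char)text[i] > 0x7F) {
            free(u);
            return NULL;
        }
        u[i] = (unsigned char)text[i];
    }
    u[n] = 0;
    return u;
}

/* Fails (returns NULL) if any code point does not fit in US-ASCII. */
static char *ascii_from_unicode(const demo_unichar *u)
{
    size_t n = 0, i;
    char *s;

    while (u[n])
        n++;
    s = malloc(n + 1);
    if (!s)
        return NULL;
    for (i = 0; i < n; i++) {
        if (u[i] > 0x7F) {
            free(s);
            return NULL;
        }
        s[i] = (char)u[i];
    }
    s[n] = 0;
    return s;
}

/* Case conversion coded directly; mode is 'u', 'l' or 't'. */
static char *ascii_recase(const char *text, int mode)
{
    size_t n = strlen(text), i;
    char *s = malloc(n + 1);

    if (!s)
        return NULL;
    for (i = 0; i < n; i++) {
        char c = text[i];
        int upper = (mode == 'u') || (mode == 't' && i == 0);

        if (upper && c >= 'a' && c <= 'z')
            c = (char)(c - 'a' + 'A');
        else if (!upper && c >= 'A' && c <= 'Z')
            c = (char)(c - 'A' + 'a');
        s[i] = c;
    }
    s[n] = 0;
    return s;
}

static char *ascii_to_upper(const char *t) { return ascii_recase(t, 'u'); }
static char *ascii_to_lower(const char *t) { return ascii_recase(t, 'l'); }
static char *ascii_to_title(const char *t) { return ascii_recase(t, 't'); }

const struct demo_charset_info demo_US_ASCII = {
    "US-ASCII",
    ascii_to_unicode, ascii_from_unicode,
    ascii_to_upper, ascii_to_lower, ascii_to_title
};
```

 A caller would go through the structure, for example
 demo_US_ASCII.to_upper("abc"), and free() the result when done.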

 If the character set allows for convenient conversion to
 upper/lower/titlecase, the conversion should be coded directly.
 Otherwise, the library has a set of convenience functions that go against
 the unicode master table.  Text in any character set can be
 upper/lower/titlecased by converting it to unicode, running it through
 unicode_uc/unicode_lc/unicode_tc, then converting the result back to the
 original character set.  See utf8_chset.c for an example.
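
 The round trip described above can be sketched as follows, using
 ISO-8859-1 as the example character set.  This is a sketch under stated
 assumptions, not the code in utf8_chset.c: demo_unichar, the function
 signatures, and demo_unicode_uc() are hypothetical.  demo_unicode_uc() is a
 tiny stand-in for the library's unicode_uc and covers only ASCII and
 Latin-1 letters, not the full unicode master table.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned long demo_unichar;   /* hypothetical code point type */

/* Stand-in for unicode_uc: uppercases ASCII and Latin-1 letters only. */
demo_unichar demo_unicode_uc(demo_unichar c)
{
    if (c >= 'a' && c <= 'z')
        return c - 'a' + 'A';
    if (c >= 0xE0 && c <= 0xFE && c != 0xF7)   /* 0xF7 is the division sign */
        return c - 0x20;
    return c;
}

/* ISO-8859-1 bytes map one-to-one onto the first 256 code points. */
static demo_unichar *latin1_to_unicode(const char *text)
{
    size_t n = strlen(text), i;
    demo_unichar *u = malloc((n + 1) * sizeof(*u));

    if (!u)
        return NULL;
    for (i = 0; i < n; i++)
        u[i] = (unsigned char)text[i];
    u[n] = 0;
    return u;
}

/* Fails (returns NULL) if a code point is outside ISO-8859-1. */
static char *latin1_from_unicode(const demo_unichar *u)
{
    size_t n = 0, i;
    char *s;

    while (u[n])
        n++;
    s = malloc(n + 1);
    if (!s)
        return NULL;
    for (i = 0; i < n; i++) {
        if (u[i] > 0xFF) {
            free(s);
            return NULL;
        }
        s[i] = (char)u[i];
    }
    s[n] = 0;
    return s;
}

/* Generic fallback: to unicode, map each code point, convert back. */
char *demo_recase_via_unicode(const char *text,
                              demo_unichar *(*to_unicode)(const char *),
                              char *(*from_unicode)(const demo_unichar *),
                              demo_unichar (*map)(demo_unichar))
{
    demo_unichar *u = to_unicode(text);
    char *out;
    size_t i;

    if (!u)
        return NULL;
    for (i = 0; u[i]; i++)
        u[i] = map(u[i]);
    out = from_unicode(u);   /* NULL if the result is not representable */
    free(u);
    return out;
}

/* The uppercase entry for this character set just wires the pieces together. */
char *latin1_to_upper(const char *text)
{
    return demo_recase_via_unicode(text, latin1_to_unicode,
                                   latin1_from_unicode, demo_unicode_uc);
}
```

 The same helper would serve lowercase and titlecase by passing a different
 per-code-point mapping function.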

 Note that unicode_uc/unicode_lc/unicode_tc carry a heavy penalty and
 should be avoided: linking in unicode_[ult]c() adds about 26Kb of data
 tables.

 Finally, all this code has to be added to libunicode.a.  It can simply be
 added to libunicode_a_SOURCES.

 After doing all that, run make to build libunicode.a and the
 unicode-info program.  Run unicode-info.  If the character set is listed by
 unicode-info, you should be all set, provided that the conversion functions
 actually work as advertised.