libuninameslist – A Library of Unicode names and annotation data
================================================================
[![Build Status](https://travis-ci.org/fontforge/libuninameslist.svg?branch=master)](https://travis-ci.org/fontforge/libuninameslist) [![Build status](https://ci.appveyor.com/api/projects/status/qseac73evm9leu0g?svg=true)](https://ci.appveyor.com/project/fontforge/libuninameslist) [![Coverity Scan Build Status](https://scan.coverity.com/projects/793/badge.svg?flat=1)](https://scan.coverity.com/projects/793)

- [Description](#description)
- [Installation and Build Instructions](#installation-and-build-instructions)
- [Changelog](https://raw.github.com/fontforge/libuninameslist/master/ChangeLog)
- [License](https://raw.github.com/fontforge/libuninameslist/master/LICENSE)
- [Added Python Wrapper](#added-python-wrapper)
- [See Also](#see-also)

Description
-----------

This library is updated for NamesList.txt version 13.0 and ListeDesNoms.txt
version 13.0, and includes the Python wrapper 'uninameslist.py'.

For the latest release, see: https://github.com/fontforge/libuninameslist/releases

Users who do not have autoconf or automake available should select the '-dist-'
version, which does not require running autoreconf or automake first.

http://sourceforge.net/projects/libuninameslist/files/ is not kept up to date.

NamesList.txt:
The Unicode Consortium provides [a file containing annotations on many Unicode
characters](http://www.unicode.org/Public/UNIDATA/NamesList.html). This library
contains a compiled version of this file so that programs can access this data
quickly and easily.

ListeDesNoms.txt:
A separate file, translated into French from NamesList.txt. It was out of date
for a period of time, but a group of developers has since updated it to
version 13. Contributors to that file are listed in the file itself.

These libraries contain very large (sparse) arrays with one entry for each
unicode code point (U+0000–U+10FFFF). Each entry contains two strings, a name
and an annotation. Either or both may be NULL. Both libraries also contain a
(much smaller) list of all the Unicode blocks.
```c
struct unicode_block {
    int start, end;
    const char *name;
};

struct unicode_nameannot {
    const char *name, *annot;
};

extern const struct unicode_block UnicodeBlock[???];

#define UNICODE_NAME_MAX    ???
#define UNICODE_ANNOT_MAX   ???
extern const struct unicode_nameannot * const *const UnicodeNameAnnot[];

/* Index by: UnicodeNameAnnot[(uni>>16)&0x1f][(uni>>8)&0xff][uni&0xff] */
```
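
The index expression in the comment above splits a code point into three parts: a plane, a row within the plane, and a cell within the row. The following Python sketch only illustrates that arithmetic. The names `split_codepoint`, `lookup` and `TOY_TABLE` are hypothetical, and the nested dictionaries are a stand-in for the C arrays, not the library's actual storage.

```python
def split_codepoint(uni):
    """Split a code point into (plane, row, cell), mirroring
    UnicodeNameAnnot[(uni>>16)&0x1f][(uni>>8)&0xff][uni&0xff]."""
    if not 0 <= uni <= 0x10FFFF:
        raise ValueError("code point out of range: %r" % (uni,))
    plane = (uni >> 16) & 0x1F  # planes 0x00..0x10
    row = (uni >> 8) & 0xFF     # 256 rows per plane
    cell = uni & 0xFF           # 256 entries per row
    return plane, row, cell


# Toy data for illustration only: one (name, annot) entry for U+0041.
TOY_TABLE = {0: {0: {0x41: ("LATIN CAPITAL LETTER A", None)}}}


def lookup(table, uni):
    """Walk the nested table; None stands in for a missing (NULL) entry."""
    plane, row, cell = split_codepoint(uni)
    rows = table.get(plane)
    cells = rows.get(row) if rows else None
    return cells.get(cell) if cells else None
```

For example, `split_codepoint(0x1F600)` yields plane 1, row 0xF6 and cell 0x00.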

To keep both libraries slightly smaller, annotation strings store a short ASCII
marker at the beginning of each line that starts with a TAB. These markers are
meant to be expanded into UTF-8 characters as defined below:
```c
/* At the beginning of lines (after a tab) within the annotation string, a */
/*  * should be replaced by a bullet U+2022 */
/*  x should be replaced by a right arrow U+2192 */
/*  : should be replaced by an equivalent U+224D */
/*  # should be replaced by an approximate U+2245 */
/*  = should remain itself */
```
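
The substitutions listed above could be applied to an annotation string along the following lines. This is a minimal Python sketch, not code from the library: `expand_annotation` and `SUBSTITUTIONS` are hypothetical names, and the sketch assumes that each marker is a single character followed by a space.

```python
# Marker character -> replacement, taken from the comment block above.
SUBSTITUTIONS = {
    "*": "\u2022",  # bullet
    "x": "\u2192",  # right arrow
    ":": "\u224D",  # equivalent
    "#": "\u2245",  # approximate
}


def expand_annotation(annot):
    """Return annot with the marker of each tab-indented line expanded."""
    if annot is None:  # an entry may have no annotation at all
        return None
    out = []
    for line in annot.split("\n"):
        # Assumption: the marker sits right after the tab and is followed
        # by a space, e.g. "\t* some note". '=' and unknown markers are kept.
        if len(line) >= 3 and line[0] == "\t" and line[2] == " ":
            line = "\t" + SUBSTITUTIONS.get(line[1], line[1]) + line[2:]
        out.append(line)
    return "\n".join(out)
```

A line such as `"\t* note"` becomes `"\t• note"`, while `"\t= alias"` is returned unchanged.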

With the default configure option chosen, this package will install one library
file, one header file, and one library man file. The library is 'libuninameslist',
and the header is `<uninameslist.h>`. You can access these twenty-six functions:
```c
/*  1 */ const char *uniNamesList_name(unsigned long uni);
/*  2 */ const char *uniNamesList_annot(unsigned long uni);
/*  3 */ const char *uniNamesList_NamesListVersion(void);
/* These functions are available in libuninameslist-0.4.20140731 and higher */
/*  4 */ int uniNamesList_blockCount(void);
/*  5 */ int uniNamesList_blockNumber(unsigned long uni);
/*  6 */ long uniNamesList_blockStart(int uniBlock);
/*  7 */ long uniNamesList_blockEnd(int uniBlock);
/*  8 */ const char *uniNamesList_blockName(int uniBlock);
/* These functions are available in libuninameslist-20180408 and higher */
/*  9 */ int uniNamesList_names2cnt(void);
/* 10 */ long uniNamesList_names2val(int count);
/* 11 */ int uniNamesList_names2getU(unsigned long uni);
/* 12 */ int uniNamesList_names2lnC(int count);
/* 13 */ int uniNamesList_names2lnU(unsigned long uni);
/* 14 */ const char *uniNamesList_names2anC(int count);
/* 15 */ const char *uniNamesList_names2anU(unsigned long uni);
/* These functions are available in libuninameslist-20200413 and higher */
/* 16 */ const char *uniNamesList_Languages(unsigned int lang);
/* 17 */ const char *uniNamesList_NamesListVersionAlt(unsigned int lang);
/* 18 */ const char *uniNamesList_nameAlt(unsigned long uni, unsigned int lang);
/* 19 */ const char *uniNamesList_annotAlt(unsigned long uni, unsigned int lang);
/* 20 */ int uniNamesList_nameBoth(unsigned long uni, unsigned int lang, const char **str0, const char **str1);
/* 21 */ int uniNamesList_annotBoth(unsigned long uni, unsigned int lang, const char **str0, const char **str1);
/* 22 */ int uniNamesList_blockCountAlt(unsigned int lang);
/* 23 */ long uniNamesList_blockStartAlt(int uniBlock, unsigned int lang);
/* 24 */ long uniNamesList_blockEndAlt(int uniBlock, unsigned int lang);
/* 25 */ const char *uniNamesList_blockNameAlt(int uniBlock, unsigned int lang);
/* 26 */ int uniNamesList_blockNumberBoth(unsigned long uni, unsigned int lang, int *bn0, int *bn1);
```

For backwards compatibility with older programs, the name string can still be
read directly from the array:
```c
UnicodeNameAnnot[(uni>>16)&0x1f][(uni>>8)&0xff][uni&0xff].name
```

while the annotation string is:
```c
UnicodeNameAnnot[(uni>>16)&0x1f][(uni>>8)&0xff][uni&0xff].annot
```

The name string is in ASCII, while the annotation string is in UTF-8. The
annotation string is intended to be modified slightly before display: any `*`
character that immediately follows a tab at the start of a line should be
converted to a bullet character, and likewise for the other markers listed
earlier.
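
The block functions in the list above (4 to 8) map a code point to a block number and a block number to its start, end and name. As an illustration of that mapping only, here is a Python sketch over a toy block table. The names `BLOCKS` and `block_number` are hypothetical, the table is deliberately truncated to three real Unicode blocks, and returning -1 for "no block" is this sketch's own convention. The library's internal lookup may work differently.

```python
import bisect

# Toy block table: (start, end, name), sorted by start.
BLOCKS = [
    (0x0000, 0x007F, "Basic Latin"),
    (0x0080, 0x00FF, "Latin-1 Supplement"),
    (0x0100, 0x017F, "Latin Extended-A"),
]
_STARTS = [start for start, _end, _name in BLOCKS]


def block_number(uni):
    """Return the index of the block containing uni, or -1 if none does."""
    # Find the last block whose start is <= uni, then check its end.
    i = bisect.bisect_right(_STARTS, uni) - 1
    if i >= 0 and BLOCKS[i][0] <= uni <= BLOCKS[i][1]:
        return i
    return -1
```

With this table, `BLOCKS[block_number(0xE9)][2]` gives `"Latin-1 Supplement"`.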

If you choose to install the second library as well, then you will need to
use: './configure --enable-frenchlib'

This library maintains the same 'name' and 'annot' structure, but has function
names with FR so that it is possible to open both libraries at the same time.
The header file for the French library is `<uninameslist-fr.h>`

This library will also be linked to the main libuninameslist so that it can
be used through the main library (as lang=1) for functions 16 to 26.

NOTE: If you ran 'make' after running './configure' earlier, you will need to
run 'make clean' to clear out the earlier libuninameslist library, which was
built without knowledge of the additional library.
```bash
$ make clean
$ make
$ sudo make install
```
Users with smaller systems, or who want slightly smaller libraries and have
strip available, can instead run:
```bash
$ sudo make install-strip
```

Installation and Build Instructions
-----------------------------------

Download a tagged release version from https://github.com/fontforge/libuninameslist/releases
```bash
$ wget https://github.com/fontforge/libuninameslist/archive/20200413.tar.gz
$ tar -xzf 20200413.tar.gz
$ cd libuninameslist-20200413
```

or download the latest HEAD from github:
```bash
$ git clone https://github.com/fontforge/libuninameslist.git
$ cd libuninameslist
```

Then build and install the library:
```bash
$ autoreconf -i
$ automake
$ ./configure
$ make
$ sudo make install
```

If you need to also include libuninameslist-fr, you will want to use:
```bash
$ ./configure --help
$ ./configure --enable-frenchlib
$ make clean
$ make
$ sudo make install
```

NOTE: Some distros and operating systems may require you to run 'ldconfig' so
that LibUniNamesList is recognized, unless you reboot before loading another
program that depends on it. To do this, you may need to run 'ldconfig' in
'su -' mode after you have done 'make install':
```bash
$ su -
# ldconfig
# exit
$
```

NOTE: Users who do not have autoconf and automake available will want to
download the '-dist-' version found in the releases directory.

Added Python Wrapper
--------------------

A 'uninameslist.py' Python wrapper is provided for users who want quick
NamesList.txt access from Python. To use it, first build and install the
library, then install the Python wrapper.
```bash
$ ./configure   # may need --prefix=/usr; use --help to see options
$ make clean
$ make
$ su
# make install
# cd py
# (May require sudo or su if you're not using a virtualenv)
# python setup.py install
# exit
$
```

The build system can optionally also build installable wheels of the package.
To do this, pass `--enable-pylib`. Optionally, also set the `PYTHON` environment
variable to configure which python to use. The configured python must have `pip`,
`setuptools` and the `wheel` packages installed.
```bash
$ autoreconf -fiv
$ PYTHON=python2 ./configure --enable-pylib
$ make
$ su
# make install
# pip install py/dist/*.whl
# exit
$
```

Note: some operating systems may need './configure --prefix=/usr'.

The Python wrapper exposes the following library functions and symbols:

- **version**: documents the version of **libuninameslist**
- **name(_char_)**: returns the Unicode character name
- **name2(_char_)**: returns the Unicode normative alias if one is defined to correct a character name, else just the name
- **charactersWithName2**: string holding all characters with normative aliases
- **annotation(_char_)**: returns all Unicode annotations, including aliases and cross-references, as provided by NamesList.txt
- **block(_char_)**: returns the Unicode block that a character is in; a block can also be looked up by block name
- **blocks()**: a generator for iterating through all defined Unicode blocks
- **valid(_char_)**: returns whether the character is valid (defined in Unicode)
- **uplus(_char_)**: returns the Unicode codepoint for a character in the format U+XXXX for BMP and U+XXXXXX beyond that

Blocks can be iterated over to yield all characters encoded in them.
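
To make the U+XXXX / U+XXXXXX format described for **uplus** concrete, here is a small stand-alone Python sketch. It does not use the wrapper, and `format_uplus` is a hypothetical name; the wrapper's own implementation may differ.

```python
def format_uplus(char):
    """Format a single character as U+XXXX (BMP) or U+XXXXXX (beyond)."""
    cp = ord(char)
    # Four hex digits inside the Basic Multilingual Plane, six outside it.
    width = 4 if cp <= 0xFFFF else 6
    return f"U+{cp:0{width}X}"
```

For example, `format_uplus("A")` gives `"U+0041"`, and a character at code point 0x1F600 gives `"U+01F600"`.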


See Also
--------

- [FontForge Users](https://sourceforge.net/p/fontforge/mailman/fontforge-users/) - Discussion area for users.
- [FontForge](http://github.com/fontforge/fontforge/) - font editor application that this library was made for.
- [UMap](http://umap.sf.net/) - Find unicode characters and copy them to the clipboard.