libuninameslist – A Library of Unicode names and annotation data
================================================================
- [Description](#description)
- [Installation and Build Instructions](#installation-and-build-instructions)
- [Changelog](https://raw.github.com/fontforge/libuninameslist/master/ChangeLog)
- [License](https://raw.github.com/fontforge/libuninameslist/master/LICENSE)
- [Added Python Wrapper](#added-python-wrapper)
- [See Also](#see-also)
Description
-----------
This library is updated for Nameslist.txt version 13.0 and ListeDesNoms.txt
version 13.0, and includes the Python wrapper 'uninameslist.py'.

For the latest release, see: https://github.com/fontforge/libuninameslist/releases

Users who do not have autoconf or automake available should select the '-dist-'
version, which does not require running autoreconf or automake first.

The files at http://sourceforge.net/projects/libuninameslist/files/ are not kept
up to date.
Nameslist.txt
The Unicode Consortium provides [a file containing annotations on many Unicode
characters](http://www.unicode.org/Public/UNIDATA/NamesList.html). This library
contains a compiled version of this file so that programs can access this data
quickly and easily.
ListeDesNoms.txt
This is a separate file, a French translation of Nameslist.txt. It was outdated
for a period of time, but a group of developers has since updated it to
version 13. Contributors to that file are listed in the file itself.
These libraries contain very large (sparse) arrays with one entry for each
unicode code point (U+0000–U+10FFFF). Each entry contains two strings, a name
and an annotation. Either or both may be NULL. Both libraries also contain a
(much smaller) list of all the Unicode blocks.
```c
struct unicode_block {
int start, end;
const char *name;
};
struct unicode_nameannot {
const char *name, *annot;
};
extern const struct unicode_block UnicodeBlock[???];
#define UNICODE_NAME_MAX ???
#define UNICODE_ANNOT_MAX ???
extern const struct unicode_nameannot * const *const UnicodeNameAnnot[];
/* Index by: UnicodeNameAnnot[(uni>>16)&0x1f][(uni>>8)&0xff][uni&0xff] */
```
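The block table lends itself to a simple range lookup. The following is a minimal sketch only: `demo_blocks` is a hypothetical three-entry table standing in for `UnicodeBlock[]`, and `demo_block_number` is a hypothetical helper. It assumes the table is sorted by `start` and that ranges do not overlap; the library's own lookup may work differently.

```c
struct unicode_block {
    int start, end;
    const char *name;
};

/* Hypothetical excerpt for illustration; the real table lists every block. */
static const struct unicode_block demo_blocks[] = {
    { 0x0000, 0x007F, "Basic Latin" },
    { 0x0080, 0x00FF, "Latin-1 Supplement" },
    { 0x0100, 0x017F, "Latin Extended-A" },
};

/* Binary search over blocks sorted by start.
 * Returns the index of the block containing uni, or -1 if none does. */
static int demo_block_number(unsigned long uni)
{
    int lo = 0;
    int hi = (int)(sizeof demo_blocks / sizeof demo_blocks[0]) - 1;

    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;

        if (uni < (unsigned long)demo_blocks[mid].start)
            hi = mid - 1;               /* code point lies before this block */
        else if (uni > (unsigned long)demo_blocks[mid].end)
            lo = mid + 1;               /* code point lies after this block */
        else
            return mid;                 /* start <= uni <= end */
    }
    return -1;
}
```

With this sample table, `demo_block_number(0xE9)` selects the "Latin-1 Supplement" entry, and a code point outside all three ranges gives -1.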
To keep both libraries slightly smaller, annotation strings store a single
ASCII marker at the beginning of each line (directly after the TAB). Programs
are expected to expand these markers into the UTF-8 characters defined below:
```c
/* At the beginning of lines (after a tab) within the annotation string, a */
/* * should be replaced by a bullet U+2022 */
/* x should be replaced by a right arrow U+2192 */
/* : should be replaced by an equivalent U+224D */
/* # should be replaced by an approximate U+2245 */
/* = should remain itself */
```
With the default configure option chosen, this package will install one library
file, one header file, and one library man file. The library is 'libuninameslist',
and the header is `<uninameslist.h>`. You can access these twenty-six functions:
```c
1) const char *uniNamesList_name(unsigned long uni);
2) const char *uniNamesList_annot(unsigned long uni);
3) const char *uniNamesList_NamesListVersion(void);
/* These functions are available in libuninameslist-0.4.20140731 and higher */
4) int uniNamesList_blockCount(void);
5) int uniNamesList_blockNumber(unsigned long uni);
6) long uniNamesList_blockStart(int uniBlock);
7) long uniNamesList_blockEnd(int uniBlock);
8) const char *uniNamesList_blockName(int uniBlock);
/* These functions are available in libuninameslist-20180408 and higher */
9) int uniNamesList_names2cnt(void);
10) long uniNamesList_names2val(int count);
11) int uniNamesList_names2getU(unsigned long uni);
12) int uniNamesList_names2lnC(int count);
13) int uniNamesList_names2lnU(unsigned long uni);
14) const char *uniNamesList_names2anC(int count);
15) const char *uniNamesList_names2anU(unsigned long uni);
/* These functions are available in libuninameslist-20200413 and higher */
16) const char *uniNamesList_Languages(unsigned int lang);
17) const char *uniNamesList_NamesListVersionAlt(unsigned int lang);
18) const char *uniNamesList_nameAlt(unsigned long uni, unsigned int lang);
19) const char *uniNamesList_annotAlt(unsigned long uni, unsigned int lang);
20) int uniNamesList_nameBoth(unsigned long uni, unsigned int lang, const char **str0, const char **str1);
21) int uniNamesList_annotBoth(unsigned long uni, unsigned int lang, const char **str0, const char **str1);
22) int uniNamesList_blockCountAlt(unsigned int lang);
23) long uniNamesList_blockStartAlt(int uniBlock, unsigned int lang);
24) long uniNamesList_blockEndAlt(int uniBlock, unsigned int lang);
25) const char *uniNamesList_blockNameAlt(int uniBlock, unsigned int lang);
26) int uniNamesList_blockNumberBoth(unsigned long uni, unsigned int lang, int *bn0, int *bn1);
```
For backwards compatibility with older programs, the name string can still be
read directly from the table:
```c
UnicodeNameAnnot[(uni>>16)&0x1f][(uni>>8)&0xff][uni&0xff].name
```
while the annotation string is:
```c
UnicodeNameAnnot[(uni>>16)&0x1f][(uni>>8)&0xff][uni&0xff].annot
```
The name string is ASCII, while the annotation string is UTF-8. The annotation
is also meant to be post-processed before display: a `*` that immediately
follows a tab at the start of a line should be converted to a bullet character,
and likewise for the other markers listed earlier.
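The three-level index used for `UnicodeNameAnnot` splits a code point into a plane, a row within the plane, and a cell within the row. The sketch below only computes those indices; `una_index`, `una_split` and `una_in_range` are hypothetical names, and the range check reflects the assumption that callers should reject values above U+10FFFF before indexing the table.

```c
/* Hypothetical helper type holding the three table indices. */
struct una_index {
    unsigned plane; /* (uni >> 16) & 0x1f */
    unsigned row;   /* (uni >> 8) & 0xff  */
    unsigned cell;  /* uni & 0xff         */
};

/* Nonzero when uni is within the Unicode code space U+0000..U+10FFFF. */
static int una_in_range(unsigned long uni)
{
    return uni <= 0x10FFFFUL;
}

/* Split a code point exactly as the indexing expression does. */
static struct una_index una_split(unsigned long uni)
{
    struct una_index ix;

    ix.plane = (unsigned)((uni >> 16) & 0x1f);
    ix.row   = (unsigned)((uni >> 8) & 0xff);
    ix.cell  = (unsigned)(uni & 0xff);
    return ix;
}
```

For U+1F600 this gives plane 1, row 0xF6 and cell 0x00, so the legacy access would read `UnicodeNameAnnot[1][0xF6][0x00]`.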
To install the second (French) library as well, run
'./configure --enable-frenchlib'.
This library keeps the same 'name' and 'annot' structure, but its function
names contain FR so that both libraries can be opened at the same time.
The header file for the French library is `<uninameslist-fr.h>`.
The French library is also linked to the main libuninameslist, so it can be
used through the main library (as lang=1) with functions 16 to 26.
NOTE: If you already ran 'make' after an earlier './configure', run
'make clean' first to clear out the earlier libuninameslist library, which was
built without knowledge of the additional library.
```bash
$ make clean
$ make
$ sudo make install
```
If you have a smaller system or want slightly smaller libraries, and strip is
available, you can instead run:
```bash
$ sudo make install-strip
```
Installation and Build Instructions
-----------------------------------
Download a tagged release version from https://github.com/fontforge/libuninameslist/releases
```bash
$ wget https://github.com/fontforge/libuninameslist/archive/20200413.tar.gz
$ tar -xzf 20200413.tar.gz
$ cd libuninameslist-20200413
```
or download the latest HEAD from github:
```bash
$ git clone https://github.com/fontforge/libuninameslist.git
$ cd libuninameslist
```
Then build and install the library
```bash
$ autoreconf -i
$ automake
$ ./configure
$ make
$ sudo make install
```
If you need to also include libuninameslist-fr, you will want to use:
```bash
$ ./configure --help
$ ./configure --enable-frenchlib
$ make clean
$ make
$ sudo make install
```
NOTE: Some distributions and operating systems require you to run 'ldconfig'
before LibUniNamesList is recognized, unless you reboot before starting a
program that depends on it. To do this, run 'ldconfig' as root (for example
via 'su -') after you have done 'make install':
```bash
$ su -
# ldconfig
# exit
$
```
NOTE: Users who do not have autoconf and automake available will want to
download the '-dist-' version found in the releases directory.
Added Python Wrapper
--------------------
A 'uninameslist.py' Python wrapper is provided for users who want quick
NamesList.txt access from Python. To use it, first build and install the
library, then install the Python wrapper.
```bash
$ ./configure   # may need --prefix=/usr; use --help to see options
$ make clean
$ make
$ su
# make install
# cd py
# (May require sudo or su if you're not using a virtualenv)
# python setup.py install
# exit
$
```
The build system can optionally also build installable wheels of the package.
To do this, pass `--enable-pylib`. Optionally, also set the `PYTHON` environment
variable to configure which python to use. The configured python must have `pip`,
`setuptools` and the `wheel` packages installed.
```bash
$ autoreconf -fiv
$ PYTHON=python2 ./configure --enable-pylib
$ make
$ su
# make install
# pip install py/dist/*.whl
# exit
$
```
Note: some operating systems may need './configure --prefix=/usr'.

The Python wrapper exposes the following library functions and symbols:
* **version**: documents the version of **libuninameslist**
* **name(_char_)**: returns the Unicode character name
* **name2(_char_)**: returns the Unicode normative alias if defined for correcting a character name, else just the name
* **charactersWithName2**: string holding all characters with normative aliases
* **annotation(_char_)**: returns all Unicode annotations including aliases and cross-references as provided by NamesList.txt
* **block(_char_)**: returns the Unicode block a character is in, or by block name
* **blocks()**: a generator for iterating through all defined Unicode blocks
* **valid(_char_)**: returns whether the character is valid (defined in Unicode)
* **uplus(_char_)**: returns the Unicode codepoint for a character in the format U+XXXX for BMP and U+XXXXXX beyond that

Blocks can be iterated over to yield all characters encoded in them.
See Also
--------
- [FontForge Users](https://sourceforge.net/p/fontforge/mailman/fontforge-users/) - Discussion area for users.
- [FontForge](http://github.com/fontforge/fontforge/) - font editor application that this library was made for.
- [UMap](http://umap.sf.net/) - Find unicode characters and copy them to the clipboard.