File: TODO.txt

# -*-coding:utf-8;-*- ··················································
# Last Modified Time-stamp: "2015-10-16 03:34:44 MDT sburke@cpan.org"
#======================================================================

	      ~~  Text::Unidecode TODO file  ~~

TODO: make the table files be built from the Unicode character
database, transitioning from the values in the current table files.
I used to have a whole bunch of concise files that compiled to
the xXX.pm table files.  But they're in a file format I
haven't dealt with since 2001, so screw it all, I'm starting
from zero and it will be... well, you'll see.

(Presumably the Hangul and the Unihan blocks are outside the
scope of that whole ruckus of named characters.)
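
A rough sketch of what "built from the Unicode character database"
could look like, written in Python purely for illustration (the
module itself is Perl).  It assumes that compatibility decomposition
(NFKD) is an acceptable first guess for a table entry, which is an
assumption of this sketch and not how the current tables were made.
The names draft_entry and changed_entries, and the 256-entry list
standing in for one table file, are hypothetical.

```python
import unicodedata


def draft_entry(cp):
    """Guess an ASCII string for code point cp, or None if there is no guess.

    Assumption: NFKD decomposition with the non-ASCII leftovers dropped
    is a usable first draft.  It says nothing about scripts that have no
    decomposition (CJK, Hangul jamo names, and so on).
    """
    decomposed = unicodedata.normalize("NFKD", chr(cp))
    ascii_part = "".join(c for c in decomposed if ord(c) < 0x80)
    return ascii_part or None


def changed_entries(bank, current):
    """Compare drafts against one existing table of 256 strings.

    `bank` is the high part of the code point (cp >> 8); `current` is a
    hypothetical list holding the values now in that table file.
    Returns (code point, old value, draft value) for every difference,
    so the transition can be reviewed by hand instead of applied blindly.
    """
    changes = []
    for low, old in enumerate(current):
        cp = (bank << 8) | low
        new = draft_entry(cp)
        if new is not None and new != old:
            changes.append((cp, old, new))
    return changes
```

The point of returning differences is that the values in the current
table files stay authoritative until someone looks at each change.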


TODO GTD:
Keep plugging in suggestions from Tomaž Šolc's message


TODO:
Figure out how to courteously declare dependencies in my Makefile.PL

TODO:
Bundle with a "unidecode" util that calls a routine in Unidecode.pm?
Or maybe just show it as a one-liner in the POD?

TODO:
Also show an example of using it with iconv.

TODO:
Plow through more bug reports, applying patches etc,
especially stuff from that nice Tomaž Šolc man.

TODO:

***DEFINITELY*** Make it handle stuff in Astral Plane (over U+FFFF)
Take that plunge, or ascend into the plane, or whatever.  DOIT.
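
A minimal sketch of how a per-256-character table lookup could extend
past U+FFFF, in Python for illustration only.  It assumes the tables
stay keyed by the high part of the code point, as the xXX.pm naming
suggests; the three-hex-digit names for astral banks (such as "x1f6")
are a made-up convention for this sketch, not something the module
defines.

```python
def split_code_point(cp):
    """Split a code point into (bank, offset) for a 256-entry table."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("not a Unicode code point: %r" % (cp,))
    return cp >> 8, cp & 0xFF


def table_name(cp):
    """Hypothetical table name for the bank holding cp.

    Banks 0x00-0xFF keep the two-digit form; banks above 0xFF simply
    get more hex digits.
    """
    bank, _offset = split_code_point(cp)
    return "x%02x" % bank
```

With this split, U+00E9 lands in bank 0x00 at offset 0xE9, and
U+1F600 lands in bank 0x1F6 at offset 0x00.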

TODO:

Currently, all the files are in Unix newline format (LF).
Maybe CR+LF is technically more universal, and I could switch
to that.  But so far I've gotten no complaints that I should
go do that.

TODO:

THEN re-run the thing that reads the Unihan database and generates
all the Text/Unidecode/__.pm files.
(Do the hyperspace handling first, because lots of Unihan
stuff is up there.)
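
For reference, a sketch of reading one line of Unihan data, in Python
for illustration only.  The Unihan files are tab-separated, one
"U+XXXX, field, value" triple per line, with "#" comment lines.
Using the kMandarin field and dropping the tone marks is an
assumption of this sketch; the existing generator may have picked
its readings differently.

```python
import unicodedata


def parse_unihan_line(line):
    """Return (code point, field, value), or None for blanks and comments."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    code, field, value = line.split("\t", 2)
    return int(code[2:], 16), field, value  # code looks like "U+4E00"


def strip_tone_marks(reading):
    """Drop combining marks from a reading such as a pinyin syllable.

    Caveat: this removes every combining mark, so a u-with-diaeresis
    also collapses to plain "u".
    """
    decomposed = unicodedata.normalize("NFD", reading)
    return "".join(c for c in decomposed if not unicodedata.combining(c))
```

Because the code point is parsed as a plain hex number, entries above
U+FFFF come through the same way as the rest.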

TODO:

Look to see whether I need to deal with the Arabic stuff in
the U+FBxx etc blocks.  I thought it was always just
font-internal stuff, but I'm starting to suspect that it may
be encountered in the real world.
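
One possible shortcut, sketched in Python for illustration only: the
Arabic presentation forms carry compatibility decompositions in the
Unicode character database, so they can be folded back to the base
letters and then looked up in whatever the ordinary Arabic table
says.  Whether the folded result is the right transliteration in
every case is an open question; fold_presentation_form is a
hypothetical helper name.

```python
import unicodedata


def fold_presentation_form(text):
    """Replace presentation forms with their base letters via NFKC."""
    return unicodedata.normalize("NFKC", text)


# U+FE8D ARABIC LETTER ALEF ISOLATED FORM folds to U+0627 ARABIC LETTER ALEF.
alef = fold_presentation_form("\uFE8D")

# U+FEFB (the lam-alef ligature, isolated form) folds to two letters.
lam_alef = fold_presentation_form("\uFEFB")
```

Ligatures come back as more than one letter, so a table entry built
this way may be longer than the entries for single letters.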

TODO:

Of course:
- Check for new glyphs in existing tables.
- Look at whole new tables (like the Philippine scripts) in
normal space (x <= U+FFFF)
- Look at the wild wilderness in hyperspace (x > U+FFFF)
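
The first two checks above could be roughed out like this, in Python
for illustration only.  It treats "has a character name" as a
stand-in for "is assigned", which misses unnamed control characters
and depends on the Unicode version bundled with the interpreter.
The dict standing in for a transliteration table is hypothetical.

```python
import unicodedata


def assigned_code_points(first, last):
    """Code points in first..last (inclusive) that have a character name."""
    return [cp for cp in range(first, last + 1)
            if unicodedata.name(chr(cp), None) is not None]


def uncovered(first, last, table):
    """Assigned code points in the range with no entry in `table`.

    `table` is a hypothetical dict mapping code point to replacement.
    """
    return [cp for cp in assigned_code_points(first, last)
            if cp not in table]
```

For example, the Tagalog block starts at U+1700, so a table that only
covers U+1700 would report U+1701 and U+1702 as uncovered.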


======================================================================
======================================================================

########################################################################
Below here are TODOs from the Unicode version in 2001.  Yes, that long ago.


============================== BLOCK 09 ==============================

What's an isshar? (09FA = "bengali isshar")


============================== BLOCK 0b ==============================

What's an isshar?  (0B70 = "oriya isshar")


============================== BLOCK 0e ==============================

What is 0E4C = "thai character thanthakhat" ?

What is 0E4E = "thai character yamakkan" ?


============================== BLOCK 0f ==============================

Various questions to do with Tibetan (0f00-0fff)...

A lot of these characters end up as "".  What to do with them?

How to represent these Astrological signs, 0F15-0F1F ?

What is a 0F38 = "Tibetan mark Che Mgo" ?

Should I leave "Marks and Signs" (0F82-0F87) as ""?

What to do with "Transliteration head letters" (0F88-0F8B) ?


============================== BLOCK 11 ==============================

Various Hangul components need checking:

What are chitueumsios, chitueumssangsios,
ceongchieumsios, and ceongchieumssangsios?

Is "Z" a good transliteration for pansios?

I'm using "N" for yesieung and kapyeoun both.  Is this right?

What are chitueumcieuc, chitueumssangcieuc, ceongchieumcieuc,
ceongchieumssangcieuc, chitueumchieuch, and ceongchieumchieuch?

Is "kapyeounphieuph" best transliterated as "Np" or "pN"?  The same
question applies to kapyeounrieul, kapyeounmieum, kapyeounpieup,
and kapyeounssangpieup.

I'm using "Q" for yeorinhieuh, apparently an archaic glottal
stop character.  Is that right?


============================== BLOCK 14 ==============================

How to transliterate 0x1426,
AKA "canadian syllabics final double short vertical strokes"?

How to transliterate 0x1429, AKA "canadian syllabics final plus"?


============================== BLOCK 16 ==============================

Fact-check the Ogham and Runes.

What are eabhadh, or, uilleann, ifin, eamhancholl, and peith (1695-169A)?


============================== BLOCK 18 ==============================

What's 180A = "Mongolian nirugu" ?


============================== BLOCK 31 ==============================

I leave the Kaeriten (3190-319F) as null-string.  Is that good?


============================== BLOCK fb ==============================

Arabic Presentation Forms-A (FB50-FDFF) -- do I need to
do these, or are they never actually found in text files?


============================== BLOCK fe ==============================

Arabic Presentation Forms-B (FE70-FEFF) -- do I need to
do these, or are they never actually found in text files?


======================================================================
(end)