1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174
|
#!/usr/bin/perl
# Generate Arabic kmap for Yudit from shape.pl
#
@full_date = localtime(time);
$year = $full_date[5] + 1900;
$mon = $full_date[4] + 1;
$mday = $full_date[3];
$date = sprintf ("%4d-%02d-%02d", $year, $mon, $mday);
print <<EOD;
// This kmap file was generated from
// The Experimental Arabic.kmap
// 1998-06-18 Roman Czyborra\@cs.tu-berlin.de
// By Gaspar Sinai's perl script $0 at $date.
//
EOD
#
# First the data.
#
%NAMES=();
%KEYS=();
while (<DATA>)
{
chomp;
@_ = split ('~', $_);
$h = $_[1];
$h =~ s/0x(.*)/$1/go;
$h = hex ($h);
$k = sprintf ("%08X", $h);
print "\"$_[0]=$_[1]\", // " . &toUtf8 ($h) . " $_[2]\n";
$NAMES{$k} = $_[2];
$KEYS{$k} = $_[0];
}
print <<EOD;
//
// Shaping part using shape.mys. Autogenerated.
//
EOD
while (<>)
{
chomp;
if (/(.*) -> [0-9A-F]+ [0-9A-F]+ [0-9A-F]+ [0-9A-F]+/)
{
$first = $1;
@fspl = split(' ', $first);
if ($#fspl < 0)
{
print STDERR "CAN not split $first \n";
next;
}
$isdefined=1;
$allkey = "\"";
$alleq = "=";
$allcomm = "";
$allutf = "";
$space="";
# We care about ligatures only.
next if ($#fspl <= 0);
for (@fspl)
{
if (!defined ($KEYS{$_}))
{
# print STDERR "$_ in @fspl is not defined...\n";
$isdefined=0;
last;
}
$defin = $NAMES{$_};
$defin =~ s/ARABIC LETTER //;
$allkey .= $space; $allkey .= $KEYS{$_};
$allcomm .= $space; $allcomm .= $defin;
$allutf .= &toUtf8 (hex($_));
$alleq .= $space; $alleq .= sprintf ("0x%04x", hex ($_));
$space=" ";
}
if ($isdefined == 1)
{
print $allkey . $alleq . "\", " . "// " . $allutf . " " . $allcomm . "\n";
}
}
}
print <<EOD;
//
// End of shaping part. Autogenerated.
//
EOD
exit (0);
sub toUtf8
{
$str = $_[0];
if ($_[0] >= 0x800) {
$str = chr (0xe0 | ($_[0] >> 12));
$str .= chr (0x80 | (($_[0] >> 6) & 0x3f));
$str .= chr (0x80 | ($_[0] & 0x3f));
} elsif ($_[0] >= 0x80) {
$str = chr (0xc0 | ($_[0] >> 6));
$str .= chr (0x80 | ($_[0] & 0x3f));
} else {
$str .= chr ($str);
}
$str;
}
__END__
$~0x064C~ARABIC DAMMATAN~
%~0x064F~ARABIC DAMMA~
&~0x0651~ARABIC SHADDA~
'~0x064E~ARABIC FATHA~
*~0x0652~ARABIC SUKUN~
,~0x060C~ARABIC COMMA~
-~0x0640~ARABIC TATWEEL~
0x30~0x0660~ARABIC-INDIC DIGIT ZERO~
0x31~0x0661~ARABIC-INDIC DIGIT ONE~
0x32~0x0662~ARABIC-INDIC DIGIT TWO~
0x33~0x0663~ARABIC-INDIC DIGIT THREE~
0x34~0x0664~ARABIC-INDIC DIGIT FOUR~
0x35~0x0665~ARABIC-INDIC DIGIT FIVE~
0x36~0x0666~ARABIC-INDIC DIGIT SIX~
0x37~0x0667~ARABIC-INDIC DIGIT SEVEN~
0x38~0x0668~ARABIC-INDIC DIGIT EIGHT~
0x39~0x0669~ARABIC-INDIC DIGIT NINE~
;~0x061B~ARABIC SEMICOLON~
?~0x061F~ARABIC QUESTION MARK~
@~0x0621~ARABIC LETTER HAMZA~
A~0x0670~ARABIC LETTER SUPERSCRIPT ALEF~
^~0x064B~ARABIC FATHATAN~
_~0x064D~ARABIC KASRATAN~
`~0x0650~ARABIC KASRA~
a~0x0627~ARABIC LETTER ALEF~
aB~0x0625~ARABIC LETTER ALEF WITH HAMZA BELOW~
aH~0x0623~ARABIC LETTER ALEF WITH HAMZA ABOVE~
aM~0x0622~ARABIC LETTER ALEF WITH MADDA ABOVE~
b~0x0628~ARABIC LETTER BEH~
c~0x0635~ARABIC LETTER SAD~
d~0x062F~ARABIC LETTER DAL~
dD~0x0636~ARABIC LETTER DAD~
dK~0x0630~ARABIC LETTER THAL~
e~0x0639~ARABIC LETTER AIN~
f~0x0641~ARABIC LETTER FEH~
g~0x062C~ARABIC LETTER JEEM~
gF~0x06AF~ARABIC LETTER GAF~
h~0x0647~ARABIC LETTER HEH~
hH~0x0681~ARABIC LETTER HAH WITH HAMZA ABOVE~
hK~0x062D~ARABIC LETTER HAH~
i~0x063A~ARABIC LETTER GHAIN~
j~0x0649~ARABIC LETTER ALEF MAKSURA~
k~0x0643~ARABIC LETTER KAF~
l~0x0644~ARABIC LETTER LAM~
m~0x0645~ARABIC LETTER MEEM~
n~0x0646~ARABIC LETTER NOON~
p~0x067E~ARABIC LETTER PEH~
q~0x0642~ARABIC LETTER QAF~
r~0x0631~ARABIC LETTER REH~
s~0x0633~ARABIC LETTER SEEN~
S~0x0634~ARABIC LETTER SHEEN~
t~0x062A~ARABIC LETTER TEH~
tC~0x0686~ARABIC LETTER TCHEH~
tJ~0x0637~ARABIC LETTER TAH~
tK~0x062B~ARABIC LETTER THEH~
tM~0x0629~ARABIC LETTER TEH MARBUTA~
v~0x06A4~ARABIC LETTER VEH~
w~0x0648~ARABIC LETTER WAW~
wH~0x0624~ARABIC LETTER WAW WITH HAMZA ABOVE~
x~0x062E~ARABIC LETTER KHAH~
y~0x064A~ARABIC LETTER YEH~
yH~0x0626~ARABIC LETTER YEH WITH HAMZA ABOVE~
z~0x0632~ARABIC LETTER ZAIN~
zH~0x0638~ARABIC LETTER ZAH~
zJ~0x0698~ARABIC LETTER JEH~
|