Abstract
This document is about Yudit's shaping support, including Arabic.
The good news is that shaping support was possible without
breaching Yudit's main philosophy: binary compatibility. For
Arabic this means that if you read a document with no presentation
forms, it is saved without presentation forms when the document is
written back to disk.
An extra field has been added to the glyph info, preceding the
pre-composed/presentation field. It can take four values:
isolated, initial, medial, final.
These values are usually shown in '[' .. ']', indicating that this is NOT
the value that will be saved on disk; the presentation glyph value
will change as shaping takes place. If the value is not in brackets,
the glyph was already saved as a presentation form: the value is
constant and Yudit will not calculate a new one.
Yudit still shows the values that should actually be used instead of
the presentation character, to allow auto-shaping.
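The four values can be illustrated with a minimal Python sketch. This is
not Yudit's implementation: the joining tables below are a small,
hand-picked subset assumed for illustration, and the names DUAL, RIGHT,
TRANSPARENT and joining_forms are hypothetical. A letter joins to its
predecessor only if that predecessor can join forward, and marks such as
fathatan are skipped when looking for neighbours.

```python
# Assumed joining classes for a handful of letters (illustrative subset only).
DUAL = {"\u0628", "\u0639", "\u0644", "\u0645", "\u0647"}  # beh, ain, lam, meem, heh
RIGHT = {"\u0623", "\u0627"}  # alef with hamza above, alef: join only to the previous letter
TRANSPARENT = {"\u064B"}      # fathatan: a combining mark, ignored for joining


def _neighbour(text, i, step):
    """Return the nearest character in direction `step`, skipping transparent marks."""
    i += step
    while 0 <= i < len(text) and text[i] in TRANSPARENT:
        i += step
    return text[i] if 0 <= i < len(text) else None


def joining_forms(text):
    """Return isolated/initial/medial/final per character (None for non-letters).

    `text` is in logical (typing) order, as stored on disk.
    """
    forms = []
    for i, ch in enumerate(text):
        if ch not in DUAL and ch not in RIGHT:
            forms.append(None)
            continue
        prev_ch = _neighbour(text, i, -1)
        next_ch = _neighbour(text, i, +1)
        # Only a dual-joining predecessor can connect forward to this letter.
        joins_prev = prev_ch in DUAL
        # Only a dual-joining letter can connect forward to its successor.
        joins_next = ch in DUAL and (next_ch in DUAL or next_ch in RIGHT)
        if joins_prev and joins_next:
            forms.append("medial")
        elif joins_prev:
            forms.append("final")
        elif joins_next:
            forms.append("initial")
        else:
            forms.append("isolated")
    return forms
```

For example, joining_forms("\u0623\u0647\u0644\u0627") labels the first
four letters of the sample text in the next section, one with each of
the four values.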
Example
Let's take the example from Roman Czyborra's arabjoin program,
written character by character, separated by spaces.
! لم ا ع ل ا لا ًب ه أ
If you take out the spaces we get:
أهلا ًبالعالم!
I can even reverse the game (but it does not make any sense):
أهلا ًبالعالم!
If you wonder how I did it: I changed the direction to '<', then I
selected the text and pressed the direction button again. With the
keyboard: <alt>..select-with-arrow..<d>.
In Roman's original, using presentation forms:
ﺃﻫﻼ ًﺑﺎﻟﻌﺎﻟﻢ!
or reversed
ﺃﻫﻼ ًﺑﺎﻟﻌﺎﻟﻢ!
There is one more glyph, because Yudit joined U+0644 and U+0645 into
the presentation form U+FC42, for better or worse.
Basically, if there is a presentation form, Yudit combines the glyphs
and puts them into one single box. When you enter multiple glyphs
by hand, one by one, they are put into separate boxes on the
screen. But when you re-read the file, or cut and paste, they may get
joined and share the presentation form's box.
This behavior is very similar to Yudit compositions. If you want
combining when you enter text from the keyboard, you need a
keyboard map that returns multiple characters.
Extra converters
I created a "shape" and a "deshape" converter for Yudit. "shape" does,
or at least is supposed to do, what arabjoin did: make presentation forms.
These glyphs are the frozen version of shaped characters. You can unfreeze
them with the "deshape" converter, which converts presentation forms back
to their original form.
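The idea behind "deshape" can be sketched in Python with the standard
unicodedata module. This is only an illustration under stated
assumptions, not the converter shipped with Yudit: the function name
deshape and the block ranges used here are choices made for the sketch.
It relies on the Unicode Character Database, where Arabic presentation
forms carry a compatibility decomposition tagged <isolated>, <initial>,
<medial> or <final>.

```python
import unicodedata

# Decomposition tags the Unicode Character Database uses for shaped Arabic forms.
_FORM_TAGS = {"<isolated>", "<initial>", "<medial>", "<final>"}


def _is_arabic_presentation_form(ch):
    """True for code points in Arabic Presentation Forms-A or -B."""
    cp = ord(ch)
    return 0xFB50 <= cp <= 0xFDFF or 0xFE70 <= cp <= 0xFEFF


def deshape(text):
    """Replace Arabic presentation forms with their base characters.

    Everything else, including characters with other compatibility
    decompositions, is passed through unchanged.
    """
    out = []
    for ch in text:
        parts = unicodedata.decomposition(ch).split()
        if _is_arabic_presentation_form(ch) and parts and parts[0] in _FORM_TAGS:
            # parts[1:] are the hexadecimal code points of the base characters.
            out.append("".join(chr(int(p, 16)) for p in parts[1:]))
        else:
            out.append(ch)
    return "".join(out)
```

With this sketch, deshape("\uFC42") yields U+0644 followed by U+0645,
the pair that was joined in the example above. Restricting the
replacement to the four form tags avoids the wider rewriting that a
full NFKC normalization would perform.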
Further work
You may want to:
- Add new keyboard maps.
- Work on stoolkit/SB_BiDi.cpp to help interface Yudit with Unicode bidi.
(Further reading: doc/Yudit-bidi.txt.)
- Help me to test Yudit. You won't believe it, but I don't understand
Arabic at all. I just did the programming and I copied a few
Arabic glyphs here....
- Sample text and docs are needed. I don't have much time for documentation.
- Arabic menu translations. Please read README.TXT; it tells you how
to add them.
Gaspar Sinai <gsinai@yudit.org>
2001-11-25