File: HOWTO-arabic.txt

Abstract

 This document is about Yudit's shaping support, including Arabic.
 The good news is that shaping support was possible without
 breaching Yudit's main philosophy: binary compatibility. For
 Arabic this means that if you read a document with no presentation
 forms, the same form will be saved when the document is written back
 to disk.

 An extra field has been added to the glyph info, preceding the
 pre-composed/presentation field. It can take four values:
    isolated, initial, medial, final.
 These values are usually shown in '[' .. ']', indicating that this is
 NOT the value that will be saved on disk; the presentation glyph value
 will change as shaping takes place. If the value is not in brackets,
 the glyph was already saved as a presentation form. In that case Yudit
 will not calculate new presentation values: the form is constant.
 Yudit still shows the values that should actually be used instead of
 the presentation character, to allow auto-shaping.
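
 The form field described above can be sketched as follows. This is a
 minimal Python illustration, not Yudit's actual code. The joining sets
 are a hand-made assumption that covers only the letters of the sample
 text used in this document (the full data is in Unicode's
 ArabicShaping.txt), and the names computed_form and form_field are
 hypothetical.

```python
# Sketch only: NOT Yudit's implementation. The joining sets are an
# assumption covering just the letters of the sample text.
import unicodedata

DUAL = set("\u0628\u0644\u0645\u0647\u0639")   # beh, lam, meem, heh, ain
RIGHT = set("\u0627\u0623")                    # alef, alef with hamza above
TRANSPARENT = set("\u064b")                    # fathatan (combining mark)
FORMS = ("isolated", "initial", "medial", "final")


def neighbour(text, i, step):
    """Nearest character in direction `step`, skipping combining marks."""
    i += step
    while 0 <= i < len(text) and text[i] in TRANSPARENT:
        i += step
    return text[i] if 0 <= i < len(text) else ""


def computed_form(text, i):
    """Form of text[i] (logical order), or None if it does not join."""
    ch = text[i]
    if ch not in DUAL and ch not in RIGHT:
        return None
    prev, nxt = neighbour(text, i, -1), neighbour(text, i, +1)
    joins_prev = prev in DUAL            # previous letter connects forward
    joins_next = ch in DUAL and (nxt in DUAL or nxt in RIGHT)
    if joins_prev and joins_next:
        return "medial"
    if joins_prev:
        return "final"
    if joins_next:
        return "initial"
    return "isolated"


def form_field(text, i):
    """Field as displayed: bracketed if computed, bare if stored on disk."""
    tag = unicodedata.decomposition(text[i]).split(" ")[0].strip("<>")
    if tag in FORMS:
        return tag                       # presentation form read from file
    form = computed_form(text, i)
    return "[" + form + "]" if form else ""
```

 For the first four letters of the sample text (U+0623 U+0647 U+0644
 U+0627) this gives [isolated], [initial], [medial], [final]. For a
 stored presentation form such as U+FEEB it gives initial, without
 brackets.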

Example

 Let's take the example from Roman Czyborra's arabjoin program,
 character by character, separated with ' ' spaces.

   ‮!‬ ‮لم‬ ‮ا‬ ‮ع‬ ‮ل‬ ‮ا‬ ‮لا ًب‬ ‮ه‬ ‮أ‬

 If you take out the spaces we get:

   ‮أهلا ًبالعالم!‬

 I can even reverse the game (but it does not make any sense):

   أهلا ًبالعالم!

 If you wonder how I did it: I changed the direction to '<', then
 selected the text and pressed the direction button again. Or, with
 the keyboard: <alt>..select-with-arrow..<d>.

 In Roman's original, using presentation forms:
   
   ‮ﺃﻫﻼ ًﺑﺎﻟﻌﺎﻟﻢ!‬

 or reversed 

   ﺃﻫﻼ ًﺑﺎﻟﻌﺎﻟﻢ!

 There is one more glyph here, because in the earlier version Yudit
 joined U+0644 and U+0645 into the presentation form U+FC42, for
 better or worse.

 Basically, if there is a presentation form, Yudit combines the glyphs
 and puts them into one single box. When you enter multiple glyphs
 by hand, one by one, they are put into separate boxes on the
 screen. But when you re-read the file, or cut and paste, they may get
 joined and share the presentation form's box.
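
 The "one box if a presentation form exists" rule can be illustrated
 with the Unicode character database, where each Arabic presentation
 form carries a tagged decomposition such as "<isolated> 0644 0645".
 The Python sketch below is an assumption about how such a lookup could
 work, not Yudit's actual table; ligature_table and boxes are
 hypothetical names.

```python
# Sketch only: builds a ligature lookup from Python's unicodedata module.
import unicodedata

FORM_TAGS = ("<isolated>", "<initial>", "<medial>", "<final>")


def ligature_table():
    """Map (form, base letters) -> presentation form character."""
    table = {}
    # Arabic Presentation Forms-A and Forms-B blocks.
    for block in (range(0xFB50, 0xFE00), range(0xFE70, 0xFF00)):
        for cp in block:
            parts = unicodedata.decomposition(chr(cp)).split()
            # Keep only entries that decompose into two or more characters.
            if len(parts) > 2 and parts[0] in FORM_TAGS:
                base = "".join(chr(int(p, 16)) for p in parts[1:])
                table.setdefault((parts[0].strip("<>"), base), chr(cp))
    return table


TABLE = ligature_table()


def boxes(form, letters):
    """One box if a presentation form exists, else one box per letter."""
    glyph = TABLE.get((form, letters))
    return [glyph] if glyph else list(letters)
```

 With this table, boxes("isolated", "\u0644\u0645") yields the single
 glyph U+FC42, while a pair with no presentation form stays in two
 boxes.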

 Yudit compositions behave in a very similar way. If you want
 combining to happen when you enter text from the keyboard, you need a
 keyboard map that returns multiple characters.

Extra converters

 I created a "shape" and a "deshape" converter for Yudit. "shape" does,
 or at least is supposed to do, what arabjoin did: make presentation forms.
 These glyphs are the frozen version of shaped characters. You can unfreeze
 them with the "deshape" converter, which converts presentation forms back
 to their original form.
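
 What a "deshape" step does can be sketched in Python using the
 decompositions in the Unicode character database. This is only an
 illustration under that assumption, not the code of Yudit's converter,
 and the function name deshape here is a stand-in.

```python
# Sketch only: replaces Arabic presentation forms by their base letters.
import unicodedata

FORM_TAGS = ("<isolated>", "<initial>", "<medial>", "<final>")


def deshape(text):
    """Convert presentation forms back to the unshaped characters."""
    out = []
    for ch in text:
        parts = unicodedata.decomposition(ch).split()
        if parts and parts[0] in FORM_TAGS:
            # e.g. "<isolated> 0644 0645" -> U+0644 U+0645
            out.append("".join(chr(int(p, 16)) for p in parts[1:]))
        else:
            out.append(ch)               # not a presentation form: keep
    return "".join(out)
```

 For example, deshape applied to U+FE83 U+FEEB U+FEFC returns
 U+0623 U+0647 U+0644 U+0627, and text without presentation forms
 passes through unchanged.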

Further work

 You may want to:
 - Add new keyboard maps.
 - Work on stoolkit/SB_BiDi.cpp to help interface Yudit with Unicode bidi.
   (Further reading: doc/Yudit-bidi.txt.)
 - Help me to test Yudit. You won't believe it, but I don't understand
   Arabic at all. I just did the programming and copied a few
   Arabic glyphs here.
 - Provide sample text and docs. I don't have much time for documentation.
 - Add Arabic menu translations. Please read README.TXT; it tells you how
   to add them.

Gaspar Sinai <gsinai@yudit.org>
2001-11-25