#
# Gaspar Sinai Tokyo 1999-12-03
#
Binary Map Formats
Sn = n long 8 bit unsigned char string
U1 = 8 bit unsigned integer
U2 = 16 bit unsigned integer network order
U4 = 32 bit unsigned integer network order
I4 = 32 bit signed integer network order
ARRAY16 = 16 bit unsigned integer network order compressed array
The size of the array:
(LOW_MAX - LOW_MIN + 1) * (HIGH_MAX - HIGH_MIN + 1)
Where:
LOW_MIN = min (code[n]%256)
LOW_MAX = max (code[n]%256)
HIGH_MIN = min (code[n]/256)
HIGH_MAX = max (code[n]/256)
Accessing:
LOW = INPUT%256
HIGH = INPUT/256
Returning 0 means no mapping. It also means
that 0 cannot be mapped to or from.
There is no mapping if (LOW < LOW_MIN || LOW > LOW_MAX
|| HIGH < HIGH_MIN || HIGH > HIGH_MAX).
If there is a mapping, it is found at:
ARRAY[(HIGH - HIGH_MIN) * (LOW_MAX - LOW_MIN+1) + (LOW-LOW_MIN)]
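The ARRAY16 rules above can be sketched in a few lines of Python. This is only an illustration of the size and index formulas; the function names (bounds, build_array16, lookup) are hypothetical and do not come from the Yudit sources.

```python
def bounds(codes):
    """Return (HIGH_MIN, HIGH_MAX, LOW_MIN, LOW_MAX) for 16 bit codes."""
    highs = [c // 256 for c in codes]
    lows = [c % 256 for c in codes]
    return min(highs), max(highs), min(lows), max(lows)


def build_array16(mapping):
    """Build the compressed array from a dict {input_code: output_code}."""
    b = bounds(mapping.keys())
    high_min, high_max, low_min, low_max = b
    width = low_max - low_min + 1
    array = [0] * (width * (high_max - high_min + 1))
    for code, value in mapping.items():
        high, low = divmod(code, 256)
        array[(high - high_min) * width + (low - low_min)] = value
    return b, array


def lookup(b, array, code):
    """Return the mapped value, or 0 if there is no mapping."""
    high_min, high_max, low_min, low_max = b
    high, low = divmod(code, 256)
    if low < low_min or low > low_max or high < high_min or high > high_max:
        return 0
    return array[(high - high_min) * (low_max - low_min + 1) + (low - low_min)]
```

Codes inside the bounding rectangle that have no entry still occupy a slot holding 0, which is why 0 cannot be a mapped value.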
1 to 1 bmap
===========
This umap allows 1 to 1 mapping and reverse mapping of unsigned shorts.
S16: "YUDIT-UMAP 1.0"
S32: "alias name"
U2: SOFFSET start of data offset from beginning of file
#
# FREE AREA till SOFFSET
#
# Bounds of Local Code Array that maps to Unicode
# Start of data SOFFSET
U2: decode HIGH_MIN
U2: decode HIGH_MAX
U2: decode LOW_MIN
U2: decode LOW_MAX
# Bounds of Unicode Array that maps to Local Code
U2: encode HIGH_MIN
U2: encode HIGH_MAX
U2: encode LOW_MIN
U2: encode LOW_MAX
ARRAY16: decode
ARRAY16: encode
<---end--->
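A reader for the layout above might look like the following sketch. It assumes that the S16 and S32 strings are padded with NUL bytes and that the two arrays follow the bounds directly; neither detail is stated in this document. The names parse_umap and array_len are hypothetical.

```python
import struct

MAGIC = b"YUDIT-UMAP 1.0"


def array_len(high_min, high_max, low_min, low_max):
    """Number of U2 entries in an ARRAY16 with the given bounds."""
    return (low_max - low_min + 1) * (high_max - high_min + 1)


def parse_umap(data):
    """Parse a 1 to 1 bmap held in a bytes object."""
    # ">" selects network (big-endian) order without alignment padding.
    magic, alias, soffset = struct.unpack_from(">16s32sH", data, 0)
    if magic.rstrip(b"\0") != MAGIC:  # assumption: NUL padded string
        raise ValueError("not a 1 to 1 bmap")
    all_bounds = struct.unpack_from(">8H", data, soffset)
    dec_bounds, enc_bounds = all_bounds[:4], all_bounds[4:]
    pos = soffset + 16
    n_dec = array_len(*dec_bounds)
    decode = struct.unpack_from(">%dH" % n_dec, data, pos)
    pos += 2 * n_dec
    n_enc = array_len(*enc_bounds)
    encode = struct.unpack_from(">%dH" % n_enc, data, pos)
    return {
        "alias": alias.rstrip(b"\0"),
        "decode_bounds": dec_bounds,
        "decode": decode,
        "encode_bounds": enc_bounds,
        "encode": encode,
    }
```

The fixed part of the header is 16 + 32 + 2 = 50 bytes, so SOFFSET is at least 50; anything between byte 50 and SOFFSET is the free area.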
n to n bmap
===========
This maps a string of at most 255 bytes to a string of at most 256 bytes.
S16: "YUDIT-NtoN 1.0"
S32: "alias name"
U4 COMMENT_SIZE
U1 COMMENT[COMMENT_SIZE]
U4: MAP_TYPE -
0. undefined
1. kmap
2. fontmap
3. clustered kmap
U4: MAP_SIZE - the number of maps in this coder
U4: OFFSET[0] - the offsets pointing to CODE_AREAs
... note that we have one more...
U4: OFFSET[MAP_SIZE] - the offsets pointing to END
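The file header above could be read as in the following sketch. It assumes NUL padded strings and only returns the raw OFFSET values; it does not decide which position the offsets are relative to. The function name parse_nton_header is hypothetical.

```python
import struct


def parse_nton_header(data):
    """Parse the file header of an n to n bmap held in a bytes object."""
    magic, alias = struct.unpack_from(">16s32s", data, 0)
    if magic.rstrip(b"\0") != b"YUDIT-NtoN 1.0":  # assumption: NUL padded
        raise ValueError("not an n to n bmap")
    pos = 48
    (comment_size,) = struct.unpack_from(">I", data, pos)
    pos += 4
    comment = data[pos:pos + comment_size]
    pos += comment_size
    map_type, map_size = struct.unpack_from(">II", data, pos)
    pos += 8
    # There is one more offset than there are maps; the last marks the end.
    offsets = struct.unpack_from(">%dI" % (map_size + 1), data, pos)
    pos += 4 * (map_size + 1)
    return {
        "alias": alias.rstrip(b"\0"),
        "comment": comment,
        "map_type": map_type,
        "map_size": map_size,
        "offsets": offsets,
        "header_end": pos,
    }
```

Because of the extra entry, the length of map i is OFFSET[i+1] - OFFSET[i] without a special case for the last map.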
#
# Start of CODE AREA. Array references start here.
#
S32: "alias name"
U4 COMMENT_SIZE
U1 COMMENT[COMMENT_SIZE]
U1 DECODE - bit 0 is 0 if decode, 1 if encode (reverse = from unicode) map
U1 INPUT_BYTE_SIZE - This many bytes are supposed to form an input word (hint)
U1 OUTPUT_BYTE_SIZE - This many bytes are supposed to form an output word (hint)
U1 INPUT_BYTE_LENGTH - The size of the length indicator in data. 0,1,2 or 3
U1 OUTPUT_BYTE_LENGTH - The size of the length indicator in data 0,1,2 or 3
0=8bit, 1=16bit, 2=32bit, 3=64bit
U4 STATE_MACHINE - Index to the state machine. If zero, there is no state machine.
The state machine should come after the code area.
U4 SPARE - UNUSED. 0.
U4 CODE_SIZE - The size of the struct map.
U4 CODE_MAP[0] - points to the first element starting from DATA_AREA
.. note that we have 1 more element in this array!!!
U4 CODE_MAP[CODE_SIZE] - points to the end of last element
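A sketch of reading one CODE AREA header, field by field in the order listed above. It assumes the DATA_AREA begins immediately after the last CODE_MAP entry, as the layout suggests. The function name parse_code_area and the dictionary keys are hypothetical.

```python
import struct


def parse_code_area(data, start):
    """Parse one CODE AREA header that begins at byte offset `start`."""
    pos = start
    (alias,) = struct.unpack_from(">32s", data, pos)
    pos += 32
    (comment_size,) = struct.unpack_from(">I", data, pos)
    pos += 4
    comment = data[pos:pos + comment_size]
    pos += comment_size
    decode, in_size, out_size, in_len, out_len = struct.unpack_from(">5B", data, pos)
    pos += 5
    state_machine, _spare, code_size = struct.unpack_from(">3I", data, pos)
    pos += 12
    # CODE_MAP has CODE_SIZE + 1 entries; the last points past the final element.
    code_map = struct.unpack_from(">%dI" % (code_size + 1), data, pos)
    pos += 4 * (code_size + 1)
    return {
        "alias": alias.rstrip(b"\0"),
        "comment": comment,
        "is_encode": bool(decode & 1),  # bit 0: 0 = decode, 1 = encode
        "input_byte_size": in_size,
        "output_byte_size": out_size,
        "input_byte_length": in_len,
        "output_byte_length": out_len,
        "state_machine": state_machine,
        "code_size": code_size,
        "code_map": code_map,
        "data_area": pos,  # assumption: DATA_AREA follows CODE_MAP directly
    }
```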
#
# DATA_AREA array references start here.
#
unpadded struct {
Ui KEY_SIZE
Ui SUB_SIZE # The size of elements matched.
U1 [KEY_SIZE] KEY
Uo RESULT_SIZE
U1 [RESULT_SIZE] RESULT
U1 COMMENT_SIZE # A max 255 byte comment.
U1 [COMMENT_SIZE] COMMENT
} [CODE_SIZE]
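One element of the unpadded struct could be decoded as in the sketch below. It assumes that Ui and Uo are unsigned integers whose width is selected by INPUT_BYTE_LENGTH and OUTPUT_BYTE_LENGTH (0=8bit, 1=16bit, 2=32bit, 3=64bit), and that they are stored in network order like the other integers; the document does not state the byte order for these fields. The names read_uint and parse_record are hypothetical.

```python
# Width in bytes for each length indicator value.
WIDTH = {0: 1, 1: 2, 2: 4, 3: 8}


def read_uint(data, pos, length_code):
    """Read an unsigned integer of the indicated width; return (value, new_pos)."""
    n = WIDTH[length_code]
    return int.from_bytes(data[pos:pos + n], "big"), pos + n


def parse_record(data, pos, in_len, out_len):
    """Parse one record starting at `pos`; return (record, new_pos)."""
    key_size, pos = read_uint(data, pos, in_len)
    sub_size, pos = read_uint(data, pos, in_len)
    key = data[pos:pos + key_size]
    pos += key_size
    result_size, pos = read_uint(data, pos, out_len)
    result = data[pos:pos + result_size]
    pos += result_size
    comment_size = data[pos]  # always a single byte
    pos += 1
    comment = data[pos:pos + comment_size]
    pos += comment_size
    record = {"key": key, "sub_size": sub_size,
              "result": result, "comment": comment}
    return record, pos
```

Since the records are unpadded and variable in length, the CODE_MAP offsets are what make random access to element i possible.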
#
# State Machine (optional)
# The integers point to next state inside this state machine.
# If -1, reject.
# There is currently no implementation for this.
# The STATE_MACHINE can be added here or collectively at the end of
# this file.
U4[32] size - state machine size in 64 byte words. (states)
U4[16] state0
U4[16] state1
[..]
* Each state contains a nibble. (FB -> F one state, B another state.)
* Each state has an index of 30 bit integers pointing to the next state
or to the matched value.
* The upper 2 bits of each integer tell what it points to:
REJECT: 0 - points to nowhere
MORE: 1 - points to STATE_MACHINE
MATCH: 3 - points to CODE_MAP
* The match more is not used.
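Since the state machine has no implementation yet, the following is only one possible reading of the description above. It assumes each state is a table of 16 entries indexed by nibble, that the high nibble of a byte is consumed first, and that a MORE entry holds the number of the next state. All names (split_entry, make_entry, walk) are hypothetical.

```python
REJECT, MORE, MATCH = 0, 1, 3
INDEX_MASK = 0x3FFFFFFF  # lower 30 bits


def split_entry(entry):
    """Split a U4 entry into (upper 2 bits, 30 bit index)."""
    return entry >> 30, entry & INDEX_MASK


def make_entry(kind, index):
    """Combine a 2 bit kind and a 30 bit index into one U4 entry."""
    return (kind << 30) | (index & INDEX_MASK)


def walk(states, data):
    """Follow the nibbles of `data`; return a CODE_MAP index or None."""
    state = 0
    for byte in data:
        for nibble in (byte >> 4, byte & 0x0F):
            kind, index = split_entry(states[state][nibble])
            if kind == MATCH:
                return index
            if kind != MORE:
                return None  # REJECT, or the unused value 2
            state = index
    return None  # input ended before a match was reached
```

For the byte FB, the walk looks up entry F in state 0, follows it to the next state, and looks up entry B there.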
#
# FREE AREA till OFFSET[1]
#