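               Unicode Conversion Extension Module
                     version 0.5.3

                        Yoshida Masato

- Overview

This is an extension module for converting between ISO/IEC 10646
(Unicode) strings and Japanese strings.

The supported encodings are UCS-4, UTF-16, UTF-8, EUC-JP, and
CP932 (the Shift_JIS variant used on Windows).

It cannot detect encodings automatically.

Note that the conversion table between EUC-JP and Unicode differs
from the one used up to version 0.3.x.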


- Installation

The module has been tested with Ruby 1.6 and later. Ruby 1.6.7
or later is recommended.

Unpack the uconv source in a suitable directory:

  gzip -dc < uconv-version.tar.gz | tar xvf -
  cd uconv

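Because it contains conversion tables for EUC-JP and CP932, the
module is fairly large. If you use only one of EUC-JP or CP932,
commenting out USE_EUC or USE_SJIS in extconf.rb makes it
smaller. On Windows, defining USE_WIN32API makes the module use
the Win32 API for CP932 conversion, which also reduces its size.

Install it like any other extension module. If your Ruby supports
dynamic loading, the steps are as follows: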

  ruby extconf.rb
  make
  make install


- Usage

If the module was not statically linked when Ruby was built, run

  require "uconv"

before using it.
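
A minimal sketch of loading the module defensively. The helper name
euc_to_utf8 and the fallback to Ruby's built-in String#encode (Ruby 1.9
and later) are assumptions made for illustration; they are not part of
uconv.

```ruby
# Load uconv if it is installed; otherwise carry on without it.
begin
  require "uconv"
rescue LoadError
  # uconv is not available; euc_to_utf8 below falls back to String#encode.
end

# Hypothetical helper: convert an EUC-JP byte string to a UTF-8 string.
def euc_to_utf8(euc)
  if defined?(Uconv)
    Uconv.euctou8(euc).dup.force_encoding("UTF-8")
  else
    euc.dup.force_encoding("EUC-JP").encode("UTF-8")
  end
end

puts euc_to_utf8("\xA4\xA2".b)   # A4 A2 is HIRAGANA LETTER A in EUC-JP
```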


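- Module functions

  Except for u16swap (u2swap) and u4swap, all functions expect
  UTF-16 and UCS-4 strings to be little-endian.
  Functions that used to handle UCS-2 now handle UTF-16.

  Some functions treat every ZERO WIDTH NO-BREAK SPACE (U+FEFF)
  as a BYTE ORDER MARK (BOM) and remove it.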

             |                     To
        From |  EUC-JP    CP932     UTF-8    UTF-16    UCS-4
    ---------+------------------------------------------------
      EUC-JP |  \         -         euctou8  euctou16  -
       CP932 |  -         \         sjistou8 sjistou16 -
       UTF-8 |  u8toeuc   u8tosjis  \        u8tou16   u8tou4
      UTF-16 |  u16toeuc  u16tosjis u16tou8  u16swap   u16tou4
       UCS-4 |  -         -         u4tou8   u4tou16   u4swap


  utf16 = Uconv.u16swap(utf16)
  ucs2 = Uconv.u2swap(ucs2)
  utf16 = Uconv.u16swap!(utf16)
  ucs2 = Uconv.u2swap!(ucs2)
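    Swaps the bytes of a UTF-16 string. A little-endian string
    is converted to big-endian.
    The functions with a trailing ! modify the argument string in place.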

  ucs4 = Uconv.u4swap(ucs4)
  ucs4 = Uconv.u4swap!(ucs4)
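    Swaps the bytes of a UCS-4 string. The byte order 1234 is
    converted to 4321.
    The functions with a trailing ! modify the argument string in place.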
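
    What the 16-bit and 32-bit swaps described above do to the bytes
    can be sketched in pure Ruby. This is an illustration with
    hypothetical helper names (swap16, swap16!, swap32), not uconv's
    implementation.

```ruby
# Swap the two bytes of every 16-bit unit: read little-endian, write big-endian.
def swap16(bytes)
  bytes.unpack("v*").pack("n*")
end

# In-place variant, mirroring the behaviour of the "!" functions.
def swap16!(bytes)
  bytes.replace(swap16(bytes))
end

# Reverse the four bytes of every 32-bit unit (1234 becomes 4321).
def swap32(bytes)
  bytes.unpack("V*").pack("N*")
end

p swap16("\x42\x30".b).bytes           # => [48, 66]
p swap32("\x01\x02\x03\x04".b).bytes   # => [4, 3, 2, 1]
```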

  utf16 = Uconv.u8tou16(utf8)
  ucs2 = Uconv.u8tou2(utf8)
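    Converts a UTF-8 string to a UTF-16 string. An invalid UTF-8
    sequence raises an exception. A character outside the range
    that UTF-16 can represent (U-00000000 to U-0010FFFF) also
    raises an exception.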

  utf8 = Uconv.u16tou8(utf16)
  utf8 = Uconv.u2tou8(ucs2)
    Converts a UTF-16 string to a UTF-8 string. By default,
    ZWNBSP (U+FEFF) is removed. An invalid surrogate pair
    raises an exception.
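
    As a rough illustration of what counts as an invalid surrogate
    pair, the following pure-Ruby check walks a list of 16-bit code
    units. The function name is hypothetical, and uconv's own check
    may differ in detail.

```ruby
# Return true when every high surrogate (D800..DBFF) is immediately followed
# by a low surrogate (DC00..DFFF) and no low surrogate appears on its own.
def well_formed_utf16?(units)
  i = 0
  while i < units.length
    unit = units[i]
    if unit.between?(0xD800, 0xDBFF)
      following = units[i + 1]
      return false unless following && following.between?(0xDC00, 0xDFFF)
      i += 2
    elsif unit.between?(0xDC00, 0xDFFF)
      return false
    else
      i += 1
    end
  end
  true
end

p well_formed_utf16?([0xD83D, 0xDE00])   # => true  (a complete pair)
p well_formed_utf16?([0xD83D])           # => false (lone high surrogate)
```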

  utf8 = Uconv.u4tou8(ucs4)
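    Converts a UCS-4 string to a UTF-8 string. By default,
    ZWNBSP (U+FEFF) is removed.

  ucs4 = Uconv.u8tou4(utf8)
    Converts a UTF-8 string to a UCS-4 string. An invalid UTF-8
    sequence raises an exception.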
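
    The relationship between UTF-8 and little-endian UCS-4 can be
    sketched with Ruby's pack and unpack. The helper names are
    hypothetical, and this stand-in does not remove ZWNBSP the way
    u4tou8 does by default.

```ruby
# UTF-8 -> little-endian UCS-4. String#unpack("U*") raises ArgumentError
# when it meets a malformed UTF-8 sequence.
def utf8_to_ucs4le(utf8)
  utf8.unpack("U*").pack("V*")
end

# Little-endian UCS-4 -> UTF-8.
def ucs4le_to_utf8(ucs4)
  ucs4.unpack("V*").pack("U*")
end

p utf8_to_ucs4le("\u3042").bytes   # => [66, 48, 0, 0]
```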

  utf16 = Uconv.u4tou16(ucs4)
    Converts a UCS-4 string to a UTF-16 string. A character
    outside the range that UTF-16 can represent (U-00000000 to
    U-0010FFFF) raises an exception.
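
    The range limit follows from how UTF-16 encodes code points. A
    small sketch of the standard surrogate-pair arithmetic, with a
    hypothetical function name and RangeError chosen only for this
    illustration:

```ruby
# Map one code point to its UTF-16 code units.
def utf16_units(code_point)
  if code_point < 0 || code_point > 0x10FFFF
    raise RangeError, "code point cannot be represented in UTF-16"
  end
  return [code_point] if code_point < 0x10000

  offset = code_point - 0x10000
  [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)]
end

p utf16_units(0x3042)    # => [12354]
p utf16_units(0x1F600)   # => [55357, 56832], i.e. D83D DE00
```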

  ucs4 = Uconv.u16tou4(utf16)
    Converts a UTF-16 string to a UCS-4 string. An invalid
    surrogate pair raises an exception.

  euc  = Uconv.u16toeuc(utf16)
  euc  = Uconv.u2toeuc(ucs2)
    Converts a UTF-16 string to an EUC-JP string. If
    Uconv.unknown_unicode_handler is undefined, characters that
    cannot be converted become '?'.
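
    Ruby's built-in transcoder offers a comparable '?' substitution,
    which makes the effect easy to see without uconv. This is an
    analogy using String#encode, with a hypothetical helper name, not
    a call into this module.

```ruby
# Convert UTF-8 to EUC-JP, replacing characters that EUC-JP lacks with "?".
def to_euc_with_question_marks(utf8)
  utf8.encode("EUC-JP", undef: :replace, replace: "?")
end

# HIRAGANA LETTER A converts; U+1F600 has no EUC-JP mapping.
p to_euc_with_question_marks("\u3042\u{1F600}").bytes   # => [164, 162, 63]
```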

  utf16 = Uconv.euctou16(euc)
  ucs2 = Uconv.euctou2(euc)
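    Converts an EUC-JP string to a UTF-16 string.

  euc  = Uconv.u8toeuc(utf8)
    Converts a UTF-8 string to an EUC-JP string.
    Equivalent to u16toeuc(u8tou16(utf8)).

  utf8 = Uconv.euctou8(euc)
    Converts an EUC-JP string to a UTF-8 string.
    Equivalent to u16tou8(euctou16(euc)).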
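
    The same composition property can be observed with Ruby's
    String#encode as a stand-in: converting directly and converting
    through UTF-16 give the same UTF-8 text. The helper names are
    hypothetical.

```ruby
# EUC-JP -> UTF-8 in one step.
def euc_to_utf8_direct(euc)
  euc.dup.force_encoding("EUC-JP").encode("UTF-8")
end

# EUC-JP -> UTF-16LE -> UTF-8 in two steps.
def euc_to_utf8_via_utf16(euc)
  euc.dup.force_encoding("EUC-JP").encode("UTF-16LE").encode("UTF-8")
end

euc = "\xA4\xA2\xA4\xA4".b   # two hiragana letters in EUC-JP
p euc_to_utf8_direct(euc) == euc_to_utf8_via_utf16(euc)   # => true
```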

  sjis  = Uconv.u16tosjis(utf16)
  sjis  = Uconv.u2tosjis(ucs2)
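    Converts a UTF-16 string to a CP932 string. If
    Uconv.unknown_unicode_handler is undefined, characters that
    cannot be converted become '?'.

  utf16 = Uconv.sjistou16(sjis)
  ucs2 = Uconv.sjistou2(sjis)
    Converts a CP932 string to a UTF-16 string.

  sjis  = Uconv.u8tosjis(utf8)
    Converts a UTF-8 string to a CP932 string.
    Equivalent to u16tosjis(u8tou16(utf8)).

  utf8 = Uconv.sjistou8(sjis)
    Converts a CP932 string to a UTF-8 string.
    Equivalent to u16tou8(sjistou16(sjis)).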

  euc = Uconv.unknown_unicode_handler(unicode)
    ** deprecated **

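    Handler called when a UCS character cannot be converted
    during conversion from UTF-16 or UTF-8 to EUC-JP or CP932.
    The Unicode character code is passed as the argument.
    Return a string as the result. The handler is undefined
    by default.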

  euc = Uconv.unknown_unicode_euc_handler(unicode)
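    Handler called when a UCS character cannot be converted
    during conversion from UTF-16 or UTF-8 to EUC-JP.
    The Unicode character code is passed as the argument.
    Return a string as the result. The handler is undefined
    by default.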

  sjis = Uconv.unknown_unicode_sjis_handler(unicode)
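    Handler called when a UCS character cannot be converted
    during conversion from UTF-16 or UTF-8 to CP932.
    The Unicode character code is passed as the argument.
    Return a string as the result. The handler is undefined
    by default.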
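
    The handler contract above (character code in, replacement string
    out) can be sketched without uconv. Everything below is
    hypothetical: the handler is assumed to receive the code as an
    integer, the conversion loop uses Ruby's String#encode, and how a
    handler is actually registered with Uconv is not shown.

```ruby
# Hypothetical handler: turn an unconvertible code point into a
# numeric character reference.
def numeric_reference(code_point)
  format("&#x%X;", code_point)
end

# Stand-in conversion loop: convert character by character and ask the
# handler for replacement text whenever EUC-JP has no mapping.
def utf8_to_euc_with_handler(utf8, handler)
  utf8.each_char.map { |ch|
    begin
      ch.encode("EUC-JP")
    rescue Encoding::UndefinedConversionError
      handler.call(ch.ord).encode("EUC-JP")
    end
  }.join
end

euc = utf8_to_euc_with_handler("\u3042\u{1F600}", method(:numeric_reference))
p euc.bytes   # A4 A2 followed by the ASCII bytes of "&#x1F600;"
```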

  unicode = Uconv.unknown_euc_handler(euc)
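    Handler called when a character undefined in JIS X 0208 or
    JIS X 0212 is found during conversion from EUC-JP to UTF-16
    or UTF-8. An EUC-JP string of 1 to 3 bytes is passed as the
    argument. Return a 31-bit integer as the result. The handler
    is undefined by default.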

  unicode = Uconv.unknown_sjis_handler(sjis)
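    Handler called when a character undefined in CP932 is found
    during conversion from CP932 to UTF-16 or UTF-8. A CP932
    string of 1 or 2 bytes is passed as the argument. Return a
    31-bit integer as the result. The handler is undefined by
    default.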
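
    The reverse contract (byte string in, integer code out) can be
    sketched the same way. The names are hypothetical and Ruby's
    String#encode stands in for uconv. For simplicity the demo
    triggers the handler with a byte that is not valid EUC-JP at all,
    which is a slightly different case from an undefined JIS
    character.

```ruby
# Hypothetical handler: map any unknown byte sequence to U+FFFD.
def replacement_character(_bytes)
  0xFFFD
end

# Stand-in loop: decode EUC-JP character by character into code points,
# delegating anything Ruby cannot convert to the handler.
def euc_to_code_points(euc, handler)
  euc.dup.force_encoding("EUC-JP").each_char.map do |ch|
    begin
      ch.encode("UTF-8").ord
    rescue Encoding::UndefinedConversionError, Encoding::InvalidByteSequenceError
      handler.call(ch.b)
    end
  end
end

p euc_to_code_points("\xA4\xA2\xFF".b, method(:replacement_character))
# => [12354, 65533]
```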

  unicode = Uconv.euc_hook(euc)

  unicode = Uconv.sjis_hook(sjis)

  euc = Uconv.unicode_euc_hook(unicode)

  sjis = Uconv.unicode_sjis_hook(unicode)

  flag = Uconv::eliminate_zwnbsp
  Uconv::eliminate_zwnbsp = flag
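    Gets or sets the ZWNBSP removal flag. Specify true or false
    for flag. The default is true.
    When true, the u4tou8 and u16tou8 functions remove ZWNBSP
    characters. When false, ZWNBSP characters are preserved.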
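
    The visible effect of the flag being true can be imitated on a
    UTF-8 string in plain Ruby. The helper name is hypothetical, and
    this only mirrors the result, not how uconv does it.

```ruby
# Remove every ZERO WIDTH NO-BREAK SPACE (U+FEFF), wherever it occurs.
def strip_zwnbsp(utf8)
  utf8.delete("\uFEFF")
end

p strip_zwnbsp("\uFEFFabc\uFEFFdef")   # => "abcdef"
```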

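  flag = Uconv::shortest
  Uconv::shortest = flag
    Gets or sets the shortest-form flag. Specify true or false
    for flag. The default is true. When true, the u8to*
    functions raise an exception if the UTF-8 string is not in
    shortest form.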
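
    "Shortest form" means a code point must not be encoded with more
    bytes than it needs. A minimal sketch of that test for a single
    sequence follows. The function name is hypothetical, and it
    assumes the lead and continuation bit patterns are already valid.

```ruby
# Decode one UTF-8 sequence (an array of bytes) and report whether a
# shorter encoding of the same code point would have been possible.
def shortest_form?(seq)
  minimum = { 1 => 0x00, 2 => 0x80, 3 => 0x800, 4 => 0x10000 }
  lead = seq.first
  cp = case seq.length
       when 1 then lead
       when 2 then lead & 0x1F
       when 3 then lead & 0x0F
       when 4 then lead & 0x07
       else raise ArgumentError, "unsupported sequence length"
       end
  seq.drop(1).each { |byte| cp = (cp << 6) | (byte & 0x3F) }
  cp >= minimum[seq.length]
end

p shortest_form?([0x2F])         # => true  ("/" as one byte)
p shortest_form?([0xC0, 0xAF])   # => false ("/" padded to two bytes)
```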

  char = Uconv::replace_invalid
  Uconv::replace_invalid(char)
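    Gets or sets the replacement character used when an invalid
    byte sequence is found while converting a UTF-8, UTF-16, or
    UCS-4 string. When nil is specified, trying to convert an
    invalid byte sequence raises an exception. When a value
    other than nil is specified, invalid byte sequences are
    replaced with the character at that code point. The default
    is nil.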
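
    The two behaviours (raise on nil, substitute otherwise) resemble
    what Ruby's String#scrub (Ruby 2.1 and later) can do for UTF-8. A
    sketch under that analogy, with a hypothetical helper name and
    ArgumentError chosen only for this illustration:

```ruby
# Return the input as UTF-8, replacing invalid bytes with the character at
# replacement_code_point, or raising when replacement_code_point is nil.
def replace_invalid_utf8(bytes, replacement_code_point)
  str = bytes.dup.force_encoding("UTF-8")
  if replacement_code_point.nil?
    raise ArgumentError, "invalid byte sequence" unless str.valid_encoding?
    str
  else
    str.scrub([replacement_code_point].pack("U"))
  end
end

p replace_invalid_utf8("a\xFFb".b, 0xFFFD) == "a\uFFFDb"   # => true
```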


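- Notes

Earlier versions used a slightly modified version of the Unicode
Inc. conversion table, for compatibility with the Unicode fonts
on Windows. Since version 0.4 a separate CP932 conversion table
is provided, so the EUC-JP conversion table is the Unicode Inc.
table used as is.

Earlier versions converted both WAVE DASH [U+301C] and FULL WIDTH
TILDE [U+FF5E] to '〜' (EUC-JP: A1C1) when converting to EUC-JP.
From version 0.4 on, FULL WIDTH TILDE is an undefined character.
Conversely, converting the EUC-JP '〜' to UCS-2 or UTF-8 used to
give U+FF5E, and now gives U+301C.

With the CP932 conversion table, WAVE DASH is an undefined
character and FULL WIDTH TILDE is converted to '〜' (CP932: 8160).

When USE_WIN32API is defined, the result of a UCS -> CP932
conversion may differ from the result obtained with the built-in
table.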
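
The WAVE DASH and FULL WIDTH TILDE mappings described above can be
observed with Ruby's built-in converters, which to my knowledge follow
the same convention for EUC-JP and Windows-31J (Ruby's name for CP932).
This sketch uses String#encode rather than uconv, and encoded_bytes is a
hypothetical helper.

```ruby
# Return the encoded bytes, or nil when the target encoding has no mapping.
def encoded_bytes(str, encoding)
  str.encode(encoding).bytes
rescue Encoding::UndefinedConversionError
  nil
end

wave_dash       = "\u301C"
fullwidth_tilde = "\uFF5E"

p encoded_bytes(wave_dash, "EUC-JP")             # => [161, 193]  (A1 C1)
p encoded_bytes(fullwidth_tilde, "EUC-JP")       # => nil
p encoded_bytes(fullwidth_tilde, "Windows-31J")  # => [129, 96]   (81 60)
p encoded_bytes(wave_dash, "Windows-31J")        # => nil
```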


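- Copyright

The copyright of this extension module is held by Yoshida Masato.

This extension module may be used under the license of Ruby
itself.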


- Author

 Yoshida Masato <yoshidam@yoshidam.net>


- History

 Jan  3, 2010 version 0.5.3 Ruby 1.9.1
 Aug 23, 2004 version 0.5.2 pre-conversion hook for Win32
 Aug 19, 2004 version 0.5.1 u2s, s2u, shift_jis-2004
 Aug 16, 2004 version 0.5.0  pre-conversion hook, euc-jis-2004, eucjp-open
 Jul 18, 2004 version 0.4.13 range check
 Mar 12, 2003 version 0.4.12 for ruby 1.8.0
 Oct  3, 2002 version 0.4.11 added --enable-compat-win32api
                             (makes the CP932 table compatible with the Win32 API)
 Sep  4, 2002 version 0.4.10 fixed a memory leak
 Feb 10, 2002 version 0.4.9 added replace_invalid
 Dec 10, 2001 version 0.4.8 support for taint propagation
 Nov 23, 2001 version 0.4.7 check for non-shortest form UTF-8
                            changed Exception to Uconv::Error
 Mar  4, 2001 version 0.4.6 s2u_conv2
                            added USE_WIN32API
 Jan 30, 2001 version 0.4.5 u2s_conv2
                            changed the UCS/CP932 conversion table
 Apr 18, 2000 version 0.4.4 fixed a bug in SJIS to UCS conversion
 Mar 11, 2000 version 0.4.3 removed non-constant initializers
 Nov 23, 1999 version 0.4.2 added the eliminate_zwnbsp flag
                            replaced the ustring library with the latest version
                            slight speedup
 Nov 19, 1999 version 0.4.1 bug fix in addUString
 Nov  5, 1999 version 0.4.0 CP932 support
 Mar 29, 1999 version 0.3.1 stopped using xmalloc, as this does no harm to GC
 Feb 22, 1999 version 0.3.0 UCS-4 and UTF-16 support
 Jan 13, 1999 version 0.2.2 [...] support
 Aug 15, 1998 version 0.2.1 added an English README file
 Jul 24, 1998 version 0.2 changed to call a handler when a character
                          cannot be converted
                          [...] method changed
 Jul  8, 1998 version 0.1 initial release