File: unicode.7

.\" Hey Emacs! This file is -*- nroff -*- source.
.\"
.\" Copyright (C) Markus Kuhn, 1995, 2001
.\"
.\" This is free documentation; you can redistribute it and/or
.\" modify it under the terms of the GNU General Public License as
.\" published by the Free Software Foundation; either version 2 of
.\" the License, or (at your option) any later version.
.\"
.\" The GNU General Public License's references to "object code"
.\" and "executables" are to be interpreted as the output of any
.\" document formatting or typesetting system, including
.\" intermediate and printed output.
.\"
.\" This manual is distributed in the hope that it will be useful,
.\" but WITHOUT ANY WARRANTY; without even the implied warranty of
.\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
.\" GNU General Public License for more details.
.\"
.\" You should have received a copy of the GNU General Public
.\" License along with this manual; if not, write to the Free
.\" Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111,
.\" USA.
.\"
.\" 1995-11-26  Markus Kuhn <mskuhn@cip.informatik.uni-erlangen.de>
.\"      First version written
.\" 2001-05-11  Markus Kuhn <mgk25@cl.cam.ac.uk>
.\"      Update
.\"
.\" Japanese Version Copyright (c) 1997 HANATAKA Shinya
.\"         all rights reserved.
.\" Translated Thu Jun  3 20:36:31 JST 1997
.\"         by HANATAKA Shinya <hanataka@abyss.rim.or.jp>
.\" Updated & Modified Sat Jun 23 07:30:09 JST 2001
.\"         by Yuichi SATO <ysato@h4.dion.ne.jp>
.\"
.\"WORD:	
.\"WORD:	diacritical mark	ȯ
.\"WORD:	International Phonetic Alphabet		ݲ
.\"WORD:	
.\"
.TH UNICODE 7 2001-05-11 "GNU" "Linux Programmer's Manual"
.SH NAME
Unicode \- the Universal Character Set
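.SH DESCRIPTION
The international standard
.B ISO 10646
defines the
.BR "Universal Character Set (UCS)" .
UCS contains all characters of all other character set standards.
It also guarantees
.BR "round-trip compatibility" ;
that is, conversion tables can be built such that no information is lost
when a string is converted from any other encoding to UCS and back.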

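UCS contains the characters required to represent practically all known
languages.
This includes not only the Latin, Greek, Cyrillic, Hebrew, Arabic,
Armenian, and Georgian scripts, but also the Han ideographs used in
Japan, China, and Korea, as well as scripts such as
Hiragana, Katakana, Hangul, Devanagari, Bengali, Gurmukhi, Gujarati,
Oriya, Tamil, Telugu, Kannada, Malayalam, Thai, Lao, Khmer, Bopomofo,
Tibetan, Runic, Ethiopic, Canadian Syllabics, Cherokee, Mongolian,
Ogham, Myanmar, Sinhala, Thaana, Yi, and others.
For scripts not yet covered, research on how best to encode them for
computer usage is still going on, and they will be added eventually.
This might eventually include not only hieroglyphs and various historic
Indo-European languages, but even some selected artistic scripts such
as Tengwar, Cirth, and Klingon.
In addition to scripts, UCS covers a large number of graphical,
typographical, mathematical, and scientific symbols, including those
provided by TeX, PostScript, APL, MS-DOS, MS-Windows, Macintosh,
OCR fonts, and many word processing and publishing systems,
and more are being added.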

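The UCS standard (ISO 10646) describes a
.I "31-bit character set architecture"
consisting of 128 24-bit
.IR groups ,
each divided into 256 16-bit
.IR planes ,
each made up of 256 8-bit
.I rows
with 256
.I column
positions, one for each character.
Part 1 of the standard
.RB ( "ISO 10646-1" )
defines the first 65534 code positions (0x0000 to 0xfffd), which form the
.IR "Basic Multilingual Plane (BMP)" ,
that is, plane 0 in group 0.
Part 2 of the standard
.RB ( "ISO 10646-2" )
adds characters to group 0 outside the BMP in several
.I "supplementary planes"
in the range 0x10000 to 0x10ffff.
There are no plans to add characters beyond 0x10ffff to the standard;
therefore, of the entire code space, only a small fraction of group 0
will actually be used in the foreseeable future.
The BMP contains all characters found in the other commonly used
character sets.
The supplementary planes added by ISO 10646-2 cover only more exotic
characters for special scientific, dictionary printing, publishing
industry, higher-level protocol, and enthusiast needs.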
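.PP
The group/plane/row/column layout can be illustrated with a few bit
operations.
The following is a minimal Python sketch, not part of any library; the
names decompose and in_bmp are hypothetical helpers chosen for
illustration.
.PP
.nf
```python
def decompose(code_point):
    """Split a UCS code point into (group, plane, row, column)."""
    if not 0 <= code_point < 2**31:
        raise ValueError("outside the 31-bit UCS code space")
    group = (code_point >> 24) & 0x7F   # 128 groups of 24 bits each
    plane = (code_point >> 16) & 0xFF   # 256 planes per group
    row = (code_point >> 8) & 0xFF      # 256 rows per plane
    column = code_point & 0xFF          # 256 column positions per row
    return group, plane, row, column


def in_bmp(code_point):
    """True if the code point lies in plane 0 of group 0."""
    group, plane, _, _ = decompose(code_point)
    return group == 0 and plane == 0


print(decompose(0x00C4))     # (0, 0, 0, 196)
print(decompose(0x10FFFF))   # (0, 16, 255, 255)
print(in_bmp(0xFFFD), in_bmp(0x10000))   # True False
```
.fi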
.PP
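The representation of each UCS character as a 2-byte word is referred
to as the
.B UCS-2
form (only for BMP characters), whereas
.B UCS-4
is the representation of each character by a 4-byte word.
In addition, there exist two encoding forms:
.B UTF-8
for backward compatibility with ASCII processing software, and
.B UTF-16
for the backward-compatible handling of non-BMP characters up to
0x10ffff by UCS-2 software.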
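.PP
To make the difference between these forms concrete, the following
Python sketch prints the big-endian byte sequences of one BMP character
and one supplementary-plane character.
The helper name show_encodings and the two sample characters are
assumptions chosen for illustration; Python's "utf-32-be" codec is used
here as a stand-in for UCS-4.
.PP
.nf
```python
def show_encodings(ch):
    """Return the big-endian byte sequences of one character as hex."""
    forms = {
        "UCS-4": ch.encode("utf-32-be"),
        "UTF-16": ch.encode("utf-16-be"),
        "UTF-8": ch.encode("utf-8"),
    }
    if ord(ch) <= 0xFFFF:
        # UCS-2 covers only the BMP; for these characters it is the
        # same 2-byte word that UTF-16 uses.
        forms["UCS-2"] = ch.encode("utf-16-be")
    return {name: data.hex() for name, data in forms.items()}


a_diaeresis = chr(0x00C4)    # a BMP character
g_clef = chr(0x1D11E)        # a supplementary-plane character

print(show_encodings(a_diaeresis))
print(show_encodings(g_clef))
```
.fi
.PP
For the second character no UCS-2 form exists, and UTF-16 uses two
16-bit words (a surrogate pair) instead of one.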
.PP
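The UCS characters 0x0000 to 0x007f are identical to those of the classic
.B US-ASCII
character set, and the characters in the range 0x0000 to 0x00ff
are identical to those in
.BR "ISO 8859-1 Latin-1" .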
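.PP
This identity can be checked mechanically.
The short Python sketch below assumes that Python's "ascii" and
"latin-1" codecs implement US-ASCII and ISO 8859-1 respectively; the
variable names are illustrative.
.PP
.nf
```python
# Every US-ASCII byte decodes to the UCS code point of the same value.
ascii_ok = all(
    ord(bytes([b]).decode("ascii")) == b for b in range(0x80)
)

# The same holds for all 256 byte values of ISO 8859-1.
latin1_ok = all(
    ord(bytes([b]).decode("latin-1")) == b for b in range(0x100)
)

print(ascii_ok, latin1_ok)   # True True
```
.fi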
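.SS "Combining characters"
Some code points in
.B UCS
have been assigned to
.IR "combining characters" .
These are similar to the nonspacing accent keys on a typewriter.
A combining character just adds an accent to the previous character.
The most important accented characters have codes of their own in UCS;
however, the combining character mechanism allows accents and other
diacritical marks to be added to any character.
A combining character always follows the character that it modifies.
For example, the German character Umlaut-A
("Latin capital letter A with diaeresis") can be represented either by
the precomposed UCS code 0x00c4, or alternatively as the combination of
a normal "Latin capital letter A" followed by a "combining diaeresis":
0x0041 0x0308.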
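.PP
The two representations of Umlaut-A can be compared with Python's
standard unicodedata module.
This is only an illustrative sketch; the variable names are
hypothetical, and the normalization forms used are the ones defined by
Unicode (NFC composes, NFD decomposes).
.PP
.nf
```python
import unicodedata

precomposed = chr(0x00C4)              # A WITH DIAERESIS, one code point
combined = chr(0x0041) + chr(0x0308)   # A followed by COMBINING DIAERESIS

# The two strings are different code point sequences ...
print(precomposed == combined)                                 # False
# ... but normalization maps one onto the other.
print(unicodedata.normalize("NFC", combined) == precomposed)   # True
print(unicodedata.normalize("NFD", precomposed) == combined)   # True
# A nonzero combining class marks a combining character.
print(unicodedata.combining(chr(0x0308)))                      # 230
```
.fi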
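.PP
Combining characters are essential, for instance, for encoding the Thai
script, for mathematical typesetting, and for users of the
International Phonetic Alphabet.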
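.SS "Implementation levels"
As not all systems are expected to support advanced mechanisms like
combining characters, ISO 10646-1 specifies the following three
implementation levels of UCS:
.TP 0.9i
Level 1
Combining characters and
.B Hangul Jamo
(a variant encoding of the Korean script, where a Hangul syllable
glyph is coded as a triplet or pair of vowel/consonant codes)
are not supported.
.TP
Level 2
In addition to level 1, combining characters are allowed for some
languages where they are essential
(e.g., Thai, Lao, Hebrew, Arabic, Devanagari, Malayalam).
.TP
Level 3
All
.B UCS
characters are supported.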
.PP
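The
.B Unicode 3.0 Standard
published by the
.B Unicode Consortium
contains exactly the
.B UCS Basic Multilingual Plane
at implementation level 3, as described in ISO 10646-1:2000.
.B Unicode 3.1
added the supplementary planes of ISO 10646-2.
The Unicode standard and the technical reports published by the
Unicode Consortium provide much additional information on the semantics
and recommended usage of various characters.
They provide guidelines and algorithms for editing, sorting, comparing,
normalizing, converting, and displaying Unicode strings.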
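.SS "Unicode under Linux"
Under GNU/Linux, the C type
.B wchar_t
is a signed 32-bit integer type.
Its values are always interpreted by the C library as
.B UCS
code values (in all locales), a convention that the GNU C library
signals to applications by defining the constant
.B __STDC_ISO_10646__
as specified in the ISO C99 standard.

In the ASCII-compatible
.B UTF-8
multibyte encoding, UCS/Unicode can be used just like ASCII in
input/output streams, terminal communication, plain-text files,
filenames, and environment variables.
To signal the use of UTF-8 as the character encoding to all
applications, a suitable
.I locale
has to be selected via environment variables
(e.g., "LANG=en_GB.UTF-8").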
.PP
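The
.B nl_langinfo(CODESET)
function returns the name of the selected encoding.
Library functions such as
.BR wctomb (3)
and
.BR mbsrtowcs (3)
can be used to transform the internal
.I wchar_t
characters and strings into the system character encoding and back, and
.BR wcwidth (3)
tells how many positions (0\(en2) the cursor is advanced by the
output of a character.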
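.PP
The same query can be made from scripting languages that wrap the C
library.
The Python sketch below calls locale.nl_langinfo(locale.CODESET), which
is available on Unix-like systems.
The helper rough_width is a hypothetical approximation built on the
unicodedata module; it is not the algorithm used by
.BR wcwidth (3)
and is shown only to illustrate the 0 to 2 range of widths.
.PP
.nf
```python
import locale
import unicodedata


def selected_codeset():
    """Name of the encoding chosen by the locale environment variables."""
    try:
        locale.setlocale(locale.LC_ALL, "")   # adopt LANG / LC_* settings
    except locale.Error:
        pass   # unsupported locale name: stay in the default "C" locale
    return locale.nl_langinfo(locale.CODESET)


def rough_width(ch):
    """Approximate column width of one character (an assumption)."""
    if unicodedata.combining(ch):
        return 0   # combining characters do not advance the cursor
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2   # wide and fullwidth CJK characters
    return 1


print(selected_codeset())   # depends on the environment
print([rough_width(c) for c in ("A", chr(0x0308), chr(0x65E5))])   # [1, 0, 2]
```
.fi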
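.PP
Under Linux, in general only the BMP at implementation level 1 should
be used at the moment.
Up to two combining characters per base character for certain scripts
(in particular Thai) are also supported by some UTF-8 terminal
emulators and ISO 10646 fonts (level 2), but in general precomposed
characters should be preferred where available (Unicode calls this
.BR "Normalization Form C" ).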
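.SS "Private area"
In the
.BR BMP ,
the range 0xe000 to 0xf8ff will never be assigned to any characters by
the standard and is reserved for private usage.
For the Linux community, this private area has been subdivided further:
the range 0xe000 to 0xefff can be used individually by any end user,
and the Linux zone in the range 0xf000 to 0xf8ff holds extensions that
are coordinated among all Linux users.
The registry of the characters assigned to the Linux zone is currently
maintained by H. Peter Anvin <Peter.Anvin@linux.org>.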
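.PP
The subdivision of the private area can be expressed as a simple range
check.
In this Python sketch the function name private_zone and the returned
labels are hypothetical; only the numeric ranges come from the text of
this page.
.PP
.nf
```python
def private_zone(code_point):
    """Classify a BMP code point within the private area, if any."""
    if 0xE000 <= code_point <= 0xEFFF:
        return "end-user zone"   # free for individual use
    if 0xF000 <= code_point <= 0xF8FF:
        return "Linux zone"      # coordinated among Linux users
    return None                  # not in the private area


print(private_zone(0xE000))   # end-user zone
print(private_zone(0xF8FF))   # Linux zone
print(private_zone(0x00C4))   # None
```
.fi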
.SS Literature
.TP 0.2i
*
Information technology \(em Universal Multiple-Octet Coded Character
Set (UCS) \(em Part 1: Architecture and Basic Multilingual Plane.
International Standard ISO/IEC 10646-1, International Organization
for Standardization, Geneva, 2000.

This is the official specification of
.BR UCS .
Available as a PDF file on CD-ROM from http://www.iso.ch/.
.TP
*
The Unicode Standard, Version 3.0.
The Unicode Consortium, Addison-Wesley,
Reading, MA, 2000, ISBN 0-201-61633-5.
.TP
*
S. Harbison, G. Steele. C: A Reference Manual. Fourth edition,
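Prentice Hall, Englewood Cliffs, 1995, ISBN 0-13-326224-3.

A good reference book about the C programming language.
The fourth edition covers the 1994 Amendment 1 to the ISO C90 standard,
which adds a large number of new C library functions for handling wide
and multibyte character encodings, but it does not yet cover ISO C99,
which improved wide and multibyte character support even further.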
.TP
*
Unicode Technical Reports.
.RS
http://www.unicode.org/unicode/reports/
.RE
.TP
*
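Markus Kuhn: UTF-8 and Unicode FAQ for Unix/Linux.
.RS
http://www.cl.cam.ac.uk/~mgk25/unicode.html

Provides subscription information for the
.I linux-utf8
mailing list, which is the best place to look for advice on using
Unicode under Linux.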
.RE
.TP
*
Bruno Haible: Unicode HOWTO.
.RS
ftp://ftp.ilog.fr/pub/Users/haible/utf8/Unicode-HOWTO.html
.RE
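.SH BUGS
When this man page was last revised, the GNU C Library support for
.B UTF-8
locales was mature and XFree86 support was in an advanced state, but
work on making applications (most notably editors) suitable for use in
.B UTF-8
locales was still fully in progress.
Current general
.B UCS
support under Linux usually provides for CJK double-width characters
and sometimes even simple overstriking combining characters, but
usually does not include support for scripts with right-to-left
writing direction or ligature substitution requirements such as
Hebrew, Arabic, or the Indic scripts.
These scripts are currently supported only in certain GUI applications
(HTML viewers, word processors) with sophisticated text rendering engines.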
.\" .SH 
.\" Markus Kuhn <mgk25@cl.cam.ac.uk>
.SH "SEE ALSO"
.BR setlocale (3),
.BR charsets (7),
.BR utf-8 (7)