<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>Chapter 21. Localization</title>
<link rel="stylesheet" href="stylesheet.css" type="text/css">
<link rev="made" href="pgsql-docs@postgresql.org">
<meta name="generator" content="DocBook XSL Stylesheets V1.70.0">
<link rel="start" href="index.html" title="PostgreSQL 8.1.4 Documentation">
<link rel="up" href="admin.html" title="Part III. Server Administration">
<link rel="prev" href="client-authentication-problems.html" title="20.3. Authentication problems">
<link rel="next" href="multibyte.html" title="21.2. Character Set Support">
<link rel="copyright" href="ln-legalnotice.html" title="Legal Notice">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="chapter" lang="en" id="charset">
<div class="titlepage"><div><div><h2 class="title">
<a name="charset"></a>Chapter 21. Localization</h2></div></div></div>
<div class="toc">
<p><b>Table of Contents</b></p>
<dl>
<dt><span class="sect1"><a href="charset.html#locale">21.1. Locale Support</a></span></dt>
<dd><dl>
<dt><span class="sect2"><a href="charset.html#id663047">21.1.1. Overview</a></span></dt>
<dt><span class="sect2"><a href="charset.html#id663399">21.1.2. Behavior</a></span></dt>
<dt><span class="sect2"><a href="charset.html#id663578">21.1.3. Problems</a></span></dt>
</dl></dd>
<dt><span class="sect1"><a href="multibyte.html">21.2. Character Set Support</a></span></dt>
<dd><dl>
<dt><span class="sect2"><a href="multibyte.html#multibyte-charset-supported">21.2.1. Supported Character Sets</a></span></dt>
<dt><span class="sect2"><a href="multibyte.html#id664688">21.2.2. Setting the Character Set</a></span></dt>
<dt><span class="sect2"><a href="multibyte.html#id664905">21.2.3. Automatic Character Set Conversion Between Server and Client</a></span></dt>
<dt><span class="sect2"><a href="multibyte.html#id666163">21.2.4. Further Reading</a></span></dt>
</dl></dd>
</dl>
</div>
<p>  This chapter describes the available localization features from the
  point of view of the administrator.
  <span class="productname">PostgreSQL</span> supports localization with
  two approaches:

   </p>
<div class="itemizedlist"><ul type="disc">
<li><p>      Using the locale features of the operating system to provide
      locale-specific collation order, number formatting, translated
      messages, and other aspects.
     </p></li>
<li><p>      Providing a number of different character sets defined in the
      <span class="productname">PostgreSQL</span> server, including
      multiple-byte character sets, to support storing text in all
      kinds of languages, and providing character set translation between
      client and server.
     </p></li>
</ul></div>
<p>
  </p>
<div class="sect1" lang="en">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="locale"></a>21.1. Locale Support</h2></div></div></div>
<a name="id663015"></a><p>   <em class="firstterm">Locale</em> support refers to an application respecting
   cultural preferences regarding alphabets, sorting, number
   formatting, etc.  <span class="productname">PostgreSQL</span> uses the standard ISO
   C and <acronym class="acronym">POSIX</acronym> locale facilities provided by the server operating
   system.  For additional information refer to the documentation of your
   system.
  </p>
<div class="sect2" lang="en">
<div class="titlepage"><div><div><h3 class="title">
<a name="id663047"></a>21.1.1. Overview</h3></div></div></div>
<p>    Locale support is automatically initialized when a database
    cluster is created using <code class="command">initdb</code>.
    <code class="command">initdb</code> will initialize the database cluster
    with the locale setting of its execution environment by default,
    so if your system is already set to use the locale that you want
    in your database cluster then there is nothing else you need to
    do.  If you want to use a different locale (or you are not sure
    which locale your system is set to), you can tell
    <code class="command">initdb</code> exactly which locale to use by
    specifying the <code class="option">--locale</code> option. For example:
</p>
<pre class="screen">initdb --locale=sv_SE</pre>
<p>
   </p>
<p>    This example sets the locale to Swedish (<code class="literal">sv</code>) as spoken
    in Sweden (<code class="literal">SE</code>).  Other possibilities might be
    <code class="literal">en_US</code> (U.S. English) and <code class="literal">fr_CA</code> (French
    Canadian).  If more than one character set can be useful for a
    locale, the locale name also specifies the character set, as in
    <code class="literal">cs_CZ.ISO8859-2</code>. What locales are available under what
    names on your system depends on what was provided by the operating
    system vendor and what was installed.  (On most systems, the command
    <code class="literal">locale -a</code> will provide a list of available locales.)
   </p>
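<p>    As a minimal sketch of checking that a locale name is installed before
    passing it to <code class="command">initdb</code>, the shell function below
    looks for an exact match in a list of locale names read from standard
    input.  The function name <code class="literal">has_locale</code> is
    hypothetical and not part of <span class="productname">PostgreSQL</span> or
    the operating system; the sketch assumes the list comes from
    <code class="literal">locale -a</code>.
   </p>

```shell
# has_locale NAME
# Hypothetical helper: succeeds if NAME appears as a complete line in the
# list of locale names read from standard input (for example, the output
# of "locale -a").  The match is exact and case-sensitive.
has_locale() {
    grep -Fqx -e "$1"
}

# Typical use, assuming the system provides the "locale" command:
#   locale -a | has_locale sv_SE || echo "sv_SE is not installed"
```

<p>    Because the match is exact, a system that lists the locale only under a
    longer name that includes a character set will not match the short name;
    in that case pass the name exactly as the system spells it.
   </p>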
<p>    Occasionally it is useful to mix rules from several locales, e.g.,
    use English collation rules but Spanish messages.  To support that, a
    set of locale subcategories exists, each of which controls only a certain
    aspect of the localization rules:

    </p>
<div class="informaltable"><table border="1">
<colgroup>
<col>
<col>
</colgroup>
<tbody>
<tr>
<td><code class="envar">LC_COLLATE</code></td>
<td>String sort order</td>
</tr>
<tr>
<td><code class="envar">LC_CTYPE</code></td>
<td>Character classification (What is a letter? Its upper-case equivalent?)</td>
</tr>
<tr>
<td><code class="envar">LC_MESSAGES</code></td>
<td>Language of messages</td>
</tr>
<tr>
<td><code class="envar">LC_MONETARY</code></td>
<td>Formatting of currency amounts</td>
</tr>
<tr>
<td><code class="envar">LC_NUMERIC</code></td>
<td>Formatting of numbers</td>
</tr>
<tr>
<td><code class="envar">LC_TIME</code></td>
<td>Formatting of dates and times</td>
</tr>
</tbody>
</table></div>
<p>

    The category names translate into names of
    <code class="command">initdb</code> options to override the locale choice
    for a specific category.  For instance, to set the locale to
    French Canadian, but use U.S. rules for formatting currency, use
    <code class="literal">initdb --locale=fr_CA --lc-monetary=en_US</code>.
   </p>
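<p>    The case of English collation rules with Spanish messages could be set up
    along the following lines.  This is only a sketch: the function name
    <code class="literal">init_mixed_cluster</code>, the data directory
    argument, and the locale names <code class="literal">en_US</code> and
    <code class="literal">es_ES</code> are illustrative assumptions, and both
    locales must be installed on the system under those names.
   </p>

```shell
# init_mixed_cluster DATADIR
# Hypothetical wrapper: create a cluster in DATADIR that uses U.S. English
# for every locale category except the language of messages, which is
# Spanish.  Assumes en_US and es_ES exist under those names on this system.
init_mixed_cluster() {
    initdb -D "$1" --locale=en_US --lc-messages=es_ES
}

# Typical use:
#   init_mixed_cluster /usr/local/pgsql/data
```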
<p>    If you want the system to behave as if it had no locale support,
    use the special locale <code class="literal">C</code> or <code class="literal">POSIX</code>.
   </p>
<p>    The nature of some locale categories is that their value has to be
    fixed for the lifetime of a database cluster.  That is, once
    <code class="command">initdb</code> has run, you cannot change them anymore.
    <code class="literal">LC_COLLATE</code> and <code class="literal">LC_CTYPE</code> are
    those categories.  They affect the sort order of indexes, so they
    must be kept fixed, or indexes on text columns will become corrupt.
    <span class="productname">PostgreSQL</span> enforces this by recording
    the values of <code class="envar">LC_COLLATE</code> and <code class="envar">LC_CTYPE</code> that are
    seen by <code class="command">initdb</code>.  The server automatically adopts
    those two values when it is started.
   </p>
<p>    The other locale categories can be changed as desired whenever the
    server is running by setting the run-time configuration variables
    that have the same name as the locale categories (see <a href="runtime-config-client.html#runtime-config-client-format" title="17.10.2. Locale and Formatting">Section 17.10.2, &#8220;Locale and Formatting&#8221;</a> for details).  The defaults that are
    chosen by <code class="command">initdb</code> are actually only written into
    the configuration file <code class="filename">postgresql.conf</code> to
    serve as defaults when the server is started.  If you delete these
    assignments from <code class="filename">postgresql.conf</code> then the
    server will inherit the settings from its execution environment.
   </p>
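<p>    To see which of these defaults are currently assigned in
    <code class="filename">postgresql.conf</code>, something like the following
    sketch may help.  The function name
    <code class="literal">list_locale_defaults</code> is hypothetical, and the
    path to the configuration file depends on where the database cluster was
    created.
   </p>

```shell
# list_locale_defaults CONF
# Hypothetical helper: print the active (not commented-out) assignments of
# the changeable locale categories found in the configuration file CONF.
list_locale_defaults() {
    grep -E '^[[:space:]]*lc_(messages|monetary|numeric|time)[[:space:]]*=' "$1"
}

# Typical use, with an assumed data directory:
#   list_locale_defaults /usr/local/pgsql/data/postgresql.conf
```

<p>    A category that produces no line here has no assignment in the file, so
    the server takes that setting from its execution environment.
   </p>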
<p>    Note that the locale behavior of the server is determined by the
    environment variables seen by the server, not by the environment
    of any client.  Therefore, be careful to configure the correct locale settings
    before starting the server.  A consequence of this is that if
    client and server are set up in different locales, messages may
    appear in different languages depending on where they originated.
   </p>
<div class="note" style="margin-left: 0.5in; margin-right: 0.5in;">
<h3 class="title">Note</h3>
<p>     When we speak of inheriting the locale from the execution
     environment, this means the following on most operating systems:
     For a given locale category, say the collation, the following
     environment variables are consulted in this order until one is
     found to be set: <code class="envar">LC_ALL</code>, <code class="envar">LC_COLLATE</code>
     (the variable corresponding to the respective category),
     <code class="envar">LANG</code>.  If none of these environment variables are
     set then the locale defaults to <code class="literal">C</code>.
    </p>
<p>     Some message localization libraries also look at the environment
     variable <code class="envar">LANGUAGE</code>, which overrides all other locale
     settings for the purpose of setting the language of messages.  If
     in doubt, please refer to the documentation of your operating
     system, in particular the documentation about
     <span class="application">gettext</span>, for more information.
    </p>
</div>
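<p>    The lookup order described in the note can be sketched as a small shell
    function.  The name <code class="literal">effective_locale</code> is
    hypothetical; the function only mirrors the order
    <code class="envar">LC_ALL</code>, category variable,
    <code class="envar">LANG</code>, then <code class="literal">C</code>, and
    does not consult <code class="envar">LANGUAGE</code> or check that the
    resulting locale is installed.
   </p>

```shell
# effective_locale CATEGORY
# Print the locale that would be inherited for CATEGORY (for example
# LC_COLLATE), following the order given in the note: LC_ALL, then the
# variable named after the category, then LANG, then the default "C".
# A variable that is set but empty is treated as unset here.
# CATEGORY must be a plain variable name, since it is passed to eval.
effective_locale() {
    eval "category_value=\${$1:-}"
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "$category_value" ]; then
        printf '%s\n' "$category_value"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        printf '%s\n' C
    fi
}

# Typical use, in the shell that will start the server:
#   effective_locale LC_COLLATE
```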
<p>    To enable messages to be translated to the user's preferred language,
    <acronym class="acronym">NLS</acronym> must have been enabled at build time.  This
    choice is independent of the other locale support.
   </p>
</div>
<div class="sect2" lang="en">
<div class="titlepage"><div><div><h3 class="title">
<a name="id663399"></a>21.1.2. Behavior</h3></div></div></div>
<p>    The locale settings influence the following SQL features:

    </p>
<div class="itemizedlist"><ul type="disc">
<li><p>       Sort order in queries using <code class="literal">ORDER BY</code> on textual data
       <a name="id663420"></a>
      </p></li>
<li><p>       The ability to use indexes with <code class="literal">LIKE</code> clauses
       <a name="id663441"></a>
      </p></li>
<li><p>       The <code class="function">upper</code>, <code class="function">lower</code>, and <code class="function">initcap</code>
       functions
       <a name="id663476"></a>
       <a name="id663486"></a>
      </p></li>
<li><p>       The <code class="function">to_char</code> family of functions
       <a name="id663507"></a>
      </p></li>
</ul></div>
<p>
   </p>
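<p>    The effect of the collation setting on sort order can be observed outside
    the database with the operating system's
    <code class="command">sort</code> utility, which is used here only as a
    stand-in for the operating system's collation rules.  The function name
    <code class="literal">sort_in_c_locale</code> and the sample words are
    illustrative assumptions.
   </p>

```shell
# Sort three words with the operating system's sort utility under the C
# locale, where comparison is by byte value, so every upper-case ASCII
# letter sorts before every lower-case one.
sort_in_c_locale() {
    printf '%s\n' apple Banana cherry | LC_ALL=C sort
}

# Prints Banana first, then apple, then cherry.
sort_in_c_locale
```

<p>    Under a language-specific locale the same words would typically come out
    in dictionary order instead, provided that locale is installed.
   </p>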
<p>    The drawback of using locales other than <code class="literal">C</code> or
    <code class="literal">POSIX</code> in <span class="productname">PostgreSQL</span> is the performance
    impact: they slow character handling and prevent ordinary indexes
    from being used by <code class="literal">LIKE</code>. For this reason, use locales
    only if you actually need them.
   </p>
<p>    As a workaround to allow <span class="productname">PostgreSQL</span> to use indexes
    with <code class="literal">LIKE</code> clauses under a non-C locale, several custom
    operator classes exist. These allow the creation of an index that
    performs a strict character-by-character comparison, ignoring
    locale comparison rules. Refer to <a href="indexes-opclass.html" title="11.8. Operator Classes">Section 11.8, &#8220;Operator Classes&#8221;</a>
    for more information.
   </p>
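<p>    As a sketch of this workaround, the function below creates such an index
    through <code class="command">psql</code> using the
    <code class="literal">text_pattern_ops</code> operator class, which applies
    to columns of type <code class="literal">text</code>.  The function name
    <code class="literal">create_pattern_index</code>, its arguments, and the
    generated index name are hypothetical.
   </p>

```shell
# create_pattern_index DATABASE TABLE COLUMN
# Hypothetical helper: create an index on a text column with the
# text_pattern_ops operator class, which compares strings character by
# character regardless of the cluster's locale.  The index name is
# derived from the table and column names purely for illustration.
create_pattern_index() {
    psql -d "$1" -c "CREATE INDEX ${2}_${3}_pattern_idx ON $2 ($3 text_pattern_ops)"
}

# Typical use, with assumed database, table, and column names:
#   create_pattern_index mydb test_table col
```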
</div>
<div class="sect2" lang="en">
<div class="titlepage"><div><div><h3 class="title">
<a name="id663578"></a>21.1.3. Problems</h3></div></div></div>
<p>    If locale support doesn't work as described above,
    check that the locale support in your operating system is
    correctly configured.  To check what locales are installed on your
    system, you may use the command <code class="literal">locale -a</code> if
    your operating system provides it.
   </p>
<p>    Check that <span class="productname">PostgreSQL</span> is actually using the locale
    that you think it is.  <code class="envar">LC_COLLATE</code> and <code class="envar">LC_CTYPE</code>
    settings are determined at <code class="command">initdb</code> time and cannot be
    changed without repeating <code class="command">initdb</code>.  Other locale
    settings including <code class="envar">LC_MESSAGES</code> and <code class="envar">LC_MONETARY</code>
    are initially determined by the environment the server is started
    in, but can be changed on-the-fly.  You can check the active locale
    settings using the <code class="command">SHOW</code> command.
   </p>
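<p>    One way to run that check from the shell is sketched below.  The function
    name <code class="literal">show_locale_settings</code> is hypothetical, and
    the sketch assumes that <code class="command">psql</code> can connect to
    the named database without further options.
   </p>

```shell
# show_locale_settings DATABASE
# Hypothetical helper: print each locale setting of the server that hosts
# DATABASE as "name = value", one per line, by running SHOW through psql.
show_locale_settings() {
    for name in lc_collate lc_ctype lc_messages lc_monetary lc_numeric lc_time; do
        printf '%s = %s\n' "$name" "$(psql -At -d "$1" -c "SHOW $name")"
    done
}

# Typical use, with an assumed database name:
#   show_locale_settings postgres
```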
<p>    The directory <code class="filename">src/test/locale</code> in the source
    distribution contains a test suite for
    <span class="productname">PostgreSQL</span>'s locale support.
   </p>
<p>    Client applications that handle server-side errors by parsing the
    text of the error message will obviously have problems when the
    server's messages are in a different language.  Authors of such
    applications are advised to make use of the error code scheme
    instead.
   </p>
<p>    Maintaining catalogs of message translations requires the on-going
    efforts of many volunteers who want to see
    <span class="productname">PostgreSQL</span> speak their preferred language well.
    If messages in your language are currently not available or not fully
    translated, your assistance would be appreciated.  If you want to
    help, refer to <a href="nls.html" title="Chapter 45. Native Language Support">Chapter 45, <i>Native Language Support</i></a> or write to the developers'
    mailing list.
   </p>
</div>
</div>
</div></body>
</html>