<?xml version="1.0" encoding="utf-8"?>
<!-- $Revision: 297028 $ -->

<chapter xml:id="mbstring.php4.req" xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink">
 <title>PHP Character Encoding Requirements</title>
 <para>
  Encodings of the following types can be used safely with PHP.
  <itemizedlist>
   <listitem>
    <para>
     A single-byte encoding,
     <itemizedlist>
      <listitem>
       <simpara>
        which has ASCII-compatible (ISO 646 compatible) mappings for the
        characters in the range <literal>00h</literal> to
        <literal>7fh</literal>.
       </simpara>
      </listitem>
     </itemizedlist>
    </para>
   </listitem>
   <listitem>
    <para>
     A multibyte encoding,
     <itemizedlist>
      <listitem>
       <simpara>
        which has ASCII-compatible mappings for the characters in the range
        <literal>00h</literal> to <literal>7fh</literal>.
       </simpara>
      </listitem>
      <listitem>
       <simpara>
        which does not use ISO 2022 escape sequences.
       </simpara>
      </listitem>
      <listitem>
       <simpara>
        which does not use a value from <literal>00h</literal> to
        <literal>7fh</literal> in any byte of a multibyte sequence
        that represents a single character.
       </simpara>
      </listitem>
     </itemizedlist>  
    </para>
   </listitem>
  </itemizedlist>
 </para>
 <para>
  These are examples of character encodings that are unlikely to work
  with PHP.
  <informalexample>
   <programlisting>
<![CDATA[
JIS, SJIS, ISO-2022-JP, BIG-5
]]>
   </programlisting>
  </informalexample>
 </para>
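 <para>
  As a rough illustration of the last multibyte condition listed above, the
  following sketch converts one character to several encodings and reports
  whether any resulting byte falls in the range <literal>00h</literal> to
  <literal>7fh</literal>. The helper
  <literal>has_ascii_range_byte()</literal> is hypothetical and not part of
  <literal>mbstring</literal>. The sketch assumes the
  <literal>mbstring</literal> extension is loaded and uses
  <function>mb_convert_encoding</function> only to produce the bytes.
  <informalexample>
   <programlisting role="php">
<![CDATA[
<?php
// Hypothetical helper, not part of mbstring: reports whether any byte of
// $utf8Char, once converted to $encoding, falls in the range 00h to 7fh.
function has_ascii_range_byte($utf8Char, $encoding)
{
    $bytes = mb_convert_encoding($utf8Char, $encoding, 'UTF-8');
    for ($i = 0; $i < strlen($bytes); $i++) {
        if (ord($bytes[$i]) <= 0x7f) {
            return true;
        }
    }
    return false;
}

// U+8868 written as UTF-8 bytes, so this script itself stays pure ASCII.
$char = "\xE8\xA1\xA8";

foreach (array('UTF-8', 'EUC-JP', 'SJIS', 'ISO-2022-JP') as $encoding) {
    $hex = bin2hex(mb_convert_encoding($char, $encoding, 'UTF-8'));
    $verdict = has_ascii_range_byte($char, $encoding)
        ? 'contains a byte in 00h-7fh'
        : 'no byte in 00h-7fh';
    printf("%-12s %-20s %s\n", $encoding, $hex, $verdict);
}
]]>
   </programlisting>
  </informalexample>
  In SJIS this character is encoded as <literal>95h 5ch</literal>. The
  second byte has the same value as the ASCII backslash, which is one way
  such a string can be misread when it appears as a literal in a script.
 </para>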
 <para>
  PHP scripts written in any of those encodings might not work,
  especially when encoded strings appear as identifiers or literals
  in the script. In most cases, however, you can avoid using these
  encodings in scripts by setting up <literal>mbstring</literal>'s
  transparent encoding filter for incoming HTTP queries.
 </para>
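 <para>
  A minimal &php.ini; sketch of such a setup might look like the following.
  The values shown are assumptions chosen for illustration (in particular
  <literal>UTF-8</literal> as the internal encoding), and the availability
  and status of these directives depend on the PHP version in use.
  <informalexample>
   <programlisting role="ini">
<![CDATA[
; Assumed example values; adjust them to your application.

; Convert incoming HTTP query data to the internal encoding transparently.
mbstring.encoding_translation = On

; Detect the encoding of incoming HTTP input automatically.
mbstring.http_input = auto

; Use an encoding that meets the requirements above as the internal encoding.
mbstring.internal_encoding = UTF-8
]]>
   </programlisting>
  </informalexample>
 </para>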
 <note>
  <para>
   Using SJIS, BIG5, CP936, CP949 or GB18030 as the internal encoding
   is strongly discouraged unless you are familiar with the parser, the
   scanner and the character encoding.
  </para>
 </note>
 <note>
  <para>
   If you are connecting to a database with PHP, it is recommended that
   you use the same character encoding for both the database and the
   <literal>internal encoding</literal> for ease of use and better
   performance.
  </para>
  <para>
   If you are using PostgreSQL, the character encoding used in the
   database and the one used in PHP may differ, because PostgreSQL supports
   automatic character set conversion between the backend and the frontend.
  </para>
 </note>
</chapter>

<!-- Keep this comment at the end of the file
Local variables:
mode: sgml
sgml-omittag:t
sgml-shorttag:t
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-indent-step:1
sgml-indent-data:t
indent-tabs-mode:nil
sgml-parent-document:nil
sgml-default-dtd-file:"~/.phpdoc/manual.ced"
sgml-exposed-tags:nil
sgml-local-catalogs:nil
sgml-local-ecat-files:nil
End:
vim600: syn=xml fen fdm=syntax fdl=2 si
vim: et tw=78 syn=sgml
vi: ts=1 sw=1
-->