1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207
|
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>8.4.Binary Data Types</title>
<link rel="stylesheet" href="stylesheet.css" type="text/css">
<link rev="made" href="pgsql-docs@postgresql.org">
<meta name="generator" content="DocBook XSL Stylesheets V1.70.0">
<link rel="start" href="index.html" title="PostgreSQL 8.1.4 Documentation">
<link rel="up" href="datatype.html" title="Chapter8.Data Types">
<link rel="prev" href="datatype-character.html" title="8.3.Character Types">
<link rel="next" href="datatype-datetime.html" title="8.5.Date/Time Types">
<link rel="copyright" href="ln-legalnotice.html" title="Legal Notice">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="sect1" lang="en">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="datatype-binary"></a>8.4.Binary Data Types</h2></div></div></div>
<a name="id586201"></a><a name="id586212"></a><p> The <code class="type">bytea</code> data type allows storage of binary strings;
see <a href="datatype-binary.html#datatype-binary-table" title="Table8.6.Binary Data Types">Table8.6, “Binary Data Types”</a>.
</p>
<div class="table">
<a name="datatype-binary-table"></a><p class="title"><b>Table8.6.Binary Data Types</b></p>
<div class="table-contents"><table summary="Binary Data Types" border="1">
<colgroup>
<col>
<col>
<col>
</colgroup>
<thead><tr>
<th>Name</th>
<th>Storage Size</th>
<th>Description</th>
</tr></thead>
<tbody><tr>
<td><code class="type">bytea</code></td>
<td>4 bytes plus the actual binary string</td>
<td>variable-length binary string</td>
</tr></tbody>
</table></div>
</div>
<br class="table-break"><p> A binary string is a sequence of octets (or bytes). Binary
strings are distinguished from character strings by two
characteristics: First, binary strings specifically allow storing
octets of value zero and other “<span class="quote">non-printable</span>”
octets (usually, octets outside the range 32 to 126).
Character strings disallow zero octets, and also disallow any
other octet values and sequences of octet values that are invalid
according to the database's selected character set encoding.
Second, operations on binary strings process the actual bytes,
whereas the processing of character strings depends on locale settings.
In short, binary strings are appropriate for storing data that the
programmer thinks of as “<span class="quote">raw bytes</span>”, whereas character
strings are appropriate for storing text.
</p>
<p> When entering <code class="type">bytea</code> values, octets of certain values
<span class="emphasis"><em>must</em></span> be escaped (but all octet values
<span class="emphasis"><em>can</em></span> be escaped) when used as part of a string
literal in an <acronym class="acronym">SQL</acronym> statement. In general, to
escape an octet, it is converted into the three-digit octal number
equivalent of its decimal octet value, and preceded by two
backslashes. <a href="datatype-binary.html#datatype-binary-sqlesc" title="Table8.7.bytea Literal Escaped Octets">Table8.7, “<code class="type">bytea</code> Literal Escaped Octets”</a> shows the
characters that must be escaped, and gives the alternate escape
sequences where applicable.
</p>
<div class="table">
<a name="datatype-binary-sqlesc"></a><p class="title"><b>Table8.7.<code class="type">bytea</code> Literal Escaped Octets</b></p>
<div class="table-contents"><table summary="bytea Literal Escaped Octets" border="1">
<colgroup>
<col>
<col>
<col>
<col>
<col>
</colgroup>
<thead><tr>
<th>Decimal Octet Value</th>
<th>Description</th>
<th>Escaped Input Representation</th>
<th>Example</th>
<th>Output Representation</th>
</tr></thead>
<tbody>
<tr>
<td>0</td>
<td>zero octet</td>
<td><code class="literal">'\\000'</code></td>
<td><code class="literal">SELECT '\\000'::bytea;</code></td>
<td><code class="literal">\000</code></td>
</tr>
<tr>
<td>39</td>
<td>single quote</td>
<td>
<code class="literal">'\''</code> or <code class="literal">'\\047'</code>
</td>
<td><code class="literal">SELECT '\''::bytea;</code></td>
<td><code class="literal">'</code></td>
</tr>
<tr>
<td>92</td>
<td>backslash</td>
<td>
<code class="literal">'\\\\'</code> or <code class="literal">'\\134'</code>
</td>
<td><code class="literal">SELECT '\\\\'::bytea;</code></td>
<td><code class="literal">\\</code></td>
</tr>
<tr>
<td>0 to 31 and 127 to 255</td>
<td>“<span class="quote">non-printable</span>” octets</td>
<td>
<code class="literal">'\\<em class="replaceable"><code>xxx'</code></em></code> (octal value)</td>
<td><code class="literal">SELECT '\\001'::bytea;</code></td>
<td><code class="literal">\001</code></td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break"><p> The requirement to escape “<span class="quote">non-printable</span>” octets actually
varies depending on locale settings. In some instances you can get away
with leaving them unescaped. Note that the result in each of the examples
in <a href="datatype-binary.html#datatype-binary-sqlesc" title="Table8.7.bytea Literal Escaped Octets">Table8.7, “<code class="type">bytea</code> Literal Escaped Octets”</a> was exactly one octet in
length, even though the output representation of the zero octet and
backslash are more than one character.
</p>
<p> The reason that you have to write so many backslashes, as shown in
<a href="datatype-binary.html#datatype-binary-sqlesc" title="Table8.7.bytea Literal Escaped Octets">Table8.7, “<code class="type">bytea</code> Literal Escaped Octets”</a>, is that an input string
written as a string literal must pass through two parse phases in
the <span class="productname">PostgreSQL</span> server. The first
backslash of each pair is interpreted as an escape character by
the string-literal parser and is therefore consumed, leaving the
second backslash of the pair. The remaining backslash is then
recognized by the <code class="type">bytea</code> input function as starting
either a three digit octal value or escaping another backslash.
For example, a string literal passed to the server as
<code class="literal">'\\001'</code> becomes <code class="literal">\001</code> after
passing through the string-literal parser. The
<code class="literal">\001</code> is then sent to the <code class="type">bytea</code>
input function, where it is converted to a single octet with a
decimal value of 1. Note that the apostrophe character is not
treated specially by <code class="type">bytea</code>, so it follows the normal
rules for string literals. (See also <a href="sql-syntax.html#sql-syntax-strings" title="4.1.2.1.String Constants">Section4.1.2.1, “String Constants”</a>.)
</p>
<p> <code class="type">Bytea</code> octets are also escaped in the output. In general, each
“<span class="quote">non-printable</span>” octet is converted into
its equivalent three-digit octal value and preceded by one backslash.
Most “<span class="quote">printable</span>” octets are represented by their standard
representation in the client character set. The octet with decimal
value 92 (backslash) has a special alternative output representation.
Details are in <a href="datatype-binary.html#datatype-binary-resesc" title="Table8.8.bytea Output Escaped Octets">Table8.8, “<code class="type">bytea</code> Output Escaped Octets”</a>.
</p>
<div class="table">
<a name="datatype-binary-resesc"></a><p class="title"><b>Table8.8.<code class="type">bytea</code> Output Escaped Octets</b></p>
<div class="table-contents"><table summary="bytea Output Escaped Octets" border="1">
<colgroup>
<col>
<col>
<col>
<col>
<col>
</colgroup>
<thead><tr>
<th>Decimal Octet Value</th>
<th>Description</th>
<th>Escaped Output Representation</th>
<th>Example</th>
<th>Output Result</th>
</tr></thead>
<tbody>
<tr>
<td>92</td>
<td>backslash</td>
<td><code class="literal">\\</code></td>
<td><code class="literal">SELECT '\\134'::bytea;</code></td>
<td><code class="literal">\\</code></td>
</tr>
<tr>
<td>0 to 31 and 127 to 255</td>
<td>“<span class="quote">non-printable</span>” octets</td>
<td>
<code class="literal">\<em class="replaceable"><code>xxx</code></em></code> (octal value)</td>
<td><code class="literal">SELECT '\\001'::bytea;</code></td>
<td><code class="literal">\001</code></td>
</tr>
<tr>
<td>32 to 126</td>
<td>“<span class="quote">printable</span>” octets</td>
<td>client character set representation</td>
<td><code class="literal">SELECT '\\176'::bytea;</code></td>
<td><code class="literal">~</code></td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break"><p> Depending on the front end to <span class="productname">PostgreSQL</span> you use,
you may have additional work to do in terms of escaping and
unescaping <code class="type">bytea</code> strings. For example, you may also
have to escape line feeds and carriage returns if your interface
automatically translates these.
</p>
<p> The <acronym class="acronym">SQL</acronym> standard defines a different binary
string type, called <code class="type">BLOB</code> or <code class="type">BINARY LARGE
OBJECT</code>. The input format is different from
<code class="type">bytea</code>, but the provided functions and operators are
mostly the same.
</p>
</div></body>
</html>
|