<!-- manual page source format generated by PolyglotMan v3.0.4, -->
<!-- available via anonymous ftp from ftp.cs.berkeley.edu:/ucb/people/phelps/tcltk/rman.tar.Z -->

<HTML>
<HEAD>
<TITLE>COPY(SQL) manual page</TITLE>
</HEAD>
<BODY>
<A HREF="sql.html">SQL Reference Contents</A>
 
<H2><A NAME="sect0" HREF="#toc0">NAME </A></H2>
copy - copy data between a class and a Unix file.  
<H2><A NAME="sect1" HREF="#toc1">SYNOPSIS 
</A></H2>
<B>copy </B> [<B>binary </B>] classname [<B>with oids </B>] <BR>
 <tt> </tt><tt> </tt><B>to </B>|<B>from ' </B>filename<B>' </B>|<B>stdin </B>|<B>stdout 
</B> <BR>
 <tt> </tt><tt> </tt>[<B>using delimiters ' </B>delim<B>' </B>] <BR>
  
<H2><A NAME="sect2" HREF="#toc2">DESCRIPTION </A></H2>
<B>Copy</B> moves data between Postgres 
classes and standard Unix files.  The keyword <B>binary</B> changes the behavior 
of field formatting, as described below. <I>Classname</I> is the name of an existing 
class. The keyword <B>with oids</B> copies the internal unique object id (OID) 
for each row. <I>Filename</I> is the 
absolute Unix pathname of the file.  In place of a filename, the keywords 
<B>stdin</B> and <B>stdout</B> can be used so that input to <B>copy</B> can be written by a 
Libpq application and output from the <B>copy</B> command can be read by a Libpq 
application. <P>
The <B>binary</B> keyword will force all data to be stored/read as 
binary objects rather than as ASCII text.  It is somewhat faster than the 
normal <B>copy</B> command, but is not generally portable, and the files generated 
are somewhat larger, although this factor is highly dependent on the data 
itself. <P>
By default, an ASCII <B>copy</B> uses a tab (\t) character as a delimiter. 
 The delimiter may be changed to any other single character with 
<B>using delimiters</B>. Characters in data fields that match 
the delimiter character are escaped with a backslash. <P>
You must have read access on any 
class whose values are read by the <B>copy</B> command, and either write or append 
access to a class to which values are being appended by the <B>copy</B> command. 
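<P>
As an illustration of the synopsis, the following Python sketch assembles a <B>copy</B> command string from its parts. The helper name build_copy_command and its parameters are hypothetical and are not part of Postgres or Libpq. The sketch refuses quote characters in the filename and delimiter, because quoting rules are not described on this page.
<PRE>
```python
def build_copy_command(classname, direction, target,
                       binary=False, with_oids=False, delimiter=None):
    """Assemble a copy command string following the SYNOPSIS.

    Hypothetical helper for illustration only.
    direction is "to" or "from"; target is an absolute pathname,
    or one of the keywords "stdin" and "stdout".
    """
    if direction not in ("to", "from"):
        raise ValueError("direction must be 'to' or 'from'")
    if delimiter is not None:
        if len(delimiter) != 1 or delimiter == "'":
            raise ValueError("delimiter must be one character, not a quote")

    parts = ["copy"]
    if binary:
        parts.append("binary")
    parts.append(classname)
    if with_oids:
        parts.append("with oids")
    parts.append(direction)
    if target in ("stdin", "stdout"):
        parts.append(target)                 # keywords are not quoted
    else:
        if "'" in target:
            # Quote escaping is not covered by this page, so refuse.
            raise ValueError("filename must not contain a single quote")
        parts.append("'" + target + "'")
    if delimiter is not None:
        parts.append("using delimiters '" + delimiter + "'")
    return " ".join(parts)


print(build_copy_command("emp", "to", "/tmp/emp.txt"))
print(build_copy_command("emp", "from", "stdin", binary=True, with_oids=True))
```
</PRE>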
 
<H2><A NAME="sect3" HREF="#toc3">FORMAT OF OUTPUT FILES </A></H2>
 
<H3><A NAME="sect4" HREF="#toc4">ASCII COPY FORMAT </A></H3>
When <B>copy</B> is used without the 
<B>binary</B> keyword, the file generated will have each instance on a line, 
with each attribute separated by the delimiter character.  Embedded delimiter 
characters will be preceded by a backslash character (\).  The attribute 
values themselves are strings generated by the output function associated 
with each attribute type.  The output function for a type should not try 
to generate the backslash character; this will be handled by  <B>copy</B> itself. 
<P>
The actual format for each instance is &lt;attr1&gt;&lt;tab&gt;&lt;attr2&gt;&lt;tab&gt;...&lt;tab&gt;&lt;attrn&gt;&lt;newline&gt; 
<BR>
 The oid is placed on the beginning of the line if specified. <P>
If  <B>copy</B> 
is sending its output to standard output instead of a file, it will send 
a backslash (\) and a period (.) followed immediately by a newline, on a 
line by themselves, when it is done.  Similarly, if <B>copy</B> is reading from 
standard input, it will expect a backslash (\) and a period (.) followed 
by a newline, as the first three characters on a line, to denote end-of-file. 
 However, <B>copy</B> will terminate (followed by the backend itself) if a true 
EOF is encountered. <P>
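The end-of-data convention described above can be sketched in Python as follows. The function names are hypothetical, and the streams are ordinary text file objects standing in for the data a Libpq application would exchange with the backend. In this sketch a true EOF simply ends the loop; the backend behavior on a true EOF is as described above.
<PRE>
```python
import io

END_MARKER = "\\.\n"   # backslash, period, newline


def write_copy_stream(lines, out):
    """Write already formatted lines, then the end marker."""
    for line in lines:
        out.write(line)
    out.write(END_MARKER)


def read_copy_stream(inp):
    """Yield data lines until the end marker (or a true EOF) is seen."""
    for line in inp:
        if line == END_MARKER:
            return              # marker reached, anything after it is left unread
        yield line


buf = io.StringIO()
write_copy_stream(["1\tann\n", "2\tbob\n"], buf)
rows = list(read_copy_stream(io.StringIO(buf.getvalue())))
print(rows)
```
</PRE>
<P>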
The backslash character has special meaning. <B>NULL</B> attributes 
are output as \N. A literal backslash character is output as two consecutive 
backslashes. A literal tab character is represented as a backslash and 
a tab. A literal newline character is represented as a backslash and a 
newline. When loading ASCII data not generated by PostgreSQL, you will 
need to convert backslash characters (\) to double-backslashes (\\) so they 
are loaded properly.  
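<P>
A minimal Python sketch of the escaping rules above, useful when preparing ASCII data that was not generated by <B>copy</B>. The names escape_field and format_instance are hypothetical. It assumes that None stands for a NULL attribute, and that only the backslash, the active delimiter and the newline need escaping; with the default delimiter this covers the tab.
<PRE>
```python
def escape_field(value, delim="\t"):
    """Return one attribute in ASCII copy form; None stands for NULL."""
    if value is None:
        return "\\N"
    out = []
    for ch in value:
        if ch == "\\":
            out.append("\\\\")         # literal backslash is doubled
        elif ch == delim or ch == "\n":
            out.append("\\" + ch)      # backslash, then the character itself
        else:
            out.append(ch)
    return "".join(out)


def format_instance(values, delim="\t", oid=None):
    """Format one instance as a line; the oid, if given, comes first."""
    fields = [escape_field(v, delim) for v in values]
    if oid is not None:
        fields.insert(0, str(oid))
    return delim.join(fields) + "\n"


print(repr(format_instance(["c:\\dir", None, "two\nlines"], oid=17)))
```
</PRE>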
<H3><A NAME="sect5" HREF="#toc5">BINARY COPY FORMAT </A></H3>
In the case of <B>copy binary</B>, the 
first four bytes in the file will be the number of instances in the file. 
 If this number is <I>zero,</I> the <B>copy binary</B> command will read until end of 
file is encountered.  Otherwise, it will stop reading when this number 
of instances has been read. Remaining data in the file will be ignored. 
<P>
The format for each instance in the file is as follows.  Note that this 
format must be followed <B>EXACTLY</B>. Unsigned four-byte integer quantities are 
called uint32 in the description below. The first value in the file is: <BR>
 <tt> </tt><tt> </tt>uint32 number 
of tuples <BR>
 then for each tuple: <BR>
 <tt> </tt><tt> </tt>uint32 total length of data segment <BR>
 
<tt> </tt><tt> </tt>uint32 oid (if specified) <BR>
 <tt> </tt><tt> </tt>uint32 number of null attributes <BR>
 <tt> </tt><tt> </tt>[uint32 attribute 
number of first null attribute <BR>
 <tt> </tt><tt> </tt> ... <BR>
 <tt> </tt><tt> </tt> uint32 attribute number of nth null 
attribute], <BR>
 <tt> </tt><tt> </tt>&lt;data segment&gt; <BR>
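<P>
The layout above can be sketched with the Python struct module. The helper names are hypothetical. Two assumptions are made that this page does not state: integers are written in the byte order of the local machine, and the length field counts only the bytes of the data segment. The contents of the data segment itself are not constructed here.
<PRE>
```python
import struct

UINT32 = "=I"   # native byte order, exactly four bytes (an assumption)


def pack_tuple(data_segment, null_attrs=(), oid=None):
    """Pack one tuple: length, optional oid, null count, null numbers, data."""
    out = struct.pack(UINT32, len(data_segment))
    if oid is not None:
        out += struct.pack(UINT32, oid)
    out += struct.pack(UINT32, len(null_attrs))
    for attno in null_attrs:
        out += struct.pack(UINT32, attno)
    return out + data_segment


def pack_file(packed_tuples):
    """Prefix the packed tuples with the uint32 number of tuples."""
    return struct.pack(UINT32, len(packed_tuples)) + b"".join(packed_tuples)


one = pack_tuple(b"abcd", null_attrs=(2,), oid=7)
print(len(pack_file([one])))
```
</PRE>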
   
<H3><A NAME="sect6" HREF="#toc6">ALIGNMENT OF BINARY DATA </A></H3>
On Sun-3s, 2-byte 
attributes are aligned on two-byte boundaries, and all larger attributes 
are aligned on four-byte boundaries.  Character attributes are aligned on 
single-byte boundaries.  On other machines, all attributes larger than 1 
byte are aligned on four-byte boundaries. Note that variable length attributes 
are preceded by the attribute's length; arrays are simply contiguous streams 
of the array element type.  
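<P>
The alignment rules can be expressed as a small calculation. This Python sketch uses hypothetical names and treats the attribute size in bytes as the only input; it rounds an offset up to the boundary required for the attribute.
<PRE>
```python
def alignment_for(size, sun3=False):
    """Return the boundary, in bytes, for an attribute of the given size."""
    if size == 1:
        return 1            # single-byte attributes need no padding
    if sun3 and size == 2:
        return 2            # Sun-3: 2-byte attributes on two-byte boundaries
    return 4                # everything else on four-byte boundaries


def align_offset(offset, size, sun3=False):
    """Round offset up to the boundary required for the attribute."""
    boundary = alignment_for(size, sun3)
    return (offset + boundary - 1) // boundary * boundary


print(align_offset(5, 2, sun3=True), align_offset(5, 2))
```
</PRE>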
<H2><A NAME="sect7" HREF="#toc7">SEE ALSO </A></H2>
<A HREF="insert.l.html">insert(l)</A>
, create <A HREF="table.l.html">table(l)</A>
, <A HREF="vacuum.l.html">vacuum(l)</A>
, 
libpq.  
<H2><A NAME="sect8" HREF="#toc8">BUGS </A></H2>
Files used as arguments to the <B>copy</B> command must reside on 
or be accessible to the database server machine by being either on 
local disks or a networked file system. <P>
<B>Copy</B> stops operation at the first 
error.  This should not lead to problems in the event of a <B>copy to</B>, but 
the target relation will, of course, be partially modified in a <B>copy from</B>. 
The <A HREF="vacuum.l.html"><I>vacuum</I>(l)</A>
 query should be used to clean up after a failed <B>copy</B>. <P>
Because 
Postgres operates out of a different directory than the user's working 
directory at the time Postgres is invoked, copying to a 
file `foo' (without additional path information) may yield unexpected results. 
In this case, `foo' will wind up in <FONT SIZE=-1>$PGDATA</FONT>/foo.  In 
general, the full pathname should be used when specifying files to be 
copied. <P>
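The effect described above can be pictured with a short Python sketch. The function name and the example data directory are hypothetical; the sketch only mimics how a filename without path information ends up under the data directory, and does not call Postgres.
<PRE>
```python
import posixpath


def resolve_copy_filename(filename, pgdata):
    """Return the path the server would use, per the description above."""
    if posixpath.isabs(filename):
        return filename                      # full pathnames are used as given
    return posixpath.join(pgdata, filename)  # relative names land in PGDATA


print(resolve_copy_filename("foo", "/usr/local/pgsql/data"))
print(resolve_copy_filename("/tmp/foo", "/usr/local/pgsql/data"))
```
</PRE>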

<HR><P>
<A NAME="toc"><B>Table of Contents</B></A><P>
<UL>
<LI><A NAME="toc0" HREF="#sect0">NAME</A></LI>
<LI><A NAME="toc1" HREF="#sect1">SYNOPSIS</A></LI>
<LI><A NAME="toc2" HREF="#sect2">DESCRIPTION</A></LI>
<LI><A NAME="toc3" HREF="#sect3">FORMAT OF OUTPUT FILES</A></LI>
<UL>
<LI><A NAME="toc4" HREF="#sect4">ASCII COPY FORMAT</A></LI>
<LI><A NAME="toc5" HREF="#sect5">BINARY COPY FORMAT</A></LI>
<LI><A NAME="toc6" HREF="#sect6">ALIGNMENT OF BINARY DATA</A></LI>
</UL>
<LI><A NAME="toc7" HREF="#sect7">SEE ALSO</A></LI>
<LI><A NAME="toc8" HREF="#sect8">BUGS</A></LI>
</UL>
</BODY></HTML>