1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261
|
PostgreSQL for Debian
=====================
PostgreSQL is the successor to Postgres95, which in turn succeeded
POSTGRES. PostgreSQL is a relational database with object-oriented
extensions. It implements the greater part of SQL-92 and is intended
to implement it in full; in addition it supports its own extended
facilities.
Most of the documentation is in the separate package postgresql-doc.
TAKE SPECIAL NOTE of the instructions in README.Debian.migration
(/usr/share/doc/postgresql/README.Debian.migration.gz) about upgrading your
PostgreSQL system to a later version. Failure to do so could destroy
your data!!!
Location of executables
=======================
Certain executables are linked into /usr/bin, because they are commands
that are more likely to be needed by any user (createdb, createuser, dropdb,
dropuser, pg_config, pg_dump, pg_restore and psql).
Other PostgreSQL executables are in /usr/lib/postgresql/bin, and are in the
default path for the user 'postgres', who is presumed to be the only
person who is likely to run them.
First steps for the impatient
=============================
Eventually you will not get around reading at least some parts of the manual,
but if you want to get straight into playing SQL, here are the steps to create
a database user and a database for the Unix user 'joe':
1. Get a shell for the database superuser 'postgres'. If your system has an
active root user, use su:
$ su -c "su postgres"
If your system uses sudo to get administrative rights, use sudo instead:
joe$ sudo -u postgres sh
2. In this postgres shell, create a database user with the same name as your
Unix login:
$ createuser -A -D joe
For details about the options, see createuser(1).
3. Create a database "joework" which is owned by "joe":
$ createdb -O joe joework
For details about the options, see createdb(1).
4. Exit the postgres shell.
5. As user joe, you should now be able to connect to your database with
$ psql joework
Configuration
=============
Configuration of the backend is done by editing the configuration files
/etc/postgresql/postmaster.conf and /etc/postgresql/postgresql.conf.
The environment for the postgres user is automatically set up to
include /etc/postgresql/postgresql.env; this file can be modified to
set the value of PGHOST, for example. Users of PostgreSQL are advised to
source /etc/postgresql/postgresql.env in their startup scripts.
The postmaster program looks in $PGDATA for its configuration files. This
includes files included using the @file syntax instead of a list. If you
create new configuration files, therefore, it is important to put them
in that directory. Creating them in /etc/postgresql will not work (unless
you also create symbolic links from $PGDATA).
Important: if you call initdb manually, please don't forget to call it with the
--debian-conffile option to automatically take care of putting the conffiles
into /etc/postgresql/ and link them to $PGDATA.
Database access control
=======================
Access control is first of all governed by whether the postmaster allows
TCP/IP connections. If they are not allowed, only Unix socket connections
are used, which limits access to processes running on the same machine.
Next, the configuration file /etc/postgresql/pg_hba.conf allows a
finer-grained control of access by user and by host.
As of 7.1release-3, the default access policy for the Debian packages is that
users connecting through Unix sockets are authenticated against their
Unix login names. (That is, they can connect to PostgreSQL
only with names that match their Unix login names.)
As of 7.3, Kerberos 5 authorisation was compiled into the release for
the first time, using the Heimdal library. It now uses the MIT library.
Consult the Administrator's Guide (postgresql-doc package) for information
on configuring it.
Please note that PAM password authentication does NOT work with
pam_unix since PostgreSQL is not run as root and cannot read
/etc/shadow. However, using a module which does not require root
privileges (as pam_ldap) was reported to work.
Performance
===========
If you feel that queries are running slowly, you may need to do some
database administration. The database must be cleaned regularly with
the VACUUM facility. This removes deleted tuples and frees the space
they take for re-use. Vacuums should be done automatically by a cron job
which runs the command do.maintenance. You can change the cron settings
to run this more frequently.
In 7.3, VACUUMs (apart from VACUUM FULL) can be done while the postmaster
is running, without affecting other users; therefore it may be
beneficial to run do.maintenance several times a day on a heavily used
site.
As of 7.4, autovacuuming can be enabled if the postgresql-contrib package
is installed. This is configured in /etc/postgresql/postmaster.conf, and
overrides the vacuuming done by cron, unless do.maintenance's -F option
is given.
VACUUM FULL can recover space that normal vacuums cannot; however it
locks tables while it is doing so. If you can allow it to be run, you
can enable a weekly VACUUM FULL by uncommenting the last line in
/etc/cron.d/postgresql.
Performance may be improved by increasing the number of memory buffers
allocated to PostgreSQL; the default setting allows a relatively small
amount of memory for this. However, do not increase it above about
one third of total memory, or the gain to PostgreSQL will be offset by
loss of kernel buffers and by increased swapping.
Certain queries may run slowly. You may find it helpful to create additional
indexes on columns that are frequently used in WHERE clauses. There is a
mailing list, pgsql-performance@postgresql.org, which is specifically
targeted at PostgreSQL performance. See http://www.postgresql.org/lists.html
Use EXPLAIN ANALYSE to find out where the query is spending its time.
There is a PostgreSQL mailing list (pgsql-performance@postgresql.org)
devoted to performance issues; check its archives at
http://archives.postgresql.org/
Database locking
================
For a slide presentation by Tom Lane (one of the core developers) see
http://conferences.oreillynet.com/presentations/os2002/lane_tom.tar.gz .
This is a very useful explanation of how locking works and how to ensure
consistency of the database even when multiple users simultaneously
update the same set of multiple rows.
Debian-specific features
========================
There are certain differences between the Debian version of PostgreSQL
and the upstream version. There are two reasons for this. First,
because Debian policy requires certain things to be done in a manner
different from that used by the upstream developers, and second, because
I perceive a difference between a piece of software that is put onto
a machine by an ordinary user and one that is installed, as part of a
distribution, by the system administrator.
1. Environment variables: Debian does not allow packages to depend on users'
setting environment variables. For this reason, certain front-end
programs, especially psql, are run through a wrapper that sets up
the environment. Any variables the user does set will, of course,
be preserved.
2. Default database: the upstream version defaults to a database whose
name is the same as the name of the PostgreSQL user who is trying to
access it. I do not think this is appropriate to a distribution, so
in Debian, the database must be specified on the command line or in
the environment variable PGDATABASE.
3. Initial environment: Debian stores its setup files in /etc/postgresql.
These files are postmaster.conf, postgresql.conf, pg_hba.conf and
postgresql.env, and any files referenced by pg_hba.conf. They are
self-documented, so you are advised to leave the comments alone if you
edit them. Where necessary, there are symbolic links to the locations
where the upstream code expects to find them. If you use the @file
syntax to include other files in pg_hba.conf, the file must be put in
$PGDATA, not /etc/postgresql.conf.
4. Location of socket: since at least 7.0.3, the socket file has been
located in /var/run/postgresql/rather than in /tmp, so as to avoid
problems with packages such as tmpreaper and to be more consistent
with Debian policy. This location can be altered by setting
UNIX_SOCKET_DIRECTORY in postgresql.conf. However, moving it may upset
any other programs that have been matched to Debian but hardcode the
socket location.
Creating users
==============
PostgreSQL has its own list of users, who have no necessary connection
with any particular machine's Unix users.
When PostgreSQL is first installed, only one user exists -- 'postgres',
and the default security settings means that that user can only be accessed
from the Unix login 'postgres'. 'postgres' is the PostgreSQL "superuser",
in that it can do anything inside PostgreSQL, just as root can do anything
in Unix.
The Unix user 'postgres' is created as an account into which you cannot
log directly; you must use "su - postgres" from a superuser session. If
you want people who are not Unix superusers to be able to log into the
Unix account 'postgres', you must assign a password to that account with
the command 'passwd postgres'.
To create other users, you must use either the createuser script from
Unix or the CREATE USER command inside PostgreSQL.
See the man pages: createuser(1), create_user(7l)
Database design and oids
========================
Oids are internal row identifiers.
Don't use them as keys; they are not guaranteed to be unique and the
package will NOT preserve them in an automatic upgrade. (This is the
maintainer's deliberate policy.)
If you need a unique number to identify a row, use the SERIAL type. That
is precisely what it is intended for.
CREATE TABLE mytable (
myid SERIAL PRIMARY KEY,
...
);
Then some may ask how they can do
INSERT INTO mytable VALUES (value1, value2, ...);
without specifying the key. This is another example of bad design. Suppose
the table is recreated and the order of the fields changes; this statement
would suddenly become incorrect, and you might not even notice. You should
always do
INSERT INTO mytable (col1, col2,...) VALUES (value1, value2,...);
so that the values are explicitly attached to named columns. Then the
serial column will be updated automatically. (But if you must do it by
implicit ordering, the keyword DEFAULT can be used to generate a new value.)
-- Oliver Elphick <olly@lfix.co.uk>, Fri, 25 Oct 2002 07:52:42 +0100
|