File: README.Debian

package info (click to toggle)
postgresql 7.4.7-6sarge6
links: PTS
area: main
in suites: sarge
size: 11,168 kB
ctags: 27
sloc: sh: 1,903; makefile: 337; ansic: 204; perl: 69; sed: 6; tcl: 1
file content (261 lines) | stat: -rw-r--r-- 10,647 bytes
PostgreSQL for Debian
=====================

PostgreSQL is the successor to Postgres95, which in turn succeeded
POSTGRES.  PostgreSQL is a relational database with object-oriented
extensions.  It implements the greater part of SQL-92 and is intended
to implement it in full; in addition it supports its own extended
facilities.

Most of the documentation is in the separate package postgresql-doc.

TAKE SPECIAL NOTE of the instructions in README.Debian.migration
(/usr/share/doc/postgresql/README.Debian.migration.gz) about upgrading your
PostgreSQL system to a later version.  Failure to do so could destroy
your data!!!


Location of executables
=======================

Certain executables are linked into /usr/bin, because they are commands
that are more likely to be needed by any user (createdb, createuser, dropdb,
dropuser, pg_config, pg_dump, pg_restore and psql).

Other PostgreSQL executables are in /usr/lib/postgresql/bin, and are in the
default path for the user 'postgres', who is presumed to be the only
person who is likely to run them.

First steps for the impatient
=============================

Eventually you will not get around reading at least some parts of the manual,
but if you want to get straight into playing SQL, here are the steps to create
a database user and a database for the Unix user 'joe':

1. Get a shell for the database superuser 'postgres'. If your system has an
   active root user, use su:

   $ su -c "su postgres"

   If your system uses sudo to get administrative rights, use sudo instead:

   joe$ sudo -u postgres sh

2. In this postgres shell, create a database user with the same name as your
   Unix login:

   $ createuser -A -D joe

   For details about the options, see createuser(1).

3. Create a database "joework" which is owned by "joe":

   $ createdb -O joe joework

   For details about the options, see createdb(1).

4. Exit the postgres shell.

5. As user joe, you should now be able to connect to your database with

   $ psql joework

Configuration
=============

Configuration of the backend is done by editing the configuration files
/etc/postgresql/postmaster.conf and /etc/postgresql/postgresql.conf.
The environment for the postgres user is automatically set up to
include /etc/postgresql/postgresql.env; this file can be modified to
set the value of PGHOST, for example.  Users of PostgreSQL are advised to
source /etc/postgresql/postgresql.env in their startup scripts.

The postmaster program looks in $PGDATA for its configuration files.  This
includes files included using the @file syntax instead of a list.  If you
create new configuration files, therefore, it is important to put them
in that directory.  Creating them in /etc/postgresql will not work (unless
you also create symbolic links from $PGDATA).

Important: if you call initdb manually, please don't forget to call it with the
--debian-conffile option to automatically take care of putting the conffiles
into /etc/postgresql/ and link them to $PGDATA.


Database access control
=======================

Access control is first of all governed by whether the postmaster allows
TCP/IP connections.  If they are not allowed, only Unix socket connections
are used, which limits access to processes running on the same machine.
Next, the configuration file /etc/postgresql/pg_hba.conf allows a
finer-grained control of access by user and by host.

As of 7.1release-3, the default access policy for the Debian packages is that
users connecting through Unix sockets are authenticated against their
Unix login names.  (That is, they can connect to PostgreSQL
only with names that match their Unix login names.)

As of 7.3, Kerberos 5 authorisation was compiled into the release for
the first time, using the Heimdal library.  It now uses the MIT library.
Consult the Administrator's Guide (postgresql-doc package) for information
on configuring it.

Please note that PAM password authentication does NOT work with
pam_unix since PostgreSQL is not run as root and cannot read
/etc/shadow. However, using a module which does not require root
privileges (as pam_ldap) was reported to work.


Performance
===========

If you feel that queries are running slowly, you may need to do some
database administration.  The database must be cleaned regularly with
the VACUUM facility.  This removes deleted tuples and frees the space
they take for re-use.  Vacuums should be done automatically by a cron job
which runs the command do.maintenance.  You can change the cron settings
to run this more frequently.

In 7.3, VACUUMs (apart from VACUUM FULL) can be done while the postmaster
is running, without affecting other users; therefore it may be
beneficial to run do.maintenance several times a day on a heavily used
site.

As of 7.4, autovacuuming can be enabled if the postgresql-contrib package
is installed.  This is configured in /etc/postgresql/postmaster.conf, and
overrides the vacuuming done by cron, unless do.maintenance's -F option
is given.

VACUUM FULL can recover space that normal vacuums cannot; however it
locks tables while it is doing so.  If you can allow it to be run, you
can enable a weekly VACUUM FULL by uncommenting the last line in
/etc/cron.d/postgresql.

Performance may be improved by increasing the number of memory buffers
allocated to PostgreSQL; the default setting allows a relatively small
amount of memory for this.  However, do not increase it above about
one third of total memory, or the gain to PostgreSQL will be offset by
loss of kernel buffers and by increased swapping.

Certain queries may run slowly.  You may find it helpful to create additional
indexes on columns that are frequently used in WHERE clauses.  There is a
mailing list, pgsql-performance@postgresql.org, which is specifically
targeted at PostgreSQL performance.  See http://www.postgresql.org/lists.html

Use EXPLAIN ANALYSE to find out where the query is spending its time.

There is a PostgreSQL mailing list (pgsql-performance@postgresql.org)
devoted to performance issues; check its archives at
http://archives.postgresql.org/


Database locking
================

For a slide presentation by Tom Lane (one of the core developers) see
http://conferences.oreillynet.com/presentations/os2002/lane_tom.tar.gz .
This is a very useful explanation of how locking works and how to ensure
consistency of the database even when multiple users simultaneously
update the same set of multiple rows.


Debian-specific features
========================

There are certain differences between the Debian version of PostgreSQL
and the upstream version.  There are two reasons for this.  First,
because Debian policy requires certain things to be done in a manner
different from that used by the upstream developers, and second, because
I perceive a difference between a piece of software that is put onto 
a machine by an ordinary user and one that is installed, as part of a
distribution, by the system administrator.

1. Environment variables: Debian does not allow packages to depend on users'
   setting environment variables.  For this reason, certain front-end
   programs, especially psql, are run through a wrapper that sets up
   the environment.  Any variables the user does set will, of course,
   be preserved.

2. Default database: the upstream version defaults to a database whose
   name is the same as the name of the PostgreSQL user who is trying to
   access it.  I do not think this is appropriate to a distribution, so
   in Debian, the database must be specified on the command line or in
   the environment variable PGDATABASE.

3. Initial environment: Debian stores its setup files in /etc/postgresql.
   These files are postmaster.conf, postgresql.conf, pg_hba.conf and
   postgresql.env, and any files referenced by pg_hba.conf.  They are
   self-documented, so you are advised to leave the comments alone if you
   edit them.  Where necessary, there are symbolic links to the locations
   where the upstream code expects to find them.  If you use the @file
   syntax to include other files in pg_hba.conf, the file must be put in
   $PGDATA, not /etc/postgresql.conf.

4. Location of socket: since at least 7.0.3, the socket file has been
   located in /var/run/postgresql/rather than in /tmp, so as to avoid
   problems with packages such as tmpreaper and to be more consistent
   with Debian policy.  This location can be altered by setting
   UNIX_SOCKET_DIRECTORY in postgresql.conf.  However, moving it may upset
   any other programs that have been matched to Debian but hardcode the
   socket location.


Creating users
==============

PostgreSQL has its own list of users, who have no necessary connection
with any particular machine's Unix users.

When PostgreSQL is first installed, only one user exists -- 'postgres',
and the default security settings means that that user can only be accessed
from the Unix login 'postgres'.  'postgres' is the PostgreSQL "superuser",
in that it can do anything inside PostgreSQL, just as root can do anything
in Unix.

The Unix user 'postgres' is created as an account into which you cannot
log directly; you must use "su - postgres" from a superuser session.  If
you want people who are not Unix superusers to be able to log into the
Unix account 'postgres', you must assign a password to that account with
the command 'passwd postgres'.

To create other users, you must use either the createuser script from
Unix or the CREATE USER command inside PostgreSQL.

See the man pages:  createuser(1), create_user(7l)


Database design and oids
========================

Oids are internal row identifiers.

Don't use them as keys; they are not guaranteed to be unique and the
package will NOT preserve them in an automatic upgrade.  (This is the
maintainer's deliberate policy.)

If you need a unique number to identify a row, use the SERIAL type.  That
is precisely what it is intended for.

  CREATE TABLE mytable (
     myid    SERIAL  PRIMARY KEY,
     ...
  );

Then some may ask how they can do

  INSERT INTO mytable VALUES (value1, value2, ...);

without specifying the key.  This is another example of bad design.  Suppose
the table is recreated and the order of the fields changes; this statement
would suddenly become incorrect, and you might not even notice.  You should
always do

  INSERT INTO mytable (col1, col2,...) VALUES (value1, value2,...);

so that the values are explicitly attached to named columns.  Then the 
serial column will be updated automatically.  (But if you must do it by
implicit ordering, the keyword DEFAULT can be used to generate a new value.)


 -- Oliver Elphick <olly@lfix.co.uk>, Fri, 25 Oct 2002 07:52:42 +0100