1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430
|
Debian PostgreSQL - migration between database formats
======================================================
When the database format changes, as frequently happens when the minor
release number changes, the database must be dumped and reinitialised.
The procedure is explained in the INSTALL instructions from upstream
and may briefly be described as follows:
1. Use pg_dumpall to dump the database.
2. Destroy the old database
3. Use initdb to create a new database structure.
4. Use psql to read in the dump from the old database.
Database object dependencies
============================
If you have upgraded to 7.3, you will probably want to run the adddepend
script (in postgresql-contrib) to create the missing dependencies.
Automatic upgrade by the postinst script
========================================
The postinst asks for permission to attempt the upgrade. If you say
yes, it runs a script called postgresql-dump which attempts to
automate all the procedure described above. If this is successful,
no further action is necessary.
If automatic upgrading fails, you must do a manual upgrade.
Manual upgrade procedure
========================
WARNING: You will have problems if you upgrade in large steps (for
instance from 7.0 to 7.3). This is because at each release the upstream
developers assume that you are upgrading from the immediately preceding
release. If you upgrade at longer intervals, incompatibilities may
arise between the pg_dump of the previous version and the database
infrastructure of the new version. In such circumstances, you will
probably have to edit your dump file before you can restore your data.
This will make you considerably more expert in the use of regular
expressions...
If you already have an up-to-date full database dump, you can use the
manual procedure documented by the upstream developers in
/usr/share/doc/postgresql/INSTALL.gz.
If you have not got a dump, you will first need to make one using the
copies of the previous release software which should have been saved in
/var/lib/postgres/dumpall/<version_number>. This procedure has been
encapsulated in a script, postgresql-dump(8).
Follow this procedure to dump your old database:
upgrading from <= 6.4.2
# su -s /bin/bash - postgres
$ postgresql-dump -t db.out -dcifvlp $PGDATA/../data.save
upgrading from >= 6.5
# su -s /bin/bash - postgres
$ postgresql-dump -t db.out -dcivlp $PGDATA/../data.save
(But see the notes below about large objects (blobs) and about oids.)
This will dump the database to db.out, in the postgres home directory, list the
dump on screen for the user to inspect and approve, destroy the old database,
create the new database with initdb (don't forget the --debian-conffile option)
and finally reload the data from db.out into the new database. The old
database will be saved in $PGDATA/../data.save in case anything goes wrong.
If you do not have enough space for multiple copies of your data, you
can use single options of postgresql-dump to do the dump and restore
one step at a time. For example:
postgresql-dump -t /dev/st0 # dump to tape
postgresql-dump -c -t /dev/st0 # check tape dump
postgresql-dump -d -t /dev/st0 -i -l # destroy the old database
# create a new one and
# load the dump
It is highly recommended, of course, that you should have a backup of
${PGDATA} before destroying the database! Such a backup should have been
taken when the postmaster was shut down.
pg_upgrade
==========
There used to be a program called pg_upgrade, which could upgrade data
from one version to the next without the need for a full data dump.
This program does not work at version 7.1 and later, and has been disabled by
the upstream developers. The -u option (which invoked pg_upgrade)
has been removed from postgresql-dump at 7.1.
If you need to use pg_dumpall to dump your database, be sure not to use
the -s option (to dump the schema only). You must dump all the data
as well.
(A new, experimental version of pg_upgrade was included in 7.2 contrib. The
Debian upgrade procedure will not attempt to use it until it has been
thoroughly tested.)
==========================================================================
Possible problems
==========================================================================
In order to be able to do the upgrade, we need to use binaries that
match the old database. This used to be done by the preinst script,
which saved the binaries before overwriting them.
With the advent of apt, however, this proved inadequate, since apt
will often remove a package rather than overwrite it, in order to cope
with interlinked dependencies. If this happens, the previous binaries
are no longer available to be preserved. As from 6.5.3-16 and 7.0.2-xx,
the prerm script is used instead to preserve old binaries.
In one way or another, therefore, the installation should have preserved
the executables and shared libraries of an earlier package release. These
are stored in /var/lib/postgres/dumpall/<database_version>. These executables
are saved so that they can be used to dump the old database. Once your
database is up-to-date, there is no need to retain these executables and the
directory that they are saved in can be deleted.
Provided that these executables have been successfully captured, the
old database can be dumped, deleted and reloaded by the command
postgresql-dump. This must be run by the PostgreSQL administrator,
postgres. See the postgresql-dump(8) manpage for full details of how to
use it.
If the executables are not preserved, you will have to reinstall a
package of the appropriate release level to read your database. You
should run pg_dumpall to an output file of your choice before reinstalling
the latest version of the package.
Wrong program version saved
===========================
People who follow unstable may have installed beta versions of
PostgreSQL. These sometimes suffer database format changes which
are not reflected by a version number change. As a result, the
prerm scripts may see software already present in
/var/lib/postgres/dumpall/<database_version> and therefore fail to
update it with the correct version. It is well therefore to delete
this directory as soon as your database is up-to-date with the
software, to avoid problems in the future.
If you hit this problem, you need to go back and reinstall the
previous package version corresponding to your database, so that you
can do a dump.
Unrestorable dumps
==================
It is possible for pg_dump to produce a script that cannot be restored.
For example:
CREATE TABLE "account" (
"code" character varying(16) NOT NULL,
"ledger" character varying(16) NOT NULL,
"title" character varying(128) NOT NULL,
"balance" numeric(15,2) DEFAULT 0 NOT NULL,
"class" character(2) NOT NULL,
"allow_subac" boolean DEFAULT 'f' NOT NULL,
"restrict_co" boolean DEFAULT 'f',
"parent" character varying(16),
CONSTRAINT "parent in same ledger and allow"
CHECK (((parent IS NULL) OR valid_parent(parent, ledger))),
Constraint "account_pkey" Primary Key ("code")
);
CREATE FUNCTION "valid_parent" (character varying,character varying)
RETURNS boolean AS '-- $1 = parent code, $2 = ledger
SELECT (CASE WHEN ledger = $2 THEN ''t'' ELSE ''f'' END) ~ ''t''
FROM account
WHERE code = $1;' LANGUAGE 'sql';
The CREATE TABLE statement will fail because the function valid_parent() is
not yet defined.
In order to cure this, the dump file must be manually edited. It is not
enough in this case to reverse the order of creation of the table and
function, because they are mutually dependent. Instead, the table must be
created without the constraint; then the function is created; and finally,
the constraint is added to the table:
CREATE TABLE "account" (
"code" character varying(16) NOT NULL,
"ledger" character varying(16) NOT NULL,
"title" character varying(128) NOT NULL,
"balance" numeric(15,2) DEFAULT 0 NOT NULL,
"class" character(2) NOT NULL,
"allow_subac" boolean DEFAULT 'f' NOT NULL,
"restrict_co" boolean DEFAULT 'f',
"parent" character varying(16),
Constraint "account_pkey" Primary Key ("code")
);
CREATE FUNCTION "valid_parent" (character varying,character varying)
RETURNS boolean AS '-- $1 = parent code, $2 = ledger
SELECT (CASE WHEN ledger = $2 THEN ''t'' ELSE ''f'' END) ~ ''t''
FROM account
WHERE code = $1;' LANGUAGE 'sql';
ALTER TABLE "acount" ADD CONSTRAINT "parent in same ledger and allow"
CHECK (((parent IS NULL) OR valid_parent(parent, ledger)));
Dependency analysis has been begun in 7.3 and will be improved in future
releases; this may avoid such problems in the future.
External functions may need recompiling
=======================================
If any of your functions use C or another external language, they may
need recompiling. A typical symptom to indicate this is an unexpected
crash of the database backend while loading data, where a C function is
used in a constraint.
When recompiling, you should note that header file paths in postgresql-dev
may now need a directory prepended, because of the reorganisation of the
header file tree structure..
Rule syntax
===========
At 6.5, the use of the keyword `current' in CREATE RULE to refer to the
table being updated was changed to `old'. Dump scripts produced by 6.4.2
and earlier will therefore not load correctly on 6.5 and later, because all
occurrences of `current.' have to be changed to `old.'. postgresql-dump
will do this for you if you give it the option -f. This option passes
the dump file through "sed -e '/^CREATE RULE /s/current\./old./g'". If
this is wrong for your data but your data needs conversion, you should
dump the data, edit it and then reload it. A close study of the manual
page for postgresql-dump is recommended.
[NOTE: I have no idea whether a direct load of a 6.5 dump into 7.3 will work
at all!]
UNICODE databases
=================
There is a problem in 7.0 (at least) with dumping some UNICODE characters.
Due to bad interaction between components, it is possible for 7.0 pg_dump
(which has to be used to dump the 7.0 format database) to truncate data
lines at some characters -- for example the Norwegian 0370 (o with a line
through it). Such a database needs to be dumped with the -d option to
pg_dumpall, so as to create the data by INSERT statements rather than by COPY.
Since this is much slower, it cannot be used as the default.
If this problem affects you, you should do a manual dump with pg_dumpall -d
before upgrading your postgresql packages.
A second UNICODE problem was found with 7.2. At that release, PostgreSQL
began to check the UNICODE characters and rejected some that were accepted
by 7.1 and earlier. These were in fact single-byte characters with the
eighth bit set; not UNICODE at all. However, they were mistaken for UNICODE
lead-in characters and rejected as out of range. The solution in this case
is to restore the database into a LATIN9 database rather than a UNICODE one.
This must be done manually.
Large objects (blobs)
=====================
pg_dump cannot dump large objects to a text file, which is the method used
by postgresql-dump. Therefore, if you have any large objects, you must do
a manual dump to a tar or custom format dump. (Since an ordinary text or
bytea column can hold items up to 1Gb or so, thanks to TOAST, you may well
be able to avoid using large objects at all.)
oids
====
Some schema designs rely on the use of oids as row identifiers. This is
definitely not recommended, not least because oids are not guaranteed to
exist in all future versions of PostgreSQL. Oids are an internal feature
only. They are not suitable as candidate keys, since they are not guaranteed
to be unique; furthermore, the starting point for oids is likely to change
whenever a change to the database structure occurs.
Therefore postgresql-dump does NOT include an option to preserve oids.
If you really insist on implementing and continuing a bad design, you will
need to edit the postgresql-dump script to allow it; or start the preserved
postmaster by hand (remember to use LD_PRELOAD to load the correct libpq.so).
Bugs in old versions of pg_dump
===============================
There are bugs in old versions of pg_dump that cause the output not
to be sufficient to recreate the database exactly. In successive
releases these bugs have been very substantially reduced, but some still
may exist. In the process of migration, you will suffer the bugs of the
version from which you are upgrading, and this new release cannot do
anything about them.
Do not delete the saved copy of your database until you are sure that all
your data has been correctly transferred. If you find errors in
reloading your database, you will probably be able to cure them by
editing the ASCII dump to correct the SQL commands it contains.
Upgrading from postgres95 or any non-Debianised version of Postgres
===================================================================
If the preinstallation script finds a data/base directory where it
expects to put its own, or if it finds an executable called postgres
in /usr/bin, it will assume there is an older version installed
and will attempt to capture the old executables as described above.
If it succeeds, it will put them in /var/lib/postgres/dumpall/unknown.
If it fails, the installation should fail, and you will have to do a manual
dump and restore before you can continue, like this:
Check in the FAQ, in this documentation, whether and how to dump your data.
Some older versions require you to load intermediate versions in
order to preserve data integrity through to the current version.
For example, you cannot go directly from Postgres95 1.08 to PostgreSQL 6;
you must dump and reload into Postgres95 1.09 first. (If you actually
need to do this, I can no longer help you; these software versions are
not just old - they are antediluvian!)
Older versions of the pg_dumpall command were liable to lose data about
SQL permissions and users. You may have to live with this.
Make sure you have an up-to-date backup; be wary of using normal Linux backup
utilities while the postmaster is running, or you may find on recovery
that your database is corrupt.
# su -s /bin/bash - postgres # become the Postgres superuser
$ pg_dumpall >target_file # target_file may be a tape or on disk
When this procedure is complete, read through the resulting archive
to ensure that it is correct and can be read to the end. The dump
format is ASCII text. As at release 6.2.1, pg_dumpall loses table
ownerships and permissions. At 6.5 it had fairly few problems; at 7.1
it was greatly improved and at 7.3 is better still.
When you are completely satisfied that you have a readable backup of
your database:
(PGDATA should be set; if it isn't, set it thus:
$ export PGDATA=/path_to_database/data
$ export PGLIB=/usr/lib/postgresql
)
$ $PGLIB/bin/cleardbdir
(if, for some reason, that didn't work:
$ rm -rf $PGDATA/*
)
This will destroy the old Postgres95 or PostgreSQL database, so don't
do it until you are absolutely certain about your data!
Finish this orgy of destruction by removing the old package:
# dpkg --remove <packagenames>
or find the various files and delete them if Postgres wasn't Debianised
before.
When the database has been destroyed, create a new one with initdb.
Start the postmaster (as root):
# /etc/init.d/postgresql start
Finally, reload your database:
# su -s /bin/bash - postgres
$ psql -e <target_file
(If you are unlucky, you may have to do more or less extensive editing
of target_file before you can reload it.)
Private databases
=================
It is possible to use initdb to create a private database. Such a database
must be accessed using different values of PGPORT and PGDATA.
The installation scripts only know about the main installation. If there
are any private databases, they must be upgraded by hand. At the moment,
there are no mechanisms provided to do this; the best way is to copy the
actions performed by postgresql-dump.
If private database owners wish to avoid problems they will keep regular
up-to-date backups of their databases, so that they will not need to use
old versions of postgresql to dump their data on an upgrade.
Co-existing with local installations
====================================
If someone has created a local copy of PostgreSQL, in /usr/local/pgsql,
for example, the two copies will clash, because both will be listening
on port 5432. One of the postmasters will be unable to start, because
it will be blocked by the existing socket /var/run/.s.PGSQL.5432, which the
other will have opened. Alternatively, it may be using an unaltered
upstream version with the socket in /tmp/.s.PGSQL.5432. In this case there
is even more scope for confusion - which database you get depends on which
version of the libpq shared library is loaded by the front-end program.
If you do not wish to delete the local copy, you must make sure that the
two copies use different databases ($PGDATA) and listen on different
ports. You can either reconfigure and recompile the local version, or
make sure it and its front-ends always run with PGPORT set to the
desired port, or you can set the value of PORT in
/etc/postgresql/postgresql.conf. Do not use 5341, because that is
used by the installation scripts for special purposes. Do not use any
port that is used by any other TCP/IP service.
Oliver Elphick <olly@lfix.co.uk>
Last update: 3rd June 2003
|