File: architecture.html

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

<html>
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <meta name="author" content="Oliver Elphick, Martin Pitt" />
  <title>Multi-Version/Multi-Cluster PostgreSQL architecture</title>
</head>
<body>

<h1>Multi-Version/Multi-Cluster PostgreSQL architecture</h1>


<h2>Solving a problem</h2>

<p>
When a new major version of PostgreSQL is released, it is necessary to
dump and reload the database.  The old software must be used for the dump,
and the new software for the reload.</p>

<p>This was a major problem for Red Hat and Debian: not every upgrade
required a dump and reload, and by the time the need for a dump was
realised, the old software might already have been deleted.  Debian had some rather
unreliable procedures to save the old software and use it to do a dump, but
these procedures often went wrong.  Red Hat's installation environment is so
rigid that it is not practicable for the Red Hat packages to attempt an
automatic upgrade.  Debian offered a debconf choice for
whether to attempt automatic upgrading; if it failed or was not allowed, a
manual upgrade had to be done, either from a pre-existing dump or by
manual invocation of the postgresql-dump script.</p>

<p>There was once an upstream program called <b>pg_upgrade</b> which could be
used for in-place upgrading.  It does not currently work, and fixing it does not seem to
be a high priority for the upstream developers.</p>

<p>It is possible to run different versions of PostgreSQL simultaneously, and
indeed to run the same version on separate database clusters simultaneously.
To do so, each postmaster must listen on a different port, and each client
must specify the correct port.  With two separate
versions of the PostgreSQL packages installed side by side, a
database upgrade becomes simple: dump from the old version and
reload into the new.  The PostgreSQL client wrapper is designed to
permit this.</p>
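
<p>As a rough illustration of the dump-and-reload step, the following Python
sketch builds the two commands and pipes one into the other.  The version
numbers, the port numbers, the <code>bin</code> subdirectory and the
<code>template1</code> target database are assumptions chosen for the example;
the function names are hypothetical and not part of any package.</p>

```python
import subprocess


def dump_and_reload_commands(old_version, old_port, new_version, new_port,
                             lib_root="/usr/lib/postgresql"):
    """Build argv lists: the OLD pg_dumpall for the dump, the NEW psql for the reload."""
    # Assumption: each version keeps its executables in a "bin" subdirectory.
    dump = [f"{lib_root}/{old_version}/bin/pg_dumpall", "-p", str(old_port)]
    load = [f"{lib_root}/{new_version}/bin/psql", "-p", str(new_port),
            "-d", "template1"]
    return dump, load


def dump_and_reload(dump, load):
    """Pipe the dump straight into the new server; raise if either side fails."""
    with subprocess.Popen(dump, stdout=subprocess.PIPE) as producer:
        subprocess.run(load, stdin=producer.stdout, check=True)
    # Leaving the "with" block waits for the dump process to finish.
    if producer.returncode != 0:
        raise subprocess.CalledProcessError(producer.returncode, dump)
```

<p>For instance, <code>dump_and_reload(*dump_and_reload_commands("7.4", 5432,
"8.0", 5433))</code> would move everything from a 7.4 postmaster on port 5432
to an 8.0 postmaster on port 5433, provided both are running and the caller
is allowed to connect to them.</p>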

<h2>General architecture idea</h2>

<p>The Debian packaging has been changed to create a new package for each major
version.  The criterion for creating a new package is that initdb is required
when upgrading from the previous version. Thus, there are now source packages
<code>postgresql-7.4</code> and <code>postgresql-8.0</code> (and similarly for
all the binary packages).</p>

<p>The legacy <code>postgresql</code> package and the other existing binary package names have
become dummy packages depending on one of the versioned equivalents. Their only
purpose is now to ensure a smooth upgrade and to register the existing database
cluster with the new architecture. These packages will be removed from the
archive as soon as Etch, the next Debian release after Sarge, is released.</p>

<p>Each versioned package installs into
<code>/usr/lib/postgresql/<i>version</i></code>.  To let users
easily select the right version and cluster when working, the
<code>postgresql-common</code> package provides the <b>pg_wrapper</b> program,
which reads the per-user and system-wide configuration files and runs the
correct executable with the correct library versions according to those
preferences.  The client programs in <code>/usr/bin</code> are symbolic links to
pg_wrapper.</p>
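
<p>The dispatch idea can be sketched in a few lines of Python.  This is not
the actual pg_wrapper code, only an illustration of the technique: the name
under which the wrapper was invoked selects the program, and the chosen
version selects the directory.  The <code>bin</code> subdirectory and both
function names are assumptions made for the example.</p>

```python
import os

LIB_ROOT = "/usr/lib/postgresql"  # install prefix named in the text


def wrapper_target(invoked_as, version, lib_root=LIB_ROOT):
    """Return the versioned executable for the name the wrapper was invoked under."""
    program = os.path.basename(invoked_as)  # "/usr/bin/psql" gives "psql"
    # Assumption: executables sit in a "bin" subdirectory of the version directory.
    return os.path.join(lib_root, version, "bin", program)


def run_client(argv, version):
    """Replace the current process with the selected client program."""
    target = wrapper_target(argv[0], version)
    # os.execv does not return on success; the client inherits our arguments.
    os.execv(target, [target] + list(argv[1:]))
```

<p>Because the symbolic links in <code>/usr/bin</code> all resolve to the same
wrapper, a single program can serve every client tool; only the invocation
name differs.</p>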

<p>This architecture also allows separate database clusters to be maintained
for the use of different groups of users; these clusters need not all be of the
same major version.  This allows much greater flexibility for those people
who need to make application software changes consequent on a PostgreSQL
upgrade.</p>

<h2>Detailed structure</h2>

<h3>Configuration hierarchy</h3>

<table cellpadding="6" cellspacing="0" border="0">
<tr><td><code>/etc/postgresql-common/user_clusters</code></td> <td>maps users
to clusters and default databases</td></tr>

<tr><td><code>$HOME/.postgresqlrc</code></td> <td>per-user preferences for
default version/cluster and database; overrides
<code>/etc/postgresql-common/user_clusters</code></td></tr>

<tr>
  <td><code>/etc/postgresql-common/autovacuum.conf</code></td>
  <td>Default <code>pg_autovacuum</code> configuration file. This has no effect
  if the package <code>postgresql-contrib-</code><i>version</i> is not
  installed.</td>
</tr>

<tr>
  <td><code>/etc/postgresql/<i>version</i>/<i>clustername</i></code></td>
  <td>Cluster-specific configuration files: <code>postgresql.conf</code>,
  <code>pg_hba.conf</code>, <code>pg_ident.conf</code>, a symbolic link
  <code>pgdata</code> which points to the actual data directory, a symbolic
  link <code>log</code> which points to the postmaster log file, and a symbolic link
  <code>autovacuum_log</code> which points to the log file of the autovacuum
  daemon (started if <code>postgresql-contrib-</code><i>version</i> is
  installed). If this directory contains <code>autovacuum.conf</code>, that file is
  used as the cluster-specific autovacuum daemon configuration; if it does not
  exist, <code>/etc/postgresql-common/autovacuum.conf</code> is used as a
  fallback. If this directory contains <code>start.conf</code>, that file
  configures the startup mode of the cluster: <i>auto</i> (start/stop in init
  script), <i>manual</i> (do not start/stop in init script, but manual control
  with <code>pg_ctlcluster</code> is possible), <i>disabled</i>
  (<code>pg_ctlcluster</code> is not allowed).</td>
</tr>

</table>
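
<p>The fallback and startup-mode rules in the table can be expressed as a small
Python sketch.  It is an illustration under stated assumptions, not the
packaged implementation: the helper names are hypothetical, the mode is taken
to be the first non-comment word in <code>start.conf</code>, and a missing
<code>start.conf</code> is treated as <i>auto</i>.</p>

```python
from pathlib import Path

COMMON_AUTOVACUUM = Path("/etc/postgresql-common/autovacuum.conf")
START_MODES = ("auto", "manual", "disabled")


def autovacuum_config(cluster_dir, common=COMMON_AUTOVACUUM):
    """Prefer the cluster's own autovacuum.conf, else the common fallback."""
    own = Path(cluster_dir) / "autovacuum.conf"
    return own if own.exists() else Path(common)


def start_mode(cluster_dir, default="auto"):
    """Read the startup mode from start.conf in the cluster directory.

    Assumptions: "#" starts a comment, the first remaining word is the mode,
    and a missing or empty file means `default`.
    """
    conf = Path(cluster_dir) / "start.conf"
    if not conf.exists():
        return default
    for line in conf.read_text().splitlines():
        word = line.split("#", 1)[0].strip()
        if word:
            if word not in START_MODES:
                raise ValueError(f"unknown start mode: {word!r}")
            return word
    return default


def init_script_may_start(mode):
    """Only "auto" clusters are started and stopped by the init script."""
    return mode == "auto"


def pg_ctlcluster_allowed(mode):
    """Manual control is possible unless the cluster is disabled."""
    return mode != "disabled"
```

<p>With a cluster directory such as
<code>/etc/postgresql/8.0/main</code> (an example name), a
<code>start.conf</code> containing <code>manual</code> would make
<code>init_script_may_start</code> return false while
<code>pg_ctlcluster_allowed</code> still returns true.</p>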

<h3>Per-version files and programs</h3>

<table cellpadding="6" cellspacing="0" border="0">
<tr><td><code>/usr/lib/postgresql/<i>version</i></code></td> <td colspan="0" rowspan="3" valign="middle">files for a specific version</td></tr>
<tr><td><code>/usr/share/postgresql/<i>version</i></code></td></tr>
<tr><td><code>/usr/share/doc/postgresql/postgresql-doc-<i>version</i></code></td></tr>
</table>

<h3>Common programs</h3>
<table cellpadding="6" cellspacing="0" border="0">
<tr><td><code>/usr/share/postgresql-common/pg_wrapper</code></td> <td>environment chooser and program selector</td></tr>
<tr><td><code>/usr/bin/<i>program</i></code></td>  <td>symbolic links to pg_wrapper, for all client programs</td></tr>
<tr><td><code>/usr/bin/pg_lsclusters</code></td> <td>list all available clusters with their status and configuration</td></tr>
<tr><td><code>/usr/bin/pg_createcluster</code></td><td>wrapper for <code>initdb</code>, sets up the necessary configuration structure</td></tr>
<tr><td><code>/usr/bin/pg_ctlcluster</code></td><td>wrapper for <code>pg_ctl</code>, controls the cluster <b>postmaster</b> server and <b>pg_autovacuum</b> daemon</td></tr>
<tr><td><code>/usr/bin/pg_upgradecluster</code></td><td>upgrade a cluster to a newer major version</td></tr>
<tr><td><code>/usr/bin/pg_dropcluster</code></td><td>remove a cluster and its configuration</td></tr>
</table>

<h3>psql</h3>

<p>We have abandoned the old, non-standard behaviour of aborting with an error when no
database is specified for the connection.  psql is not expected to be run directly; all
connection parameters should be provided by pg_wrapper as described above. In
addition, if no explicit default database is specified in
<code>user_clusters</code>, the default database is the one named after the user,
which restores the default upstream behaviour.</p>
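
<p>The resulting choice of defaults might be sketched as follows.  The
function is hypothetical and takes already-parsed entries rather than reading
the files, since the file formats are not described here; it also assumes that
a per-user entry replaces the system-wide one as a whole.</p>

```python
def connection_defaults(user, system_entry=None, user_entry=None):
    """Pick (version, cluster, database) for a user.

    Each entry is a (version, cluster, database) tuple; database may be None.
    `user_entry` stands for $HOME/.postgresqlrc and, when present, overrides
    `system_entry`, which stands for the matching user_clusters line.
    """
    entry = user_entry if user_entry is not None else system_entry
    if entry is None:
        raise LookupError("no cluster configured for " + user)
    version, cluster, database = entry
    if not database:
        # No explicit default database: fall back to the user name,
        # which matches the upstream psql behaviour described above.
        database = user
    return version, cluster, database
```

<p>For example, a user <code>alice</code> with a system-wide entry of
<code>("7.4", "main", None)</code> and no per-user file would be connected to
the database <code>alice</code> on that cluster.</p>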

<!--
<h3>pg_wrapper</h3>

<p>pg_wrapper has been completely rewritten.  When called a pg_default, it 
allows a user to display his own connection choices, or to change them for
the current session, or for all sessions (by writing <code>~/.postgresqlrc</code>).
When called by root, it  allows user_clusters to be changed.
</p>
<p>When called as pg_exec, pg_wrapper can execute code from an
arbitrary version in order to connect to a remote machine.</p>

<p>When called as a link to any other name, that name is treated as a
client program and a path to the appropriate version of that program is
constructed and executed.</p>

<p>See the man pages for full details of the program's operation.</p>
-->

<h3>/etc/init.d/postgresql-<i>version</i></h3>

<p>This script now handles the postmaster server processes for each
cluster. However, most of the actual work is done by the new
<code>pg_ctlcluster</code> program.</p>

<h3>pg_upgradecluster</h3>

<p>This new program replaces postgresql-dump (a Debian-specific program).</p>

<p>It is used to migrate a cluster from one major version to another.</p>

<p>Usage: <code>pg_upgradecluster</code> [<code>-v</code> <i>newversion</i>]
<i>version name</i> [<i>data_dir</i>]</p>

<p><code>-v</code> <i>newversion</i> specifies the version to upgrade to; it defaults
to the newest available version.</p>
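
<p>A script that drives upgrades could assemble the command line from the
usage shown above.  The helper below is a hypothetical sketch; the cluster
name <code>main</code> and the version numbers in the usage note are example
values only.</p>

```python
def upgrade_command(version, name, new_version=None, data_dir=None):
    """Build the pg_upgradecluster argv following the documented usage."""
    argv = ["pg_upgradecluster"]
    if new_version is not None:
        # Optional: without -v the tool picks the newest available version.
        argv += ["-v", new_version]
    argv += [version, name]
    if data_dir is not None:
        # Optional trailing data directory argument.
        argv.append(data_dir)
    return argv
```

<p>Calling <code>upgrade_command("7.4", "main", new_version="8.0")</code>
yields the arguments for upgrading the 7.4 cluster <code>main</code> to 8.0;
the list can then be passed to <code>subprocess.run</code>.</p>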

<p><i><a href="mailto:pkg-postgresql-public@lists.alioth.debian.org">The Debian
PostgreSQL developers</a> (Oliver Elphick, Martin Pitt)</i></p>

</body>
</html>