File: README.md

package info (click to toggle)
pg-rewrite 2.0.0-2
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid
  • size: 348 kB
  • sloc: ansic: 3,132; sql: 199; makefile: 17; sh: 2
file content (277 lines) | stat: -rw-r--r-- 11,494 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
# pg_rewrite

`pg_rewrite` is a tool to rewrite table (i.e. to copy its data to a new
file). It allows both read and write access to the table during the rewriting.

Following are the most common reasons to rewrite a table:

1.  Change data type of column(s)

    Typically this is needed if the existing data type is running out of
    values. For example, you may need to change `interger` type to
    `bigint`. `ALTER TABLE` command can do that too, but it allows neither
    write nor read access to the table during the rewriting.

2.  Partition the table

    If you realize that your table is getting much bigger than expected and
    that partitioning would make your life easier, the next question may be
    how to copy the existing data to the new, partitioned table without
    stopping all the applications that run DML commands on the table. (When
    you decide to use partitioning, the amount of data to copy might already
    be significant, so the copying might need a while.)

3.  Change order of columns

    If you conclude that a different order of columns would save significant
    disk space (due to reduced paddding), the problem boils down to copying
    data to a new table like in 2). Again, you may need `pg_rewrite` to make
    the change smooth.

4.  Move table into another tablespace.

    `ALTER TABLE` command can do that, but it allows neither write nor read
    access to the table during the rewriting. With `pg_rewrite`, you only need
    to create the new table in the desired tablespace. The rest is identical
    to the other use cases.

Note that the following use cases can be combined in a single rewrited.


# INSTALLATION

Install PostgreSQL before proceeding. Make sure to have `pg_config` binary,
these are typically included in `-dev` and `-devel` packages. PostgreSQL server
version 13 or later is required.

```bash
git clone https://github.com/cybertec-postgresql/pg_rewrite.git
cd pg_rewrite
git checkout <the latest stable version>
make
make install
```

Add these to `postgresql.conf`:

```
wal_level = logical
max_replication_slots = 1 # ... or add 1 to the current value.
shared_preload_libraries = 'pg_rewrite' # ... or add the library to the existing ones.
```

Restart the cluster, and invoke:

```
CREATE EXTENSION pg_rewrite;
```

# USAGE

Assume you have a table defined like this

```
CREATE TABLE measurement (
    id              int,
    city_id         int not null,
    logdate         date not null,
    peaktemp        int,
    PRIMARY KEY(id, logdate)
);
```

and you need to replace it with a partitioned table. At the same time, you
want to change the data type of the `id` column to `bigint`.


```
CREATE TABLE measurement_aux (
    id              bigint,
    city_id         int not null,
    logdate         date not null,
    peaktemp        int,
    PRIMARY KEY(id, logdate)
) PARTITION BY RANGE (logdate);
```

Then create partitions for all the rows currently present in the `measurement`
table, and also for the data that might be inserted during processing:

```
CREATE TABLE measurement_y2006m02 PARTITION OF measurement_aux
    FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');

CREATE TABLE measurement_y2006m03 PARTITION OF measurement_aux
    FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');

-- ...
```

*It's essential that both the source (`measurement`) and target
(`measurement_aux`) table have an identity index. It is needed to process data
changes that applications make while data is being copied from the source to
the target table. If the replica identity of the table is DEFAULT or FULL,
primary key constraint provides the identity index. If your table has no
primary key, you need to set the identity index explicitly using the [ALTER
COMMAND ... REPLICA IDENTITY USING INDEX ...][1] command.

Also note that the key (i.e. column list) of the identity index of the source
and target table must be identical.*

Then, in order to copy the data into the target table, run the
`rewrite_table()` function and pass it both the source and target table, as
well as a new table name for the source table. For example:

```
SELECT rewrite_table('measurement', 'measurement_aux', 'measurement_old');
```

The call will first copy all rows from `measurement` to `measurement_aux`. Then
it will apply to `measurement_aux` all the data changes (INSERT, UPDATE,
DELETE) that took place in `measurement` during the copying. Next, it will
lock `measurement` so that neither read nor write access is possible. Finally
it will rename `measurement` to `measurement_old` and `measurement_aux` to
`measurement`. Thus `measurement` ends up to be the partitioned table, while
`measurement_old` is the original, non-partitioned table.

If a column of the target table has a different data type from the
corresponding column of the source table, an implicit or assignment cast must
exist between the two types.

# Constraints

The target table should obviously end up with the same constraints as the
source table. It's recommended to handle constraints creation this way:

1.  Add PRIMARY KEY, UNIQUE and EXCLUDE constraints of the source table to the
    target table before you call `rewrite_table()`. These are enforced during
    the rewriting, so any violation would make `rewrite_table()` fail
    (ROLLBACK). (The constraints must have been enforced in the source table,
    but it does not hurt to check them in the target table, especially if the
    column data type is being changed.)

2.  If the version of PostgreSQL server is 17 or lower, add NOT NULL
    constraints of the source table to the target table. `rewrite_table()`
    by-passes validation of these, but all the rows it inserts into the target
    table must have been validated in the source table. Even if the column
    data tape is different in the target table, the data type conversion
    should not turn non-NULL value to NULL or vice versa.

3.  CHECK constraints are created automatically by `rewrite_table()`
    (according to the source table) when all the data changes have been
    applied to the target table. However, these constraints are created as NOT
    VALID, so you need to use the `ALTER TABLE ... VALIDATE CONSTRAINT ...`
    command to validate them.

    (The function does not create these constraints immediately as valid,
    because that could imply blocking access to the table for significant
    time.)

4.  If the version of PostgreSQL server is 18 or higher, NOT NULL constraints
    are also created automatically and need to be validated using the `ALTER
    TABLE ... VALIDATE CONSTRAINT ...` command.

5.  FOREIGN KEY constraints are also created automatically (according to the
    source table) and need to be validated using the `ALTER TABLE ... VALIDATE
    CONSTRAINT ...` command, unless the referencing table is partitioned and
    the version of PostgreSQL server is 17 or lower: those versions do not
    support the NOT VALID option for partitioned tables.

    Therefore, if the referencing table is partitioned and if the server
    version is 17 or lower, you need to use the `ALTER TABLE ... ADD
    CONSTRAINT ... FOREIGN KEY ...` command after `rewrite_table()` has
    finished. Please run the command as soon as possible to minimize the risk
    that applications modify the database in a way that violates the
    constraints.

6.  Drop all foreign keys involving the source table.

    You probably want to drop the source table anyway, but if you don't, you
    should at least drop its FOREIGN KEY constraints. As the table was
    renamed, applications will no longer update it. Therefore, attempts to
    update the other tables involved in its foreign keys may cause errors.

# Sequences

If a sequence is used to generate column value in the source table (typically
the column data type is `serial` or the column is declared `GENARATED ... AS
IDENTITY`), and if `rewrite_table()` finds the corresponding sequence for the
target table, it sets its value according to the sequence for the source
table. If it cannot identify the sequence for the target table, a log message
is printed out.

# Progress monitoring

If `rewrite_table()` takes long time to finish, you might be interested in the
progress. The `pg_rewrite_progress` view shows all the pending calls of the
function in the current database. The `src_table`, `dst_table` and
`src_table_new` columns contain the arguments of the `rewrite_table()`
function. `ins_initial` is the number of tuples inserted into the new table
storage during the "initial load stage", i.e. the number of tuples present in
the table before the processing started. On the other hand, `ins`, `upd` and
`del` are the numbers of tuples inserted, updated and deleted by applications
during the table processing. (These "concurrent data changes" must also be
incorporated into the partitioned table, otherwise they'd get lost.)

# Limitations

1.  If the target table is partitioned, it's not allowed to have foreign
    tables as partitions.

2.  Indexes are not renamed.

    While the target table (`measurement_aux` above) is renamed to the source
    table (`measurement`), its indexes are not renamed to match the source
    table. If you consider it a problem, please use the `ALTER INDEX` command
    to rename them. This operation blocks neither reads nor writes.

# Configuration

Following is the description of the configuration variables that affect
behavior of the functions of this extension.

* `rewrite.max_xlock_time`

Although the table being processed is available for both read and write
operations by other transactions most of the time, an exclusive lock is needed
to finalize the processing (i.e. to do the table renaming), which blocks both
read and write access. This should take very short time that users should
harly notice.

However, if a significant amount of changes took place in the source table
while the extension was waiting for the (exclusive) lock, the outage might
take proportionally longer time. The point is that those changes need to be
propagated to the target table before the exclusive lock can be released.

If the extension function seems to block access to tables too much, consider
setting `rewrite.max_xlock_time` GUC parameter. For example:

```
SET rewrite.max_xlock_time TO 100;
```

Tells that the exclusive lock shouldn't be held for more than 0.1 second (100
milliseconds). If more time is needed for the final stage, the particular
function releases the exclusive lock, processes the changes committed by the
other transactions in between and tries the final stage again. Error is
reported if the lock duration is exceeded a few more times. If that happens,
you should either increase the setting or try to process the problematic table
later, when the write activity is lower.

The default value is `0`, meaning that the final stage can take as much time as
it needs.

# Concurrency

1. While the rewrite_table() function is executing, `ALTER TABLE` command on
   the same table should be blocked until the rewriting is done. However, in
   some cases the `ALTER TABLE` command and the rewrite_table() function might
   end up in a deadlock. Therefore it's recommended not to run ALTER TABLE on
   a table which is being rewritten.

2. The `rewrite_table()` function allows for MVCC-unsafe behavior described in
   the first paragraph of [mvcc-caveats][2].


[1] https://www.postgresql.org/docs/17/sql-altertable.html
[2] https://www.postgresql.org/docs/current/mvcc-caveats.html