File: list.md

package info (click to toggle)
gitlab 17.6.5-19
  • links: PTS, VCS
  • area: main
  • in suites: sid
  • size: 629,368 kB
  • sloc: ruby: 1,915,304; javascript: 557,307; sql: 60,639; xml: 6,509; sh: 4,567; makefile: 1,239; python: 406
file content (362 lines) | stat: -rw-r--r-- 11,556 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
---
stage: Data Stores
group: Database
info: Any user with at least the Maintainer role can merge updates to this content. For details, see https://docs.gitlab.com/ee/development/development_processes.html#development-guidelines-review.
---

# List partition

> - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/96815) in GitLab 15.4.

## Description

Add the partitioning key column to the table you are partitioning.
Include the partitioning key in the following constraints:

- The primary key.
- All foreign keys referencing the table to be partitioned.
- All unique constraints.

## Example

### Step 1 - Add partition key

Add the partitioning key column. For example, in a rails migration:

```ruby
class AddPartitionNumberForPartitioning < Gitlab::Database::Migration[2.1]
  enable_lock_retries!

  TABLE_NAME = :table_name
  COLUMN_NAME = :partition_id
  DEFAULT_VALUE = 100

  def change
    add_column(TABLE_NAME, COLUMN_NAME, :bigint, default: 100)
  end
end
```

### Step 2 - Create required indexes

Add indexes including the partitioning key column. For example, in a rails migration:

```ruby
class PrepareIndexesForPartitioning < Gitlab::Database::Migration[2.1]
  disable_ddl_transaction!

  TABLE_NAME = :table_name
  INDEX_NAME = :index_name

  def up
    add_concurrent_index(TABLE_NAME, [:id, :partition_id], unique: true, name: INDEX_NAME)
  end

  def down
    remove_concurrent_index_by_name(TABLE_NAME, INDEX_NAME)
  end
end
```

### Step 3 - Enforce unique constraint

Change all unique indexes to include the partitioning key column,
including the primary key index. You can start by adding an unique
index on `[primary_key_column, :partition_id]`, which will be
required for the next two steps. For example, in a rails migration:

```ruby
class PrepareUniqueContraintForPartitioning < Gitlab::Database::Migration[2.1]
  disable_ddl_transaction!

  TABLE_NAME = :table_name
  OLD_UNIQUE_INDEX_NAME = :index_name_unique
  NEW_UNIQUE_INDEX_NAME = :new_index_name

  def up
    add_concurrent_index(TABLE_NAME, [:id, :partition_id], unique: true, name: NEW_UNIQUE_INDEX_NAME)

    remove_concurrent_index_by_name(TABLE_NAME, OLD_UNIQUE_INDEX_NAME)
  end

  def down
    add_concurrent_index(TABLE_NAME, :id, unique: true, name: OLD_UNIQUE_INDEX_NAME)

    remove_concurrent_index_by_name(TABLE_NAME, NEW_UNIQUE_INDEX_NAME)
  end
end
```

### Step 4 - Enforce foreign key constraint

Enforce foreign keys including the partitioning key column. For example, in a rails migration:

```ruby
class PrepareForeignKeyForPartitioning < Gitlab::Database::Migration[2.1]
  disable_ddl_transaction!

  SOURCE_TABLE_NAME = :source_table_name
  TARGET_TABLE_NAME = :target_table_name
  COLUMN = :foreign_key_id
  TARGET_COLUMN = :id
  FK_NAME = :fk_365d1db505_p
  PARTITION_COLUMN = :partition_id

  def up
    add_concurrent_foreign_key(
      SOURCE_TABLE_NAME,
      TARGET_TABLE_NAME,
      column: [PARTITION_COLUMN, COLUMN],
      target_column: [PARTITION_COLUMN, TARGET_COLUMN],
      validate: false,
      on_update: :cascade,
      name: FK_NAME
    )

    # This should be done in a separate post migration when dealing with a high traffic table
    validate_foreign_key(TABLE_NAME, [PARTITION_COLUMN, COLUMN], name: FK_NAME)
  end

  def down
    with_lock_retries do
      remove_foreign_key_if_exists(SOURCE_TABLE_NAME, name: FK_NAME)
    end
  end
end
```

The `on_update: :cascade` option is mandatory if we want the partitioning column
to be updated. This will cascade the update to all dependent rows. Without
specifying it, updating the partition column on the target table we would
result in a `Key is still referenced from table ...` error and updating the
partition column on the source table would raise a
`Key is not present in table ...` error.

### Step 5 - Swap primary key

Swap the primary key including the partitioning key column. This can be done only after
including the partition key for all references foreign keys. For example, in a rails migration:

```ruby
class PreparePrimaryKeyForPartitioning < Gitlab::Database::Migration[2.1]
  disable_ddl_transaction!

  TABLE_NAME = :table_name
  PRIMARY_KEY = :primary_key
  OLD_INDEX_NAME = :old_index_name
  NEW_INDEX_NAME = :new_index_name

  def up
    swap_primary_key(TABLE_NAME, PRIMARY_KEY, NEW_INDEX_NAME)
  end

  def down
    add_concurrent_index(TABLE_NAME, :id, unique: true, name: OLD_INDEX_NAME)
    add_concurrent_index(TABLE_NAME, [:id, :partition_id], unique: true, name: NEW_INDEX_NAME)

    unswap_primary_key(TABLE_NAME, PRIMARY_KEY, OLD_INDEX_NAME)

    # We need to add back referenced FKs if any, eg: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/113725/diffs
  end
end
```

NOTE:
Do not forget to set the primary key explicitly in your model as `ActiveRecord` does not support composite primary keys.

```ruby
class Model < ApplicationRecord
  self.primary_key = :id
end
```

### Step 6 - Create parent table and attach existing table as the initial partition

You can now create the parent table attaching the existing table as the initial
partition by using the following helpers provided by the database team.

For example, using list partitioning in Rails post migrations:

```ruby
class PrepareTableConstraintsForListPartitioning < Gitlab::Database::Migration[2.1]
  include Gitlab::Database::PartitioningMigrationHelpers::TableManagementHelpers

  disable_ddl_transaction!

  TABLE_NAME = :table_name
  PARENT_TABLE_NAME = :p_table_name
  FIRST_PARTITION = 100
  PARTITION_COLUMN = :partition_id

  def up
    prepare_constraint_for_list_partitioning(
      table_name: TABLE_NAME,
      partitioning_column: PARTITION_COLUMN,
      parent_table_name: PARENT_TABLE_NAME,
      initial_partitioning_value: FIRST_PARTITION
    )
  end

  def down
    revert_preparing_constraint_for_list_partitioning(
      table_name: TABLE_NAME,
      partitioning_column: PARTITION_COLUMN,
      parent_table_name: PARENT_TABLE_NAME,
      initial_partitioning_value: FIRST_PARTITION
    )
  end
end
```

```ruby
class ConvertTableToListPartitioning < Gitlab::Database::Migration[2.1]
  include Gitlab::Database::PartitioningMigrationHelpers::TableManagementHelpers

  disable_ddl_transaction!

  TABLE_NAME = :table_name
  PARENT_TABLE_NAME = :p_table_name
  FIRST_PARTITION = 100
  PARTITION_COLUMN = :partition_id

  def up
    convert_table_to_first_list_partition(
      table_name: TABLE_NAME,
      partitioning_column: PARTITION_COLUMN,
      parent_table_name: PARENT_TABLE_NAME,
      initial_partitioning_value: FIRST_PARTITION
    )
  end

  def down
    revert_converting_table_to_first_list_partition(
      table_name: TABLE_NAME,
      partitioning_column: PARTITION_COLUMN,
      parent_table_name: PARENT_TABLE_NAME,
      initial_partitioning_value: FIRST_PARTITION
    )
  end
end
```

NOTE:
Do not forget to set the sequence name explicitly in your model because it will
be owned by the routing table and `ActiveRecord` can't determine it. This can
be cleaned up after the `table_name` is changed to the routing table.

```ruby
class Model < ApplicationRecord
  self.sequence_name = 'model_id_seq'
end
```

If the partitioning constraint migration takes [more than 10 minutes](../../migration_style_guide.md#how-long-a-migration-should-take) to finish,
it can be made to run asynchronously to avoid running the post-migration during busy hours.

Prepend the following migration `AsyncPrepareTableConstraintsForListPartitioning`
and use `async: true` option. This change marks the partitioning constraint as `NOT VALID`
and enqueues a scheduled job to validate the existing data in the table during the weekend.

Then the second post-migration `PrepareTableConstraintsForListPartitioning` only
marks the partitioning constraint as validated, because the existing data is already
tested during the previous weekend.

For example:

```ruby
class AsyncPrepareTableConstraintsForListPartitioning < Gitlab::Database::Migration[2.1]
  include Gitlab::Database::PartitioningMigrationHelpers::TableManagementHelpers

  disable_ddl_transaction!

  TABLE_NAME = :table_name
  PARENT_TABLE_NAME = :p_table_name
  FIRST_PARTITION = 100
  PARTITION_COLUMN = :partition_id

  def up
    prepare_constraint_for_list_partitioning(
      table_name: TABLE_NAME,
      partitioning_column: PARTITION_COLUMN,
      parent_table_name: PARENT_TABLE_NAME,
      initial_partitioning_value: FIRST_PARTITION,
      async: true
    )
  end

  def down
    revert_preparing_constraint_for_list_partitioning(
      table_name: TABLE_NAME,
      partitioning_column: PARTITION_COLUMN,
      parent_table_name: PARENT_TABLE_NAME,
      initial_partitioning_value: FIRST_PARTITION
    )
  end
end
```

### Step 7 - Re-point foreign keys to parent table

The tables that reference the initial partition must be updated to point to the
parent table now. Without this change, the records from those tables
will not be able to locate the rows in the next partitions because they will look
for them in the initial partition.

Steps:

- Add the foreign key to the partitioned table and validate it asynchronously,
  [for example](https://gitlab.com/gitlab-org/gitlab/-/blob/65d63f6a00196c3a7d59f15191920f271ab2b145/db/post_migrate/20230524135543_replace_ci_build_pending_states_foreign_key.rb).
- Validate it synchronously after the asynchronously validation was completed on GitLab.com,
  [for example](https://gitlab.com/gitlab-org/gitlab/-/blob/65d63f6a00196c3a7d59f15191920f271ab2b145/db/post_migrate/20230530140456_validate_fk_ci_build_pending_states_p_ci_builds.rb).
- Remove the old foreign key and rename the new one to the old name,
  [for example](https://gitlab.com/gitlab-org/gitlab/-/blob/65d63f6a00196c3a7d59f15191920f271ab2b145/db/post_migrate/20230615083713_replace_old_fk_ci_build_pending_states_to_builds.rb#L9).

### Step 8 - Ensure ID uniqueness across partitions

All uniqueness constraints must include the partitioning key, so we can have
duplicate IDs across partitions. To solve this we enforce that only the database
can set the ID values and use a sequence to generate them because sequences are
guaranteed to generate unique values.

For example:

```ruby
class EnsureIdUniquenessForPCiBuilds < Gitlab::Database::Migration[2.1]
  include Gitlab::Database::PartitioningMigrationHelpers::UniquenessHelpers

  enable_lock_retries!

  TABLE_NAME = :p_ci_builds
  SEQ_NAME = :ci_builds_id_seq

  def up
    ensure_unique_id(TABLE_NAME, seq: SEQ_NAME)
  end

  def down
    revert_ensure_unique_id(TABLE_NAME, seq: SEQ_NAME)
  end
end
```

### Step 9 - Analyze the partitioned table and create new partitions

The autovacuum daemon does not process partitioned tables. It is necessary to
periodically run a manual `ANALYZE` to keep the statistics of the table hierarchy
up to date.

Models that implement `Ci::Partitionable` with `partitioned: true` option are
analyzed by default on a weekly basis. To enable this and create new partitions
you need to register the model in the [PostgreSQL initializer](https://gitlab.com/gitlab-org/gitlab/-/blob/b7f0e3f1bcd2ffc220768bbc373364151775ca8e/config/initializers/postgres_partitioning.rb).

### Step 10 - Update the application to use the partitioned table

Now that the parent table is ready, we can update the application to use it:

```ruby
class Model < ApplicationRecord
  self.table_name = :partitioned_table
end
```

Depending on the model, it might be safer to use a [change management issue](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/16387).