1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362
|
---
stage: Data Stores
group: Database
info: Any user with at least the Maintainer role can merge updates to this content. For details, see https://docs.gitlab.com/ee/development/development_processes.html#development-guidelines-review.
---
# List partition
> - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/96815) in GitLab 15.4.
## Description
Add the partitioning key column to the table you are partitioning.
Include the partitioning key in the following constraints:
- The primary key.
- All foreign keys referencing the table to be partitioned.
- All unique constraints.
## Example
### Step 1 - Add partition key
Add the partitioning key column. For example, in a rails migration:
```ruby
class AddPartitionNumberForPartitioning < Gitlab::Database::Migration[2.1]
enable_lock_retries!
TABLE_NAME = :table_name
COLUMN_NAME = :partition_id
DEFAULT_VALUE = 100
def change
add_column(TABLE_NAME, COLUMN_NAME, :bigint, default: 100)
end
end
```
### Step 2 - Create required indexes
Add indexes including the partitioning key column. For example, in a rails migration:
```ruby
class PrepareIndexesForPartitioning < Gitlab::Database::Migration[2.1]
disable_ddl_transaction!
TABLE_NAME = :table_name
INDEX_NAME = :index_name
def up
add_concurrent_index(TABLE_NAME, [:id, :partition_id], unique: true, name: INDEX_NAME)
end
def down
remove_concurrent_index_by_name(TABLE_NAME, INDEX_NAME)
end
end
```
### Step 3 - Enforce unique constraint
Change all unique indexes to include the partitioning key column,
including the primary key index. You can start by adding an unique
index on `[primary_key_column, :partition_id]`, which will be
required for the next two steps. For example, in a rails migration:
```ruby
class PrepareUniqueContraintForPartitioning < Gitlab::Database::Migration[2.1]
disable_ddl_transaction!
TABLE_NAME = :table_name
OLD_UNIQUE_INDEX_NAME = :index_name_unique
NEW_UNIQUE_INDEX_NAME = :new_index_name
def up
add_concurrent_index(TABLE_NAME, [:id, :partition_id], unique: true, name: NEW_UNIQUE_INDEX_NAME)
remove_concurrent_index_by_name(TABLE_NAME, OLD_UNIQUE_INDEX_NAME)
end
def down
add_concurrent_index(TABLE_NAME, :id, unique: true, name: OLD_UNIQUE_INDEX_NAME)
remove_concurrent_index_by_name(TABLE_NAME, NEW_UNIQUE_INDEX_NAME)
end
end
```
### Step 4 - Enforce foreign key constraint
Enforce foreign keys including the partitioning key column. For example, in a rails migration:
```ruby
class PrepareForeignKeyForPartitioning < Gitlab::Database::Migration[2.1]
disable_ddl_transaction!
SOURCE_TABLE_NAME = :source_table_name
TARGET_TABLE_NAME = :target_table_name
COLUMN = :foreign_key_id
TARGET_COLUMN = :id
FK_NAME = :fk_365d1db505_p
PARTITION_COLUMN = :partition_id
def up
add_concurrent_foreign_key(
SOURCE_TABLE_NAME,
TARGET_TABLE_NAME,
column: [PARTITION_COLUMN, COLUMN],
target_column: [PARTITION_COLUMN, TARGET_COLUMN],
validate: false,
on_update: :cascade,
name: FK_NAME
)
# This should be done in a separate post migration when dealing with a high traffic table
validate_foreign_key(TABLE_NAME, [PARTITION_COLUMN, COLUMN], name: FK_NAME)
end
def down
with_lock_retries do
remove_foreign_key_if_exists(SOURCE_TABLE_NAME, name: FK_NAME)
end
end
end
```
The `on_update: :cascade` option is mandatory if we want the partitioning column
to be updated. This will cascade the update to all dependent rows. Without
specifying it, updating the partition column on the target table we would
result in a `Key is still referenced from table ...` error and updating the
partition column on the source table would raise a
`Key is not present in table ...` error.
### Step 5 - Swap primary key
Swap the primary key including the partitioning key column. This can be done only after
including the partition key for all references foreign keys. For example, in a rails migration:
```ruby
class PreparePrimaryKeyForPartitioning < Gitlab::Database::Migration[2.1]
disable_ddl_transaction!
TABLE_NAME = :table_name
PRIMARY_KEY = :primary_key
OLD_INDEX_NAME = :old_index_name
NEW_INDEX_NAME = :new_index_name
def up
swap_primary_key(TABLE_NAME, PRIMARY_KEY, NEW_INDEX_NAME)
end
def down
add_concurrent_index(TABLE_NAME, :id, unique: true, name: OLD_INDEX_NAME)
add_concurrent_index(TABLE_NAME, [:id, :partition_id], unique: true, name: NEW_INDEX_NAME)
unswap_primary_key(TABLE_NAME, PRIMARY_KEY, OLD_INDEX_NAME)
# We need to add back referenced FKs if any, eg: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/113725/diffs
end
end
```
NOTE:
Do not forget to set the primary key explicitly in your model as `ActiveRecord` does not support composite primary keys.
```ruby
class Model < ApplicationRecord
self.primary_key = :id
end
```
### Step 6 - Create parent table and attach existing table as the initial partition
You can now create the parent table attaching the existing table as the initial
partition by using the following helpers provided by the database team.
For example, using list partitioning in Rails post migrations:
```ruby
class PrepareTableConstraintsForListPartitioning < Gitlab::Database::Migration[2.1]
include Gitlab::Database::PartitioningMigrationHelpers::TableManagementHelpers
disable_ddl_transaction!
TABLE_NAME = :table_name
PARENT_TABLE_NAME = :p_table_name
FIRST_PARTITION = 100
PARTITION_COLUMN = :partition_id
def up
prepare_constraint_for_list_partitioning(
table_name: TABLE_NAME,
partitioning_column: PARTITION_COLUMN,
parent_table_name: PARENT_TABLE_NAME,
initial_partitioning_value: FIRST_PARTITION
)
end
def down
revert_preparing_constraint_for_list_partitioning(
table_name: TABLE_NAME,
partitioning_column: PARTITION_COLUMN,
parent_table_name: PARENT_TABLE_NAME,
initial_partitioning_value: FIRST_PARTITION
)
end
end
```
```ruby
class ConvertTableToListPartitioning < Gitlab::Database::Migration[2.1]
include Gitlab::Database::PartitioningMigrationHelpers::TableManagementHelpers
disable_ddl_transaction!
TABLE_NAME = :table_name
PARENT_TABLE_NAME = :p_table_name
FIRST_PARTITION = 100
PARTITION_COLUMN = :partition_id
def up
convert_table_to_first_list_partition(
table_name: TABLE_NAME,
partitioning_column: PARTITION_COLUMN,
parent_table_name: PARENT_TABLE_NAME,
initial_partitioning_value: FIRST_PARTITION
)
end
def down
revert_converting_table_to_first_list_partition(
table_name: TABLE_NAME,
partitioning_column: PARTITION_COLUMN,
parent_table_name: PARENT_TABLE_NAME,
initial_partitioning_value: FIRST_PARTITION
)
end
end
```
NOTE:
Do not forget to set the sequence name explicitly in your model because it will
be owned by the routing table and `ActiveRecord` can't determine it. This can
be cleaned up after the `table_name` is changed to the routing table.
```ruby
class Model < ApplicationRecord
self.sequence_name = 'model_id_seq'
end
```
If the partitioning constraint migration takes [more than 10 minutes](../../migration_style_guide.md#how-long-a-migration-should-take) to finish,
it can be made to run asynchronously to avoid running the post-migration during busy hours.
Prepend the following migration `AsyncPrepareTableConstraintsForListPartitioning`
and use `async: true` option. This change marks the partitioning constraint as `NOT VALID`
and enqueues a scheduled job to validate the existing data in the table during the weekend.
Then the second post-migration `PrepareTableConstraintsForListPartitioning` only
marks the partitioning constraint as validated, because the existing data is already
tested during the previous weekend.
For example:
```ruby
class AsyncPrepareTableConstraintsForListPartitioning < Gitlab::Database::Migration[2.1]
include Gitlab::Database::PartitioningMigrationHelpers::TableManagementHelpers
disable_ddl_transaction!
TABLE_NAME = :table_name
PARENT_TABLE_NAME = :p_table_name
FIRST_PARTITION = 100
PARTITION_COLUMN = :partition_id
def up
prepare_constraint_for_list_partitioning(
table_name: TABLE_NAME,
partitioning_column: PARTITION_COLUMN,
parent_table_name: PARENT_TABLE_NAME,
initial_partitioning_value: FIRST_PARTITION,
async: true
)
end
def down
revert_preparing_constraint_for_list_partitioning(
table_name: TABLE_NAME,
partitioning_column: PARTITION_COLUMN,
parent_table_name: PARENT_TABLE_NAME,
initial_partitioning_value: FIRST_PARTITION
)
end
end
```
### Step 7 - Re-point foreign keys to parent table
The tables that reference the initial partition must be updated to point to the
parent table now. Without this change, the records from those tables
will not be able to locate the rows in the next partitions because they will look
for them in the initial partition.
Steps:
- Add the foreign key to the partitioned table and validate it asynchronously,
[for example](https://gitlab.com/gitlab-org/gitlab/-/blob/65d63f6a00196c3a7d59f15191920f271ab2b145/db/post_migrate/20230524135543_replace_ci_build_pending_states_foreign_key.rb).
- Validate it synchronously after the asynchronously validation was completed on GitLab.com,
[for example](https://gitlab.com/gitlab-org/gitlab/-/blob/65d63f6a00196c3a7d59f15191920f271ab2b145/db/post_migrate/20230530140456_validate_fk_ci_build_pending_states_p_ci_builds.rb).
- Remove the old foreign key and rename the new one to the old name,
[for example](https://gitlab.com/gitlab-org/gitlab/-/blob/65d63f6a00196c3a7d59f15191920f271ab2b145/db/post_migrate/20230615083713_replace_old_fk_ci_build_pending_states_to_builds.rb#L9).
### Step 8 - Ensure ID uniqueness across partitions
All uniqueness constraints must include the partitioning key, so we can have
duplicate IDs across partitions. To solve this we enforce that only the database
can set the ID values and use a sequence to generate them because sequences are
guaranteed to generate unique values.
For example:
```ruby
class EnsureIdUniquenessForPCiBuilds < Gitlab::Database::Migration[2.1]
include Gitlab::Database::PartitioningMigrationHelpers::UniquenessHelpers
enable_lock_retries!
TABLE_NAME = :p_ci_builds
SEQ_NAME = :ci_builds_id_seq
def up
ensure_unique_id(TABLE_NAME, seq: SEQ_NAME)
end
def down
revert_ensure_unique_id(TABLE_NAME, seq: SEQ_NAME)
end
end
```
### Step 9 - Analyze the partitioned table and create new partitions
The autovacuum daemon does not process partitioned tables. It is necessary to
periodically run a manual `ANALYZE` to keep the statistics of the table hierarchy
up to date.
Models that implement `Ci::Partitionable` with `partitioned: true` option are
analyzed by default on a weekly basis. To enable this and create new partitions
you need to register the model in the [PostgreSQL initializer](https://gitlab.com/gitlab-org/gitlab/-/blob/b7f0e3f1bcd2ffc220768bbc373364151775ca8e/config/initializers/postgres_partitioning.rb).
### Step 10 - Update the application to use the partitioned table
Now that the parent table is ready, we can update the application to use it:
```ruby
class Model < ApplicationRecord
self.table_name = :partitioned_table
end
```
Depending on the model, it might be safer to use a [change management issue](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/16387).
|