1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348
|
---
stage: Data Stores
group: Tenant Scale
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://handbook.gitlab.com/handbook/product/ux/technical-writing/#assignments
---
# Multiple Databases
DETAILS:
**Tier:** Free, Premium, Ultimate
**Offering:** Self-managed
> - [Introduced](https://gitlab.com/groups/gitlab-org/-/epics/6168) in GitLab 15.7.
WARNING:
This feature is not ready for production use
By default, GitLab uses a single application database, referred to as the `main` database.
To scale GitLab, you can configure GitLab to use multiple application databases.
Due to [known issues](#known-issues), configuring GitLab with multiple databases is in limited [beta](../../policy/experiment-beta-support.md#beta).
After you have set up multiple databases, GitLab uses a second application database for
[CI/CD features](../../ci/index.md), referred to as the `ci` database. We do not exclude hosting both databases on a single PostgreSQL instance.
All tables have exactly the same structure in both the `main`, and `ci`
databases. Some examples:
- When multiple databases are configured, the `ci_pipelines` table exists in
both the `main` and `ci` databases, but GitLab reads and writes only to the
`ci_pipelines` table in the `ci` database.
- Similarly, the `projects` table exists in
both the `main` and `ci` databases, but GitLab reads and writes only to the
`projects` table in the `main` database.
- For some tables (such as `loose_foreign_keys_deleted_records`) GitLab reads and writes to both the `main` and `ci` databases. See the
[development documentation](../../development/database/multiple_databases.md#gitlab-schema)
## Known issues
- Once data is migrated to the `ci` database, you cannot migrate it back.
- Significant downtime is expected for larger installations (database sizes of more 100 GB).
- Running two databases [is not yet supported with Geo](https://gitlab.com/groups/gitlab-org/-/epics/8631).
## Migrate existing installations using a script
> - A script for migrating existing Linux package installations was [introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/368729) in GitLab 16.8.
### Existing Linux package installations
This migration requires downtime.
If something unexpected happens during the migration, it is safe to start over.
#### Preparation
1. Verify available disk space:
- The database node that will store the `gitlabhq_production_ci` database needs enough space to store a copy of the existing database: we _duplicate_ `gitlabhq_production`. Run the following SQL query to find out how much space is needed. Add 25%, to ensure you will not run out of disk space.
```shell
sudo gitlab-psql -c "SELECT pg_size_pretty( pg_database_size('gitlabhq_production') );"
```
- During the process, a dump of the `gitlabhq_production` database needs to be temporarily stored on the filesystem of the node that will run the migration. Execute the following SQL statement to find out how much local disk space will be used. Add 25%, to ensure you will not run out of disk space.
```shell
sudo gitlab-psql -c "select sum(pg_table_size(concat(table_schema,'.',table_name))) from information_schema.tables where table_catalog = 'gitlabhq_production' and table_type = 'BASE TABLE'"
```
1. Plan for downtime. The downtime is dependent on the size of the `gitlabhq_production` database.
- We dump `gitlabhq_production` and restore it into a new `gitlabhq_production_ci` database. Database sizes below 50 GB should be done within 30 minutes. Larger databases need more time. For example, a 100 GB database needs 1-2 hours to be copied to the new database.
- We advise to also plan some time for smaller tasks like modifying the configuration.
1. Create the new `gitlabhq_production_ci` database:
```shell
sudo gitlab-psql -c "CREATE DATABASE gitlabhq_production_ci WITH OWNER 'gitlab'"
```
#### Migration
This process includes downtime. Running the migration script will stop the GitLab instance. After the migration has been finished, the instance is restarted.
1. Create a backup of the configuration:
```shell
sudo cp /etc/gitlab/gitlab.rb /etc/gitlab/gitlab.rb.org
```
1. Edit `/etc/gitlab/gitlab.rb` and save the changes. Do **not** run the reconfigure command, the migration script will run that for you.
```ruby
gitlab_rails['env'] = { 'GITLAB_ALLOW_SEPARATE_CI_DATABASE' => 'true' }
gitlab_rails['databases']['ci']['enable'] = true
gitlab_rails['databases']['ci']['db_database'] = 'gitlabhq_production_ci'
```
1. Run the migration script:
```shell
sudo gitlab-ctl pg-decomposition-migration
```
At this point, the GitLab instance should start and be functional.
If you want to abort the procedure and you want to start GitLab without changing anything, run the following commands:
```shell
sudo cp /etc/gitlab/gitlab.rb.org /etc/gitlab/gitlab.rb
sudo gitlab-ctl reconfigure
sudo gitlab-ctl restart
```
#### Cleaning up
If everything works as expected, we can clean up unneeded data:
- Delete the CI data in Main database:
```shell
sudo gitlab-rake gitlab:db:truncate_legacy_tables:main
```
- Delete the Main data in CI database:
```shell
sudo gitlab-rake gitlab:db:truncate_legacy_tables:ci
```
## Migrate existing installations (manual procedure)
To migrate existing data from the `main` database to the `ci` database, you can
copy the database across.
NOTE:
If something unexpected happens during the migration, it is safe to start over.
### Existing self-compiled installation
1. [Disable background migrations](../../development/database/batched_background_migrations.md#enable-or-disable-background-migrations).
1. [Ensure all background migrations are finished](../../update/background_migrations.md#check-the-status-of-batched-background-migrations).
1. Stop GitLab, except for PostgreSQL:
```shell
sudo service gitlab stop
sudo service postgresql start
```
1. Dump the `main` database:
```shell
sudo -u git pg_dump -f gitlabhq_production.sql gitlabhq_production
```
1. Create the `ci` database, and copy the data from the previous dump:
```shell
sudo -u postgres psql -d template1 -c "CREATE DATABASE gitlabhq_production_ci OWNER git;"
sudo -u git psql -f gitlabhq_production.sql gitlabhq_production_ci
```
1. Configure GitLab to [use multiple databases](#set-up-multiple-databases).
### Existing Linux package installations
1. [Disable background migrations](../../development/database/batched_background_migrations.md#enable-or-disable-background-migrations)
1. [Ensure all background migrations are finished](../../update/background_migrations.md#check-the-status-of-batched-background-migrations)
1. Stop GitLab, except for PostgreSQL:
```shell
sudo gitlab-ctl stop
sudo gitlab-ctl start postgresql
```
1. Dump the `main` database:
```shell
sudo -u gitlab-psql /opt/gitlab/embedded/bin/pg_dump -h /var/opt/gitlab/postgresql -f gitlabhq_production.sql gitlabhq_production
```
1. Create the `ci` database, and copy the data from the previous dump:
```shell
sudo -u gitlab-psql /opt/gitlab/embedded/bin/psql -h /var/opt/gitlab/postgresql -d template1 -c "CREATE DATABASE gitlabhq_production_ci OWNER gitlab;"
sudo -u gitlab-psql /opt/gitlab/embedded/bin/psql -h /var/opt/gitlab/postgresql -f gitlabhq_production.sql gitlabhq_production_ci
```
1. Configure GitLab to [use multiple databases](#set-up-multiple-databases).
### Existing Linux package installations using streaming replication
To reduce downtime, you can set up streaming replication to migrate existing data from the `main` database to the `ci` database.
This procedure results in two database clusters.
This procedure can be both time- and resource-consuming.
Consider their trade-offs with availability before executing it.
To set up streaming replication for creating two database clusters:
1. Set up streaming replication from the GitLab database to new database instance.
1. When the new replica has caught up, [disable background migrations](../../development/database/batched_background_migrations.md#enable-or-disable-background-migrations).
1. [Ensure all background migrations are finished](../../update/background_migrations.md#check-the-status-of-batched-background-migrations).
1. Stop GitLab, except for PostgreSQL:
```shell
sudo gitlab-ctl stop
sudo gitlab-ctl start postgresql
```
1. After the replication is complete, stop the streaming replication, and promote the replica to a primary instance.
You now have two database clusters, one for `main`, and one for `ci`.
1. Configure GitLab to [use multiple databases](#set-up-multiple-databases).
For more information on how to set up Streaming Replication,
see [PostgreSQL replication and failover for Linux package installations](replication_and_failover.md).
## Set up multiple databases
To configure GitLab to use multiple application databases, follow the instructions below for your installation type.
WARNING:
You must stop GitLab before setting up multiple databases. This prevents
split-brain situations, where `main` data is written to the `ci` database, and
the other way around.
### Self-compiled installations
1. For existing installations,
[migrate the data](#migrate-existing-installations-manual-procedure) first.
1. [Back up GitLab](../../administration/backup_restore/index.md)
in case of unforeseen issues.
1. Stop GitLab:
```shell
sudo service gitlab stop
```
1. Open `config/database.yml`, and add a `ci:` section under
`production:`. See `config/database.yml.decomposed-postgresql` for possible
values for this new `ci:` section. Once modified, the `config/database.yml` should
look like:
```yaml
production:
main:
# ...
ci:
adapter: postgresql
encoding: unicode
database: gitlabhq_production_ci
# ...
```
1. Save the `config/database.yml` file.
1. Update the service files to set the `GITLAB_ALLOW_SEPARATE_CI_DATABASE`
environment variable to `true`.
1. For new installations only. Create the `gitlabhq_production_ci` database:
```shell
sudo -u postgres psql -d template1 -c "CREATE DATABASE gitlabhq_production OWNER git;"
sudo -u git -H bundle exec rake db:schema:load:ci
```
1. Lock writes for `ci` tables in `main` database, and the other way around:
```shell
sudo -u git -H bundle exec rake gitlab:db:lock_writes
```
1. Restart GitLab:
```shell
sudo service gitlab restart
```
1. [Enable background migrations](../../development/database/batched_background_migrations.md#enable-or-disable-background-migrations)
### Linux package installations
1. For existing installations,
[migrate the data](#migrate-existing-installations-manual-procedure) first.
1. [Back up GitLab](../../administration/backup_restore/index.md)
in case of unforeseen issues.
1. Stop GitLab:
```shell
sudo gitlab-ctl stop
```
1. Edit `/etc/gitlab/gitlab.rb` and add the following lines:
```ruby
gitlab_rails['env'] = { 'GITLAB_ALLOW_SEPARATE_CI_DATABASE' => 'true' }
gitlab_rails['databases']['ci']['enable'] = true
gitlab_rails['databases']['ci']['db_database'] = 'gitlabhq_production_ci'
```
1. Save the `/etc/gitlab/gitlab.rb` file.
1. Reconfigure GitLab:
```shell
sudo gitlab-ctl reconfigure
```
1. Optional, for new installations only. Reconfiguring GitLab should create the
`gitlabhq_production_ci` database if it does not exist. If the database is not created automatically, create it manually:
```shell
sudo gitlab-ctl start postgresql
sudo -u gitlab-psql /opt/gitlab/embedded/bin/psql -h /var/opt/gitlab/postgresql -d template1 -c "CREATE DATABASE gitlabhq_production_ci OWNER gitlab;"
sudo gitlab-rake db:schema:load:ci
```
1. Lock writes for `ci` tables in `main` database, and the other way around:
```shell
sudo gitlab-ctl start postgresql
sudo gitlab-rake gitlab:db:lock_writes
```
1. Restart GitLab:
```shell
sudo gitlab-ctl restart
```
1. [Enable background migrations](../../development/database/batched_background_migrations.md#enable-or-disable-background-migrations)
## Further information
For more information on multiple databases, see [issue 6168](https://gitlab.com/groups/gitlab-org/-/epics/6168).
For more information on how multiple databases work in GitLab, see the [development guide for multiple databases](../../development/database/multiple_databases.md).
Since 2022-07-02, GitLab.com has been running with two separate databases. For more information, see this [blog post](https://about.gitlab.com/blog/2022/06/02/splitting-database-into-main-and-ci/).
|