---
stage: Create
group: Source Code
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://handbook.gitlab.com/handbook/product/ux/technical-writing/#assignments
description: "Configure external storage for merge request diffs on your GitLab instance."
---

# Merge request diffs storage

DETAILS:
**Tier:** Free, Premium, Ultimate
**Offering:** Self-managed

Merge request diffs are size-limited copies of diffs associated with merge
requests. When viewing a merge request, diffs are sourced from these copies
wherever possible as a performance optimization.

By default, merge request diffs are stored in the database, in a table named
`merge_request_diff_files`. Larger installations may find this table grows too
large, in which case, switching to external storage is recommended.

Merge request diffs can be stored [on disk](#using-external-storage) or in
[object storage](#using-object-storage). In general, it
is better to store the diffs in the database than on disk. A compromise is available
that [stores only outdated diffs](#alternative-in-database-storage) outside the database.

## Using external storage

::Tabs

:::TabTitle Linux package (Omnibus)

1. Edit `/etc/gitlab/gitlab.rb` and add the following line:

   ```ruby
   gitlab_rails['external_diffs_enabled'] = true
   ```

1. The external diffs are stored in
   `/var/opt/gitlab/gitlab-rails/shared/external-diffs`. To change the path,
   for example, to `/mnt/storage/external-diffs`, edit `/etc/gitlab/gitlab.rb`
   and add the following line:

   ```ruby
   gitlab_rails['external_diffs_storage_path'] = "/mnt/storage/external-diffs"
   ```

1. Save the file and [reconfigure GitLab](restart_gitlab.md#reconfigure-a-linux-package-installation) for the changes to take effect.
   GitLab then migrates your existing merge request diffs to external storage.

:::TabTitle Self-compiled (source)

1. Edit `/home/git/gitlab/config/gitlab.yml` and add or amend the following
   lines:

   ```yaml
   external_diffs:
     enabled: true
   ```

1. The external diffs are stored in
   `/home/git/gitlab/shared/external-diffs`. To change the path, for example,
   to `/mnt/storage/external-diffs`, edit `/home/git/gitlab/config/gitlab.yml`
   and add or amend the following lines:

   ```yaml
   external_diffs:
     enabled: true
     storage_path: /mnt/storage/external-diffs
   ```

1. Save the file and [restart GitLab](restart_gitlab.md#self-compiled-installations) for the changes to take effect.
   GitLab then migrates your existing merge request diffs to external storage.

::EndTabs

## Using object storage

WARNING:
Migrating to object storage is not reversible.

Instead of storing the external diffs on disk, we recommend using an object
store such as AWS S3. This configuration requires valid AWS credentials to
be configured already.

::Tabs

:::TabTitle Linux package (Omnibus)

1. Edit `/etc/gitlab/gitlab.rb` and add the following line:

   ```ruby
   gitlab_rails['external_diffs_enabled'] = true
   ```

1. Set [object storage settings](#object-storage-settings).
1. Save the file and [reconfigure GitLab](restart_gitlab.md#reconfigure-a-linux-package-installation) for the changes to take effect.
   GitLab then migrates your existing merge request diffs to external storage.

:::TabTitle Self-compiled (source)

1. Edit `/home/git/gitlab/config/gitlab.yml` and add or amend the following
   lines:

   ```yaml
   external_diffs:
     enabled: true
   ```

1. Set [object storage settings](#object-storage-settings).
1. Save the file and [restart GitLab](restart_gitlab.md#self-compiled-installations) for the changes to take effect.
   GitLab then migrates your existing merge request diffs to external storage.

::EndTabs

[Read more about using object storage with GitLab](object_storage.md).

### Object storage settings

You should use the
[consolidated object storage settings](object_storage.md#configure-a-single-storage-connection-for-all-object-types-consolidated-form).
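
As a minimal sketch of what the consolidated form can look like for a Linux
package installation, the following `/etc/gitlab/gitlab.rb` fragment enables
object storage and assigns a bucket to external diffs. The region, the
placeholder credentials, and the bucket name `gitlab-mr-diffs` are assumptions
for illustration only. Check the linked object storage page for the settings
your provider needs.

```ruby
# Illustrative values only: replace the region, credentials, and bucket name.
gitlab_rails['external_diffs_enabled'] = true

gitlab_rails['object_store']['enabled'] = true
gitlab_rails['object_store']['connection'] = {
  'provider' => 'AWS',
  'region' => 'eu-central-1',
  'aws_access_key_id' => '<AWS_ACCESS_KEY_ID>',
  'aws_secret_access_key' => '<AWS_SECRET_ACCESS_KEY>'
}

# Hypothetical bucket name for merge request diffs.
gitlab_rails['object_store']['objects']['external_diffs']['bucket'] = 'gitlab-mr-diffs'
```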

## Alternative in-database storage

Enabling external diffs may reduce the performance of merge requests, because
the diffs must be retrieved in a separate operation from other data. As a
compromise, you can store only outdated diffs externally and keep current diffs
in the database.

To enable this feature, perform the following steps:

::Tabs

:::TabTitle Linux package (Omnibus)

1. Edit `/etc/gitlab/gitlab.rb` and add the following line:

   ```ruby
   gitlab_rails['external_diffs_when'] = 'outdated'
   ```

1. Save the file and [reconfigure GitLab](restart_gitlab.md#reconfigure-a-linux-package-installation) for the changes to take effect.

:::TabTitle Self-compiled (source)

1. Edit `/home/git/gitlab/config/gitlab.yml` and add or amend the following
   lines:

   ```yaml
   external_diffs:
     enabled: true
     when: outdated
   ```

1. Save the file and [restart GitLab](restart_gitlab.md#self-compiled-installations) for the changes to take effect.

::EndTabs

With this feature enabled, diffs are initially stored in the database, rather
than externally. They are moved to external storage after any of these
conditions become true:

- A newer version of the merge request diff exists
- The merge request was merged more than seven days ago
- The merge request was closed more than seven days ago

These rules strike a balance between space and performance by only storing
frequently-accessed diffs in the database. Diffs that are less likely to be
accessed are moved to external storage instead.
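
The three conditions above can be read as a single predicate. The following
Ruby sketch is illustrative only and is not GitLab's implementation: the
`DiffState` struct, its fields, and `move_to_external_storage?` are
hypothetical names chosen for this example.

```ruby
SEVEN_DAYS = 7 * 24 * 60 * 60 # seven days, in seconds

# Hypothetical record describing one stored merge request diff.
# latest:    true if no newer version of the diff exists
# merged_at: Time the merge request was merged, or nil
# closed_at: Time the merge request was closed, or nil
DiffState = Struct.new(:latest, :merged_at, :closed_at, keyword_init: true)

# Returns true when any of the three documented conditions holds.
def move_to_external_storage?(diff, now: Time.now)
  cutoff = now - SEVEN_DAYS

  return true unless diff.latest                           # a newer version exists
  return true if diff.merged_at && diff.merged_at < cutoff # merged over seven days ago
  return true if diff.closed_at && diff.closed_at < cutoff # closed over seven days ago

  false
end
```

Under these assumptions, the current diff of an open merge request stays in the
database, while any superseded diff version qualifies for external storage
immediately.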

## Switching from external storage to object storage

Automatic migration moves diffs stored in the database, but it does not move diffs between storage types.
To switch from external storage to object storage:

1. Move files stored on local or NFS storage to object storage manually.
1. Run this Rake task to change their location in the database.

   For Linux package installations:

   ```shell
   sudo gitlab-rake gitlab:external_diffs:force_object_storage
   ```

   For self-compiled installations:

   ```shell
   sudo -u git -H bundle exec rake gitlab:external_diffs:force_object_storage RAILS_ENV=production
   ```

   By default, `sudo` does not preserve existing environment variables. You should
   append them, rather than prefix them, like this:

   ```shell
   sudo gitlab-rake gitlab:external_diffs:force_object_storage START_ID=59946109 END_ID=59946109 UPDATE_DELAY=5
   ```

These environment variables modify the behavior of the Rake task:

| Name           | Default value | Purpose |
|----------------|---------------|---------|
| `ANSI`         | `true`        | Use ANSI escape codes to make output more understandable. |
| `BATCH_SIZE`   | `1000`        | Iterate through the table in batches of this size. |
| `START_ID`     | `nil`         | If set, begin scanning at this ID. |
| `END_ID`       | `nil`         | If set, stop scanning at this ID. |
| `UPDATE_DELAY` | `1`           | Number of seconds to sleep between updates. |

- `START_ID` and `END_ID` can be used to run the update in parallel,
  by assigning different processes to different parts of the table.
- `BATCH_SIZE` and `UPDATE_DELAY` enable the speed of the migration to be traded off
  against concurrent access to the table.
- `ANSI` should be set to `false` if your terminal does not support ANSI escape codes.
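
To illustrate the parallel approach, this Ruby sketch splits an ID range into
contiguous, non-overlapping slices and prints one Rake command per slice, each
to be run in its own process. The helpers `id_slices` and `rake_commands` are
hypothetical names for this example, the ID range `1..1_000_000` is a
placeholder, and the sketch assumes that `END_ID` is inclusive.

```ruby
# Split the inclusive range first_id..last_id into at most `workers` slices.
def id_slices(first_id, last_id, workers)
  total = last_id - first_id + 1
  raise ArgumentError, 'empty ID range' if total <= 0
  raise ArgumentError, 'workers must be positive' if workers <= 0

  size = (total + workers - 1) / workers # ceiling division
  (first_id..last_id).step(size).map do |start_id|
    [start_id, [start_id + size - 1, last_id].min]
  end
end

# Build one command line per slice, appending the variables after the task name.
def rake_commands(first_id, last_id, workers, update_delay: 1)
  id_slices(first_id, last_id, workers).map do |start_id, end_id|
    'sudo gitlab-rake gitlab:external_diffs:force_object_storage ' \
      "START_ID=#{start_id} END_ID=#{end_id} UPDATE_DELAY=#{update_delay}"
  end
end

# Placeholder range: substitute the lowest and highest IDs you need to update.
puts rake_commands(1, 1_000_000, 4)
```

More workers shorten the total run time but increase concurrent load on the
table, so you might raise `UPDATE_DELAY` when running several processes at once.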