Migration with virtio-fs
========================

virtiofsd supports migration through [vhost-user’s device state
interface](https://qemu-project.gitlab.io/qemu/interop/vhost-user.html#migrating-back-end-state),
allowing it to place internal state into the vhost-user front-end’s (e.g.
QEMU’s) migration stream.  This allows it to transfer information about files
and directories the guest has open to the destination instance.  Note, however,
that virtiofsd never migrates any file data: source and destination are expected
to export shared directories with matching contents (e.g. by using the same
directory on the same filesystem).

If you do not care about any of the details, feel free to skip ahead to the
[section explaining recommended configurations](#recommended-configurations).

Filesystem State Requirements
-----------------------------

As just mentioned, virtiofsd does not migrate any filesystem data, and provides
no facilities to do so.  The user is responsible for ensuring that the shared
directories used by the source and destination instances of virtiofsd have the
same content.  Specifically, the contents must match for the whole switch-over
phase, from the moment execution is stopped on the source until it is resumed on
the destination.

One way to achieve this is to use the same directory on the same filesystem for
both instances, e.g. by using a shared network filesystem.  If that is not
possible, the contents of the shared directory must be copied (outside of QEMU)
from the source to the destination, at the latest during the switch-over phase.
This is most practical for a read-only use case, where copying can take place
long in advance of the actual migration.
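
Since virtiofsd offers no way to check that two shared directories match, such a check has to happen outside of it.  The following is a minimal sketch of one possible approach, not something virtiofsd provides: the hypothetical helper `tree_digest()` hashes relative paths, object types, and file contents, so two directories with equal digests have matching names and data.  It deliberately ignores metadata such as ownership, modes, and extended attributes.

```python
import hashlib
import os


def tree_digest(root):
    """Return a SHA-256 digest over relative paths, object types, and file data."""
    digest = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # sorting in place makes the traversal order deterministic
        for name in sorted(dirnames + filenames):
            full = os.path.join(dirpath, name)
            digest.update(os.fsencode(os.path.relpath(full, root)) + b"\0")
            if os.path.islink(full):
                digest.update(b"L" + os.fsencode(os.readlink(full)) + b"\0")
            elif os.path.isdir(full):
                digest.update(b"D")
            elif os.path.isfile(full):
                digest.update(b"F")
                with open(full, "rb") as f:
                    for chunk in iter(lambda: f.read(1 << 20), b""):
                        digest.update(chunk)
            else:
                digest.update(b"O")  # FIFOs, sockets, device nodes: type only
    return digest.hexdigest()
```

Comparing `tree_digest(source_dir)` with `tree_digest(destination_dir)` is only meaningful while neither directory is being modified, which is exactly the condition required during switch-over.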

Snapshots
---------

Because virtiofsd embeds its state into the front-end’s migration stream, it is
possible to store this stream somewhere to restore it later, i.e. in a
snapshot.  From a technical perspective, this is perfectly fine, but it must
again be stressed that virtiofsd’s state includes absolutely no data;
therefore, some mechanism outside of virtio-fs/virtiofsd must be used to ensure
that when restoring such a snapshot, the shared directory is in exactly the
same state as it was when the snapshot was taken.

What Needs to Be Migrated Anyway?
---------------------------------

For every file or directory that is open in the guest, virtiofsd has a
corresponding file descriptor (FD) open in the shared directory.  The
destination instance must restore these FDs, so the source instance must provide
instructions on how to do so.

The same applies to files and directories the guest does not really have open,
but still has their directory entries cached; through FUSE, the guest kernel can
reference all such cached entries by associated integer IDs.  Therefore,
virtiofsd needs to have an internal map that can convert each ID into
something that strongly references its associated filesystem object;
specifically, either an `O_PATH` FD or a file handle, depending on the
`--inode-file-handles` setting.  These too need to be transferred in some manner
to the destination.
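
To make the idea of such a map concrete, here is a small sketch in Python.  It is an illustration only and does not mirror virtiofsd’s actual data structures; the class `InodeTable` and its methods are hypothetical names.  It covers the `O_PATH` variant: each ID maps to an `O_PATH` FD, which keeps referencing the same filesystem object even when its path changes.

```python
import os
import tempfile


class InodeTable:
    """Hypothetical map from guest-visible integer IDs to O_PATH FDs."""

    def __init__(self):
        self._next_id = 1
        self._fds = {}

    def insert(self, path):
        # O_PATH references the object without opening it for reading or writing.
        fd = os.open(path, os.O_PATH | os.O_NOFOLLOW | os.O_CLOEXEC)
        node_id = self._next_id
        self._next_id += 1
        self._fds[node_id] = fd
        return node_id

    def identity(self, node_id):
        """Return the (device, inode) pair of the referenced object."""
        st = os.fstat(self._fds[node_id])
        return (st.st_dev, st.st_ino)

    def forget(self, node_id):
        os.close(self._fds.pop(node_id))


with tempfile.TemporaryDirectory() as shared:
    old = os.path.join(shared, "a.txt")
    open(old, "w").close()

    table = InodeTable()
    node = table.insert(old)
    before = table.identity(node)

    # The FD follows the object, not the name it was opened under.
    os.rename(old, os.path.join(shared, "b.txt"))
    after = table.identity(node)
    table.forget(node)
```

FDs are only meaningful inside one process, which is why the destination cannot simply receive them and needs instructions for re-opening each object instead.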

Migration Modes
---------------

There are two general ways virtiofsd’s internal state can be serialized and
migrated, [by path](#by-path---migration-modefind-paths) or [as file
handles](#as-file-handles---migration-modefile-handles).

### By Path (`--migration-mode=find-paths`)

For every filesystem object that must be transferred to the destination,
virtiofsd tries to find its path inside of the shared directory, and transmits
that to the destination, which then opens it.

Because paths can change, this mode can be quite brittle.  virtiofsd begins
collecting paths once migration starts (long before the switch-over phase), so
any changes to those paths afterwards can lead to various problems, especially
if those changes are done by third parties outside of the VM guest.

Some examples for such changes are:

#### Unlinking

Files can exist without paths, specifically when they’re opened but unlinked.
Consequently, such files (that may be open in the guest) cannot be migrated
using paths.  If migration proceeds anyway, the file contents are lost once the
source instance exits.
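
The situation is easy to reproduce outside of virtiofsd.  The sketch below (plain Python, with made-up file names) opens a file, unlinks it, and shows that the data remains reachable through the FD although no path leads to it anymore.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as shared:
    path = os.path.join(shared, "scratch.dat")
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o600)
    os.write(fd, b"still here")

    os.unlink(path)

    path_exists = os.path.exists(path)    # False: the name is gone
    link_count = os.fstat(fd).st_nlink    # 0 on typical local filesystems
    os.lseek(fd, 0, os.SEEK_SET)
    data = os.read(fd, 64)                # b"still here": the data is not gone

    # Closing the last FD is what finally releases the file contents.
    os.close(fd)
```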

Note that for files for which virtiofsd cannot find a path, migration will
produce an error.  The error response behavior is controlled via the destination
instance’s `--migration-on-error` switch; `abort` will abort migration (on the
destination) when any error occurs, allowing execution to be resumed on the
source side, with any FD still open.  `guest-error` will continue migration,
marking any file that could not be migrated as faulty, and returning errors for
any guest access to it.

#### Renaming / Moving

When files or directories are renamed or moved by the migrating guest, virtiofsd
is naturally aware of this, and so can update the paths it holds internally.

This is not the case when paths are changed outside of virtiofsd, by third
parties.  In this case, virtiofsd will remain unaware and will send the outdated
path to the destination, which will not be able to resolve it (error behavior is
then controlled by the `--migration-on-error` switch, as described in the
[Unlinking](#unlinking) section).

In contrast to the *unlinking* case, it would at least theoretically be possible
to migrate these files using their new paths, if virtiofsd could somehow be
notified of the rename/move.  The `--migration-confirm-paths` option makes
virtiofsd double-check each collected path at switch-over time, and so may be
able to detect such moves and renames in many cases.  However, the check runs on
the source side, so a non-empty TOCTTOU (time-of-check to time-of-use) window
remains.

#### Replacing

In the *renaming / moving* case, the worst thing that can happen is that a file
the guest has open is no longer accessible after migration.  A much worse case
is when a file is replaced without it being noticed: In this case, the
destination will open the other file, but present it as the old one to the
guest, with no error indication at all.  That can lead to data corruption.
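
The following sketch reproduces such a silent replacement with plain Python, outside of virtiofsd; all file names are invented for illustration.  A third party replaces a file via `rename()` while another FD is still open on the original.  Re-opening the previously collected path succeeds, but the `(st_dev, st_ino)` pair shows that it is a different file.

```python
import os
import tempfile


def identity(fd):
    """(device, inode) pair of the object behind `fd`."""
    st = os.fstat(fd)
    return (st.st_dev, st.st_ino)


with tempfile.TemporaryDirectory() as shared:
    path = os.path.join(shared, "config.json")
    with open(path, "w") as f:
        f.write("original")

    # The "guest" holds this file open; its path was collected at this point.
    guest_fd = os.open(path, os.O_RDONLY)

    # A third party atomically replaces the file.
    tmp = os.path.join(shared, "config.json.new")
    with open(tmp, "w") as f:
        f.write("replacement")
    os.rename(tmp, path)

    # Opening the collected path raises no error, but yields the other file.
    reopened_fd = os.open(path, os.O_RDONLY)
    same_file = identity(guest_fd) == identity(reopened_fd)   # False
    old_content = os.read(guest_fd, 64)                       # b"original"
    new_content = os.read(reopened_fd, 64)                    # b"replacement"
    os.close(guest_fd)
    os.close(reopened_fd)
```

The comparison works here because both FDs live in the same process on the same filesystem.  A migration destination does not have the source’s FD to compare against, which is why an explicit check is needed.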

The migration destination cannot detect this case without performing specific
checks, because opening the path it has received from the source will succeed
(but yield the wrong file).  Such checks are:

* `--migration-verify-handles`: With this switch, source and destination
  generate a file handle for each transferred path.  A file handle is a piece
  of data that uniquely identifies a filesystem object (like a file or
  directory), and becomes invalid (“stale”) when that object is deleted; so we
  can use it to verify a file’s identity between source and destination.
  However, it only works when source and destination use the same shared
  directory on the same filesystem (e.g. a network filesystem).  Furthermore,
  any mismatches that are detected cannot be recovered from (i.e. we still don’t
  know the involved files’ true paths, so `--migration-on-error` will decide how
  to proceed).
* `--migration-confirm-paths`: This switch makes the source instance
  double-check all paths during switch-over, i.e. when both the source and
  destination instance are stopped.  While this can theoretically allow error
  recovery (by fetching an updated path from */proc/self/fd*), and does not
  require source and destination to use the same filesystem, it still leaves a
  small TOCTTOU window open (between checking and the destination instance
  opening the paths), and it requires doing potentially quite a bit of I/O
  (checking paths) during migration downtime, which is generally not desirable.

Both switches can also be used together, but they can only be used in
*find-paths* migration mode, not *file-handles* (because they simply are not
necessary in *file-handles* mode).  Check the [dedicated section for more
information on recommended configurations](#recommended-configurations).
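
As a rough illustration of the idea behind confirming paths, consider the sketch below.  It is not virtiofsd’s implementation: the function `confirm_path()` is hypothetical, and it compares `(st_dev, st_ino)` pairs as a simple stand-in for a proper identity check.  If the collected path no longer resolves to the object behind the FD, it tries the path reported in */proc/self/fd* and verifies that one as well.

```python
import os


def confirm_path(fd, collected_path):
    """Return a path that currently resolves to the object behind `fd`, or None."""
    want = os.fstat(fd)

    def matches(path):
        try:
            st = os.stat(path, follow_symlinks=False)
        except OSError:
            return False
        return (st.st_dev, st.st_ino) == (want.st_dev, want.st_ino)

    if matches(collected_path):
        return collected_path

    # The collected path is stale: ask the kernel for the current one.
    candidate = os.readlink(f"/proc/self/fd/{fd}")
    return candidate if matches(candidate) else None
```

Even when this returns a path, the result can become outdated again before the destination opens it, which is the TOCTTOU window mentioned above.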

#### Implementation Detail: Collecting Paths

There are two ways paths can be collected, either by [looking up FDs in
*/proc/self/fd*](#querying-procselffd), or by [recursing through the shared
directory](#recursing-through-shared-directory).  virtiofsd implements both of
these, but only uses the latter as a fall-back for when the former fails.

##### Querying /proc/self/fd

*/proc/self/fd* contains a symbolic link for each file descriptor opened by the
current process.  These aren’t really symbolic links, though: Opening them does
not resolve their link target, but directly opens (basically duplicates) the
corresponding file descriptor.

Still, these links can have valid targets: The kernel tries to keep track
internally what paths the underlying filesystem objects have, and provides this
information there.  Querying this is thus a much faster way to get a path for
our file descriptors than to recurse through the shared directory.

The downside is that there is no formal guarantee that this works.  It is
unclear under what circumstances this can break down; if it does, virtiofsd will
fall back to [recursing through the shared
directory](#recursing-through-shared-directory).

For what it’s worth, the only case we have seen where a file has a valid path
but */proc/self/fd* cannot provide it is when the file is opened through its
file handle while it has not yet been opened through its path.  For example:

1. Open file using path
2. Generate and store file handle
3. Unmount file system, then mount it again
4. Open stored file handle

Something like this can happen with virtiofsd only on the migration destination
instance after a *file-handles* migration; in other cases, virtiofsd will
generally open files by path first, giving the kernel a chance to make a note of
that path.
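
Querying */proc/self/fd* can be tried out with a few lines of Python.  The sketch below uses a hypothetical helper `path_of_fd()` and invented file names; it shows that the reported path follows a rename, and that on Linux an unlinked file is reported with a ` (deleted)` suffix instead of a usable path.

```python
import os
import tempfile


def path_of_fd(fd):
    """Return the path the kernel currently reports for an open FD."""
    return os.readlink(f"/proc/self/fd/{fd}")


with tempfile.TemporaryDirectory() as tmp:
    shared = os.path.realpath(tmp)  # the kernel reports symlink-free paths
    old_path = os.path.join(shared, "a.txt")
    new_path = os.path.join(shared, "b.txt")

    fd = os.open(old_path, os.O_RDWR | os.O_CREAT, 0o600)
    before_rename = path_of_fd(fd)   # equals old_path

    os.rename(old_path, new_path)
    after_rename = path_of_fd(fd)    # equals new_path: the kernel tracked the move

    os.unlink(new_path)
    after_unlink = path_of_fd(fd)    # no valid path left; ends with " (deleted)"
    os.close(fd)
```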

##### Recursing Through Shared Directory

We can also obtain files’ paths by recursing through the shared directory,
enumerating all paths therein, and associating them with the respective files
and directories.  Naturally, this is quite slow, especially the more files there
are in the shared directory, which is why virtiofsd will only fall back to this
implementation if it fails to query a path from */proc/self/fd*.
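
A simplified version of such a recursive search might look like the sketch below.  It is an assumption-laden illustration rather than virtiofsd’s code: the function `find_paths()` is hypothetical, objects are identified by their `(st_dev, st_ino)` pair, and files that disappear during the walk are not handled.

```python
import os


def find_paths(shared_dir, wanted):
    """Map each (st_dev, st_ino) pair in `wanted` to one path below `shared_dir`."""
    found = {}
    for dirpath, dirnames, filenames in os.walk(shared_dir):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)  # one extra system call per directory entry
            key = (st.st_dev, st.st_ino)
            if key in wanted and key not in found:
                found[key] = os.path.relpath(full, shared_dir)
        if len(found) == len(wanted):
            break  # stop early once every object has a path
    return found
```

In the worst case (an object without any path, such as an unlinked file), the walk visits every entry in the shared directory before giving up, which illustrates why this approach is only a fall-back.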

### As File Handles (`--migration-mode=file-handles`)

Every filesystem object that must be transferred to the destination is converted
to a file handle (a piece of data that uniquely identifies this object on a
given filesystem, and can be used to open it), which is sent to the destination.
Because there is a unique and permanent relationship between such an object and
its file handle, this migration mode is not susceptible to the problems “by
path” migration has.  For example, a file handle stays valid even when a file has
a link count of 0 (i.e. is deleted and no longer has a path), as long as some
process still has it open (i.e. holds an FD).

However, because file handles are just some data that allows access to
everything on a filesystem without checking e.g. access rights along a file’s
path, opening them requires the *DAC_READ_SEARCH* capability, which grants the
ability to read any file, regardless of its access mode.  Generally, this
capability is only available to applications running as root.

Furthermore, because file handles are specific to a given filesystem instance,
when using them for virtio-fs migration, the source and destination instance
must use the same shared directory on the same filesystem, e.g. a shared network
filesystem.
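
Whether a process holds the capability mentioned above can be checked before choosing this mode.  The sketch below is one way to do so and makes some assumptions: it relies on the Linux capability number 2 for `CAP_DAC_READ_SEARCH` and on the `CapEff` line of */proc/self/status*; the helper names are hypothetical.  It only inspects the effective capability set and says nothing about other restrictions (e.g. user namespaces or seccomp filters).

```python
CAP_DAC_READ_SEARCH = 2  # capability number as defined in <linux/capability.h>


def effective_caps(status_text):
    """Parse the CapEff bitmask from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return int(line.split()[1], 16)
    raise ValueError("no CapEff line found")


def has_capability(mask, cap):
    """Check whether bit number `cap` is set in the capability bitmask."""
    return bool(mask & (1 << cap))


def can_open_file_handles():
    """True if the current process has CAP_DAC_READ_SEARCH in its effective set."""
    with open("/proc/self/status") as f:
        return has_capability(effective_caps(f.read()), CAP_DAC_READ_SEARCH)
```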

Recommended Configurations
--------------------------

### General

Consider which **`--migration-on-error`** mode suits your needs:

* `abort`: When any error is encountered (e.g. destination cannot find a file
  that is open in the guest), abort migration altogether.  You can then
  generally resume execution on the source; the source virtiofsd instance will
  retain all open file descriptors until it exits.
* `guest-error`: When encountering errors pertaining to a specific file or
  directory, do not abort migration, but instead mark that file or directory as
  invalid.  Any guest accesses to it will then result in guest-visible errors.

### Shared Filesystems

When source and destination instance use the same shared directory on the same
filesystem, using **`--migration-mode=file-handles`** is recommended.  This
requires the destination instance to have the `DAC_READ_SEARCH` capability.

If that capability cannot be provided, we recommend using
**`--migration-mode=find-paths`** together with
**`--migration-verify-handles`**.  Using **`--migration-confirm-paths`**
additionally is optional; it can better recover from unexpected path changes
than `verify-handles` alone, but will prolong migration downtime.

### Different Filesystem

If the source and destination shared directory are not the exact same directory
on the same filesystem, users must ensure their contents are equal at migration
switch-over.  For example, read-only configuration directories presented to the
guest via virtio-fs can just be copied over to the destination ahead of
migration.

For such cases, use **`--migration-mode=find-paths`**.

We also recommend keeping the filesystem read-only, which can be enforced with
virtiofsd’s **`--readonly`** switch.  If that is not possible, *take special
care* to ensure source and destination directory contents match at the
switch-over point in time!

If, during migration, it is possible for the shared directory contents to be
modified by a party other than the migrating virtiofsd instance, we strongly
recommend using **`--migration-confirm-paths`**.  Even then, this is not
completely safe.  Above all, when source and destination instance do not use the
same shared directory on the same (shared) filesystem, we strongly advise against
allowing the shared directory to be modified at all during migration.
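
The recommendations in this section can be summarized as a small decision function.  The sketch below is one possible reading of them, not an official tool: the function `recommended_args()` and its parameters are hypothetical, and only the switches named in this document are used.  The `--migration-on-error` choice is left out because it depends on your needs rather than on the setup.

```python
def recommended_args(same_filesystem, has_dac_read_search=False,
                     third_party_writes=False, read_only=False):
    """Translate the recommendations above into virtiofsd migration switches."""
    if same_filesystem:
        if has_dac_read_search:
            return ["--migration-mode=file-handles"]
        args = ["--migration-mode=find-paths", "--migration-verify-handles"]
        if third_party_writes:
            # Optional: better recovery from path changes, longer downtime.
            args.append("--migration-confirm-paths")
        return args

    # Different filesystems: contents must be made equal by the user.
    args = ["--migration-mode=find-paths"]
    if read_only:
        args.append("--readonly")
    if third_party_writes:
        args.append("--migration-confirm-paths")
    return args
```

For example, `recommended_args(same_filesystem=False, read_only=True)` yields `["--migration-mode=find-paths", "--readonly"]`, matching the case of a read-only configuration directory that was copied to the destination ahead of migration.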