File: checkpoint.md

# checkpoint

<!---MARKER_GEN_START-->
Manage checkpoints

### Subcommands

| Name                             | Description                                  |
|:---------------------------------|:---------------------------------------------|
| [`create`](checkpoint_create.md) | Create a checkpoint from a running container |
| [`ls`](checkpoint_ls.md)         | List checkpoints for a container             |
| [`rm`](checkpoint_rm.md)         | Remove a checkpoint                          |



<!---MARKER_GEN_END-->

## Description

Checkpoint and Restore is an experimental feature that allows you to freeze a running
container by creating a checkpoint, which saves the container's state as a collection of files
on disk. Later, the container can be restored from the point at which it was frozen.

This is accomplished using a tool called [CRIU](https://criu.org), which is an
external dependency of this feature. A good overview of the history of
checkpoint and restore in Docker is available in this
[Kubernetes blog post](https://kubernetes.io/blog/2015/07/how-did-quake-demo-from-dockercon-work/).

### Installing CRIU

If you use a Debian-based system such as Ubuntu, you can add the
[CRIU PPA](https://launchpad.net/~criu/+archive/ubuntu/ppa) and install CRIU with `apt-get`.

Alternatively, you can [build CRIU from source](https://criu.org/Installation).

You need at least version 2.0 of CRIU to run checkpoint and restore in Docker.
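
The version requirement can be checked before you try a checkpoint. The following is a minimal sketch, not part of Docker or CRIU: the function names `version_at_least` and `criu_version_ok` are hypothetical, the comparison relies on GNU `sort -V`, and it assumes that `criu --version` prints a line of the form `Version: X.Y`.

```bash
#!/usr/bin/env bash
# Minimum CRIU version stated in this document.
CRIU_MIN_VERSION="2.0"

# version_at_least ACTUAL MINIMUM (hypothetical helper)
# Succeeds when ACTUAL >= MINIMUM, using GNU sort's version ordering (-V).
version_at_least() {
    local actual="$1" minimum="$2" lowest
    # The lower of the two versions sorts first. If that is MINIMUM,
    # then ACTUAL is at least MINIMUM.
    lowest="$(printf '%s\n%s\n' "$minimum" "$actual" | sort -V | head -n 1)"
    [ "$lowest" = "$minimum" ]
}

# criu_version_ok (hypothetical helper)
# Assumption: `criu --version` prints a line such as "Version: 3.17.1".
criu_version_ok() {
    local installed
    installed="$(criu --version 2>/dev/null | awk '/^Version:/ { print $2 }')"
    [ -n "$installed" ] && version_at_least "$installed" "$CRIU_MIN_VERSION"
}
```

After sourcing the file, `criu_version_ok || echo "CRIU is missing or older than 2.0"` gives a quick preflight check.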

### Use cases for checkpoint and restore

This feature is currently focused on single-host use cases for checkpoint and
restore. Here are a few:

- Restarting the host machine without stopping/starting containers
- Speeding up the start time of slow-starting applications
- "Rewinding" processes to an earlier point in time
- "Forensic debugging" of running processes

Another primary use case of checkpoint and restore outside of Docker is the live
migration of a server from one machine to another. This is possible with the
current implementation, but it is not currently a priority, so the workflow is
not optimized for the task.

### Using checkpoint and restore

A new top-level command, `docker checkpoint`, is introduced with three subcommands:

- `docker checkpoint create` (creates a new checkpoint)
- `docker checkpoint ls` (lists existing checkpoints)
- `docker checkpoint rm` (deletes an existing checkpoint)

Additionally, a `--checkpoint` flag is added to the `docker container start` command.
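
The `ls` and `rm` subcommands combine naturally in scripts. The sketch below is illustrative only: `checkpoint_exists` and `remove_checkpoint_if_present` are hypothetical helper names, and the parsing assumes that `docker checkpoint ls` prints one header line followed by one checkpoint name per line.

```bash
#!/usr/bin/env bash

# checkpoint_exists CONTAINER CHECKPOINT (hypothetical helper)
# Succeeds when CHECKPOINT is listed by `docker checkpoint ls CONTAINER`.
# Assumption: the first output line is a header and is skipped (NR > 1).
checkpoint_exists() {
    local container="$1" checkpoint="$2"
    docker checkpoint ls "$container" 2>/dev/null \
        | awk -v name="$checkpoint" 'NR > 1 && $1 == name { found = 1 } END { exit !found }'
}

# remove_checkpoint_if_present CONTAINER CHECKPOINT (hypothetical helper)
# Removes the checkpoint only when it is listed, so repeated calls do not fail.
remove_checkpoint_if_present() {
    local container="$1" checkpoint="$2"
    if checkpoint_exists "$container" "$checkpoint"; then
        docker checkpoint rm "$container" "$checkpoint"
    fi
}
```

For example, `remove_checkpoint_if_present cr checkpoint1` deletes `checkpoint1` when it exists and does nothing otherwise.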

The options for `docker checkpoint create`:

```console
Usage:  docker checkpoint create [OPTIONS] CONTAINER CHECKPOINT

Create a checkpoint from a running container

  --leave-running=false    Leave the container running after checkpoint
  --checkpoint-dir         Use a custom checkpoint storage directory
```
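
As a sketch of how the two options combine, the hypothetical wrapper below (the name `snapshot_running` is not a Docker command) checkpoints a container without stopping it and optionally stores the checkpoint in a custom directory. It uses only the flags shown above.

```bash
#!/usr/bin/env bash

# snapshot_running CONTAINER CHECKPOINT [DIR] (hypothetical helper)
# Checkpoints CONTAINER and leaves it running. When DIR is given, the
# checkpoint is stored there instead of the default location.
snapshot_running() {
    local container="$1" checkpoint="$2" dir="${3:-}"
    local args=(checkpoint create --leave-running)
    if [ -n "$dir" ]; then
        args+=(--checkpoint-dir "$dir")
    fi
    docker "${args[@]}" "$container" "$checkpoint"
}
```

Calling `snapshot_running cr checkpoint1 /tmp/checkpoints` runs `docker checkpoint create --leave-running --checkpoint-dir /tmp/checkpoints cr checkpoint1`; the directory path here is only an example.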

And to restore a container:

```console
Usage:  docker start --checkpoint CHECKPOINT_ID [OTHER OPTIONS] CONTAINER
```

Example of using checkpoint and restore on a container:

```console
$ docker run --security-opt=seccomp:unconfined --name cr -d busybox /bin/sh -c 'i=0; while true; do echo $i; i=$(expr $i + 1); sleep 1; done'
abc0123

$ docker checkpoint create cr checkpoint1

# <later>
$ docker start --checkpoint checkpoint1 cr
abc0123
```

This process just logs an incrementing counter to stdout. If you run `docker logs`
between starting, checkpointing, and restoring the container, you should see that the
counter increases while the process is running, stops while it's frozen, and
resumes from the point where it left off once you restore.
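
The behavior described above can be checked with a short script. This is a rough sketch under stated assumptions: the helper names `count_log_lines` and `verify_freeze_and_resume` are hypothetical, the container writes about one line per second as in the example, and the pause length is arbitrary.

```bash
#!/usr/bin/env bash

# count_log_lines CONTAINER (hypothetical helper)
# Prints how many lines the container has written to stdout so far.
count_log_lines() {
    docker logs "$1" 2>/dev/null | wc -l | tr -d '[:space:]'
}

# verify_freeze_and_resume CONTAINER CHECKPOINT [PAUSE_SECONDS] (hypothetical helper)
# Creates a checkpoint (which stops the container by default), confirms the
# log stops growing while frozen, restores, and confirms it grows again.
verify_freeze_and_resume() {
    local container="$1" checkpoint="$2" pause="${3:-3}"
    local at_checkpoint while_frozen after_restore

    docker checkpoint create "$container" "$checkpoint" >/dev/null || return 1
    at_checkpoint="$(count_log_lines "$container")"

    sleep "$pause"
    while_frozen="$(count_log_lines "$container")"

    docker start --checkpoint "$checkpoint" "$container" >/dev/null || return 1
    sleep "$pause"
    after_restore="$(count_log_lines "$container")"

    echo "at checkpoint: $at_checkpoint, while frozen: $while_frozen, after restore: $after_restore"
    [ "$while_frozen" -eq "$at_checkpoint" ] && [ "$after_restore" -gt "$while_frozen" ]
}
```

With the `cr` container from the example running, `verify_freeze_and_resume cr checkpoint1` prints the three line counts and succeeds only when the counter paused and then resumed.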

### Known limitations

CRIU supports `seccomp` only on very recent kernels. The example above sidesteps this
by running the container with `--security-opt=seccomp:unconfined`.

External terminals (that is, containers started with `docker run -t ...`) aren't supported.
If you try to create a checkpoint for a container with an external terminal,
it fails:

```console
$ docker checkpoint create cr checkpoint1
Error response from daemon: Cannot checkpoint container c1: rpc error: code = 2 desc = exit status 1: "criu failed: type NOTIFY errno 0\nlog file: /var/lib/docker/containers/eb62ebdbf237ce1a8736d2ae3c7d88601fc0a50235b0ba767b559a1f3c5a600b/checkpoints/checkpoint1/criu.work/dump.log\n"

$ cat /var/lib/docker/containers/eb62ebdbf237ce1a8736d2ae3c7d88601fc0a50235b0ba767b559a1f3c5a600b/checkpoints/checkpoint1/criu.work/dump.log
Error (mount.c:740): mnt: 126:./dev/console doesn't have a proper root mount
```
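
One way to avoid this failure is to check for a terminal before checkpointing. The sketch below is an assumption-laden illustration rather than a Docker feature: `has_tty`, `dump_log_path`, and `safe_checkpoint` are hypothetical helpers, the `.Config.Tty` and `.Id` fields are read with `docker inspect`, and the log path assumes the default `/var/lib/docker` layout seen in the error message above.

```bash
#!/usr/bin/env bash

# has_tty CONTAINER (hypothetical helper)
# Succeeds when the container was created with a terminal (docker run -t).
has_tty() {
    [ "$(docker inspect --format '{{.Config.Tty}}' "$1" 2>/dev/null)" = "true" ]
}

# dump_log_path CONTAINER CHECKPOINT (hypothetical helper)
# Prints where CRIU's dump.log is expected for a checkpoint stored in the
# default location. Assumption: Docker's data root is /var/lib/docker.
dump_log_path() {
    local id
    id="$(docker inspect --format '{{.Id}}' "$1")" || return 1
    printf '/var/lib/docker/containers/%s/checkpoints/%s/criu.work/dump.log\n' "$id" "$2"
}

# safe_checkpoint CONTAINER CHECKPOINT (hypothetical helper)
# Refuses to checkpoint a container that has a terminal attached.
safe_checkpoint() {
    if has_tty "$1"; then
        echo "refusing to checkpoint $1: container has a terminal attached" >&2
        return 1
    fi
    docker checkpoint create "$1" "$2"
}
```

If a checkpoint still fails, `cat "$(dump_log_path cr checkpoint1)"` shows the CRIU log for that attempt, provided the checkpoint was created in the default directory.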