File: checkpoint.md

package info (click to toggle)
docker.io 20.10.24%2Bdfsg1-1%2Bdeb12u1
  • links: PTS, VCS
  • area: main
  • in suites: bookworm, bookworm-proposed-updates
  • size: 60,824 kB
  • sloc: sh: 5,621; makefile: 593; ansic: 179; python: 162; asm: 7
file content (102 lines) | stat: -rw-r--r-- 3,931 bytes parent folder | download | duplicates (4)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
---
title: docker checkpoint
description: "The checkpoint command description and usage"
keywords: experimental, checkpoint, restore, criu
experimental: true
---

## Description

Checkpoint and Restore is an experimental feature that allows you to freeze a running
container by checkpointing it, which turns its state into a collection of files
on disk. Later, the container can be restored from the point it was frozen.

This is accomplished using a tool called [CRIU](https://criu.org), which is an
external dependency of this feature. A good overview of the history of
checkpoint and restore in Docker is available in this
[Kubernetes blog post](https://kubernetes.io/blog/2015/07/how-did-quake-demo-from-dockercon-work/).

### Installing CRIU

If you use a Debian system, you can add the CRIU PPA and install with `apt-get`
[from the criu launchpad](https://launchpad.net/~criu/+archive/ubuntu/ppa).

Alternatively, you can [build CRIU from source](https://criu.org/Installation).

You need at least version 2.0 of CRIU to run checkpoint and restore in Docker.

### Use cases for checkpoint and restore

This feature is currently focused on single-host use cases for checkpoint and
restore. Here are a few:

- Restarting the host machine without stopping/starting containers
- Speeding up the start time of slow start applications
- "Rewinding" processes to an earlier point in time
- "Forensic debugging" of running processes

Another primary use case of checkpoint and restore outside of Docker is the live
migration of a server from one machine to another. This is possible with the
current implementation, but not currently a priority (and so the workflow is
not optimized for the task).

### Using checkpoint and restore

A new top level command `docker checkpoint` is introduced, with three subcommands:

- `docker checkpoint create` (creates a new checkpoint)
- `docker checkpoint ls` (lists existing checkpoints)
- `docker checkpoint rm` (deletes an existing checkpoint)

Additionally, a `--checkpoint` flag is added to the `docker container start` command.

The options for `docker checkpoint create`:

```console
Usage:  docker checkpoint create [OPTIONS] CONTAINER CHECKPOINT

Create a checkpoint from a running container

  --leave-running=false    Leave the container running after checkpoint
  --checkpoint-dir         Use a custom checkpoint storage directory
```

And to restore a container:

```console
Usage:  docker start --checkpoint CHECKPOINT_ID [OTHER OPTIONS] CONTAINER
```

Example of using checkpoint and restore on a container:

```console
$ docker run --security-opt=seccomp:unconfined --name cr -d busybox /bin/sh -c 'i=0; while true; do echo $i; i=$(expr $i + 1); sleep 1; done'
abc0123

$ docker checkpoint create cr checkpoint1

# <later>
$ docker start --checkpoint checkpoint1 cr
abc0123
```

This process just logs an incrementing counter to stdout. If you run `docker logs`
in between running/checkpoint/restoring you should see that the counter
increases while the process is running, stops while it's checkpointed, and
resumes from the point it left off once you restore.

### Known limitations

seccomp is only supported by CRIU in very up to date kernels.

External terminal (i.e. `docker run -t ..`) is not supported at the moment.
If you try to create a checkpoint for a container with an external terminal,
it would fail:

```console
$ docker checkpoint create cr checkpoint1
Error response from daemon: Cannot checkpoint container c1: rpc error: code = 2 desc = exit status 1: "criu failed: type NOTIFY errno 0\nlog file: /var/lib/docker/containers/eb62ebdbf237ce1a8736d2ae3c7d88601fc0a50235b0ba767b559a1f3c5a600b/checkpoints/checkpoint1/criu.work/dump.log\n"

$ cat /var/lib/docker/containers/eb62ebdbf237ce1a8736d2ae3c7d88601fc0a50235b0ba767b559a1f3c5a600b/checkpoints/checkpoint1/criu.work/dump.log
Error (mount.c:740): mnt: 126:./dev/console doesn't have a proper root mount
```