File: containers-storage-zstd-chunked.md

## containers-storage 1 "August 2024"

## NAME
containers-storage-zstd-chunked - Information about zstd:chunked


## DESCRIPTION

The traditional format for container image layers is [application/vnd.oci.image.layer.v1.tar+gzip](https://github.com/opencontainers/image-spec/blob/main/layer.md#gzip-media-types).
More recently, the standard added a zstd media type, [application/vnd.oci.image.layer.v1.tar+zstd](https://github.com/opencontainers/image-spec/blob/main/layer.md#zstd-media-types),
which uses a more modern and efficient compression format.

`zstd:chunked` is a variant of the `application/vnd.oci.image.layer.v1.tar+zstd` media type that
uses zstd [skippable frames](https://github.com/facebook/zstd/blob/dev/doc/zstd_compression_format.md#skippable-frames)
to embed additional metadata, most importantly a "table of contents" that records the SHA-256 digests and offsets of individual chunks of files.
Additionally, chunks are compressed separately. Together, these allow a client to use HTTP range requests to fetch only
the content it doesn't already have.
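As a minimal sketch of the skippable-frame mechanism only, the following builds and parses a frame as described in the zstd format specification: a 4-byte little-endian magic number in the range `0x184D2A50` to `0x184D2A5F`, a 4-byte little-endian size, then the user data. The function names and the payload are hypothetical; this does not reproduce the actual `zstd:chunked` metadata layout, which is defined by the reference implementation.

```python
import struct

# Skippable-frame magic numbers span 0x184D2A50..0x184D2A5F (zstd format spec).
SKIPPABLE_MAGIC_BASE = 0x184D2A50


def make_skippable_frame(payload: bytes, variant: int = 0) -> bytes:
    """Wrap payload in a skippable frame: magic number, size, then user data."""
    if not 0 <= variant <= 0xF:
        raise ValueError("variant must be in 0..15")
    # Both header fields are 4-byte little-endian unsigned integers.
    header = struct.pack("<II", SKIPPABLE_MAGIC_BASE + variant, len(payload))
    return header + payload


def read_skippable_frame(buf: bytes, offset: int = 0):
    """Return (payload, next_offset), or None if no skippable frame starts here."""
    if len(buf) - offset < 8:
        return None
    magic, size = struct.unpack_from("<II", buf, offset)
    if (magic & 0xFFFFFFF0) != SKIPPABLE_MAGIC_BASE:
        return None
    start = offset + 8
    if start + size > len(buf):
        raise ValueError("truncated skippable frame")
    return buf[start:start + size], start + size


# Hypothetical payload; the real table of contents has its own encoding.
frame = make_skippable_frame(b'{"example": "metadata"}')
payload, end = read_skippable_frame(frame)
```

Because a standard zstd decoder skips such frames, a layer carrying extra metadata this way can still be decompressed by clients that do not understand `zstd:chunked`.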

At the time of this writing, support for `zstd:chunked` is enabled by default in the code.

You can explicitly enable or disable `zstd:chunked` with the following change to `containers-storage.conf`:

```
[storage.options.pull_options]
enable_partial_images = "true" | "false"
```

Note that the value of this field must be a "string bool" (the quoted string `"true"` or `"false"`); it cannot be a native TOML boolean.

## IMPLEMENTATION

Each layer has an associated "big data" key called `chunked-manifest-cache`. Its value
is a custom binary format suitable for mmap() that contains index metadata
for the layer: the full SHA-256 digest of each file plus its "chunks" (as
computed by `zstd:chunked`).

When an image is pulled, all other existing layers are scanned using `chunked-manifest-cache` to see if they contain a file with a matching digest. If one is found, the existing file is hardlinked if `use_hardlinks = "true"`;
otherwise it is reflinked if the filesystem supports it, or else a full physical copy
is made. If configured, there is a best-effort attempt to enable fsverity on the file
(see <https://github.com/containers/storage/issues/2017>).
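The reuse step can be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation: a plain in-memory dictionary built by hashing files stands in for `chunked-manifest-cache`, the reflink attempt is omitted and replaced by an ordinary copy, falling back to a copy when hardlinking fails is an assumption of this sketch, and all function names are hypothetical.

```python
import hashlib
import os
import shutil
import tempfile


def sha256_file(path: str) -> str:
    """Hex SHA-256 digest of a file, read in 1 MiB blocks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            h.update(block)
    return h.hexdigest()


def build_index(layer_dir: str) -> dict:
    """Map digest -> path for regular files in a layer (stand-in for the cache)."""
    index = {}
    for root, _dirs, files in os.walk(layer_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.isfile(path) and not os.path.islink(path):
                index.setdefault(sha256_file(path), path)
    return index


def reuse_file(index: dict, digest: str, dest: str, use_hardlinks: bool) -> str:
    """Materialize dest from an existing file; return "hardlink", "copy", or "miss"."""
    src = index.get(digest)
    if src is None:
        return "miss"  # nothing local matches; the content must be fetched
    if use_hardlinks:
        try:
            os.link(src, dest)
            return "hardlink"
        except OSError:
            pass  # assumption: fall back to a copy (e.g. cross-device link)
    shutil.copy2(src, dest)  # a reflink attempt would go here; omitted in this sketch
    return "copy"


with tempfile.TemporaryDirectory() as tmp:
    layer = os.path.join(tmp, "existing-layer")
    os.makedirs(layer)
    with open(os.path.join(layer, "a.txt"), "wb") as f:
        f.write(b"hello")
    index = build_index(layer)
    wanted = hashlib.sha256(b"hello").hexdigest()
    how_linked = reuse_file(index, wanted, os.path.join(tmp, "linked"), True)
    how_copied = reuse_file(index, wanted, os.path.join(tmp, "copied"), False)
    how_missing = reuse_file(index, "0" * 64, os.path.join(tmp, "none"), False)
```

A hardlink shares the inode with the existing file, so it costs no extra space but ties the two paths to the same data; a copy is independent but uses additional storage.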

For more information, the most detailed source at the current time is [pkg/chunked/internal/compression.go](https://github.com/containers/storage/blob/39d469c34c96db67062e25954bc9d18f2bf6dae3/pkg/chunked/internal/compression.go).
The above is a permanent link for stability, so be sure to check whether the repository has newer changes too.

## STANDARDIZATION

At the current time the format is not officially standardized or documented beyond
the comments and code in the reference implementation.

## BUGS

- https://github.com/containers/storage/issues?q=is%3Aissue+label%3Aarea%2Fzstd%3Achunked+is%3Aopen

## FOOTNOTES
The Containers Storage project is committed to inclusivity, a core value of open source.
The `master` and `slave` mount propagation terminology is used in this repository.
This language is problematic and divisive, and should be changed.
However, these terms are currently used within the Linux kernel and must be used as-is at this time.
When the kernel maintainers rectify this usage, Containers Storage will follow suit immediately.