File: Feature-by-version.rst

Changes (feature/version)
=========================

Major features or significant feature enhancements by kernel version. Details
for each feature are listed below.

The version states when a feature was merged into the mainline kernel. It does
not say at which kernel version the feature is considered mature enough for
production use. For an estimate of the stability of features, see the Status
page.

3.x
---

3.0 - scrub
        Read all data and verify checksums, repair if possible.

3.2 - auto raid repair
        Automatic repair of broken data from a good copy

3.2 - root backups
        Save a few previous versions of the most important tree roots at commit time, used by *-o recovery*

3.3 - integrity checker
        Optional infrastructure to verify integrity of written metadata blocks

3.3 - backref walking
        Groundwork to allow tracking owner of blocks, used via *inspect-internal*

3.3 - restriper
        RAID profiles can be changed on-line, balance filters

3.4 - big metadata blocks
        Support for metadata blocks larger than page size

        .. note::
           Default nodesize is 16k since btrfs-progs 3.12

3.4 - error handling
        Generic infrastructure for graceful error handling (EIO)

3.5 - device statistics
        Persistent statistics about device errors

3.5 - fsync speedup
        Noticeable improvements in fsync() implementation

3.6 - qgroups
        Subvolume-aware quotas

3.6 - send/receive
        Ability to transfer one filesystem via a data stream (full or
        incremental) and apply the changes on a remote filesystem.

3.7 - extrefs
        Hardlink count limit is lifted to 64k

        .. note::
           Default since btrfs-progs 3.12

3.7 - hole punching
        Implement the FALLOC_FL_PUNCH_HOLE mode of *fallocate*

3.8 - device replace
        Efficient replacement of existing device (add/remove in one go)

3.9 - raid 5/6 *(incomplete)*
        Basic support for RAID5/6 profiles; crash resiliency, replace and scrub are not yet supported

3.9 - snapshot-aware defrag
        Defrag does not break links between shared extents (snapshots, reflinked files)

        .. note::
           Disabled since 3.14 (and backported to some stable kernel versions)
           due to problems. Will be enabled in the future.

3.9 - lightweight send
        A mode of *send* that does not add the actual file data to the stream

3.9 - on-line label set/get
        Label editable on mounted filesystems

3.10 - skinny metadata
        Reduced metadata size (format change) of extents

        .. note::
           Default since btrfs-progs 3.18

3.10 - qgroup rescan
        Sync qgroups with existing filesystem data

3.12 - UUID tree
        A map of subvolume/UUID that vastly speeds up send/receive

3.12 - out-of-band deduplication
        Support for deduplicating extents on a given set of files.

3.14 - no-holes
        No extent representation for file holes (format change), may reduce overall metadata consumption

3.14 - feature bits in sysfs
        /sys/fs/btrfs exports various bits about filesystem capabilities and feature support

3.16 - O_TMPFILE
        Mode of open() to safely create a temporary file
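
As a minimal sketch of what this mode enables, the Python snippet below opens an unnamed temporary file with ``os.O_TMPFILE``. The helper name ``open_unnamed_tmpfile`` is hypothetical, and the fallback to ``tempfile.TemporaryFile`` on platforms or filesystems without O_TMPFILE is an assumption of this example, not something btrfs requires.

```python
import os
import tempfile


def open_unnamed_tmpfile(directory):
    """Return a binary read/write file object that has no name in `directory`."""
    flag = getattr(os, "O_TMPFILE", None)  # only defined on Linux
    if flag is not None:
        try:
            # With O_TMPFILE the path names the containing directory, not the file.
            fd = os.open(directory, flag | os.O_RDWR, 0o600)
            return os.fdopen(fd, "r+b")
        except OSError:
            pass  # kernel or filesystem without O_TMPFILE support
    # Fallback (assumption of this sketch): let the standard library decide.
    return tempfile.TemporaryFile(dir=directory)


with open_unnamed_tmpfile(tempfile.gettempdir()) as f:
    f.write(b"scratch data")
    f.seek(0)
    data = f.read()
print(data)
```

The file never appears in the directory listing and is discarded when the descriptor is closed.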

3.16 - search ioctl v2
        The extended SEARCH_TREE ioctl, able to return more than 4k of data

3.18 - auto block group reclaim
        Automatically remove block groups (aka. chunks) that become completely empty.

3.19 - RAID56: scrub, replace
        Scrub and device replace work on RAID56 filesystems.

4.x
---

4.0 - store otime
        Save creation time (otime) for all new files and directories. For
        future use, current tools cannot read it directly.

4.2 - rootid ioctl accessible
        The INO_LOOKUP ioctl returns the root id (id of the containing
        subvolume) to all users, without restriction, if the *treeid* is 0.

4.2 - dedupe possible on the same inode
        The EXTENT_SAME ioctl will accept the same inode as source and
        destination (ranges must not overlap).
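
The non-overlap condition can be sketched as follows. The helper ``ranges_overlap`` is hypothetical and models only the constraint stated above (two byte ranges of equal length within one file); it does not reproduce any other checks the kernel performs.

```python
def ranges_overlap(src_offset, dst_offset, length):
    """True if [src, src+length) and [dst, dst+length) share at least one byte."""
    return src_offset < dst_offset + length and dst_offset < src_offset + length


# Two adjacent 4KiB ranges do not overlap, so they could be passed together.
print(ranges_overlap(0, 4096, 4096))
# Shifting the destination by only 2KiB makes the ranges overlap.
print(ranges_overlap(0, 2048, 4096))
```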

4.3 - trim all free space
        Trim is also performed on space that is not allocated to any chunk,
        not only on free space within the allocated chunks.

4.4 - balance filter updates
        Enhanced syntax and new balance filters:

        * limit=min..max
        * usage=min..max
        * stripes=min..max
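
To illustrate the ``min..max`` syntax, the sketch below assembles a filter argument for ``btrfs balance start``. The helper ``balance_filter`` is hypothetical, ``/mnt`` is a placeholder mount point, and the command is only printed, never executed.

```python
def balance_filter(block_group_type, **ranges):
    """Build a balance filter argument such as '-dusage=10..50'."""
    prefixes = {"data": "-d", "metadata": "-m", "system": "-s"}
    allowed = ("limit", "usage", "stripes")
    parts = []
    for name, (low, high) in ranges.items():
        if name not in allowed:
            raise ValueError(f"filter {name!r} is not modelled in this sketch")
        if low > high:
            raise ValueError(f"{name}: min {low} is greater than max {high}")
        parts.append(f"{name}={low}..{high}")
    return prefixes[block_group_type] + ",".join(parts)


argv = ["btrfs", "balance", "start",
        balance_filter("data", usage=(10, 50), limit=(1, 4)), "/mnt"]
print(" ".join(argv))
```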

4.5 - free space tree
        Improved implementation of free space cache (aka v2), using b-trees.

        .. note::
           Default since btrfs-progs 5.15. Kernel 4.9 fixes endianness bugs on
           big-endian machines; x86* is not affected.

4.5 - balance filter updates
        Conversion to data/DUP profile possible through balance filters -- on single-device filesystem.

        .. note::
           mkfs.btrfs allows creating DUP on single device in the non-mixed mode since 4.4

4.6 - max_inline default
        The default value of max_inline changed to 2048.

4.6 - read features from control device
        The existing ioctl GET_SUPPORTED_FEATURES can now be used on the
        control device (/dev/btrfs-control) and returns the supported features
        without any mounted filesystem.

4.7 - delete device by id
        Add new ioctl RM_DEV_V2, pass device to be deleted by its ID.

4.7 - more renameat2 modes
        Add support for RENAME_EXCHANGE and RENAME_WHITEOUT to *renameat2*
        syscall. This also means that *overlayfs* is now supported on top of
        btrfs.

4.7 - balance filter updates
        Conversion to data/DUP profile possible through balance filters -- on multiple-device filesystems.

        .. note::
           mkfs.btrfs allows creating DUP on multiple devices since 4.5.1

4.12 - RAID56: auto repair
        Scrub will attempt auto-repair (similar to raid1/raid10)

4.13 - statx
        Support for the enhanced statx syscall, which exposes the file creation timestamp

4.13 - sysfs qgroups override
        qgroups: new sysfs control file to allow temporary quota override with CAP_SYS_RESOURCE

4.13 - *deprecated mount option alloc_start*
        That was a debugging helper, not used and not supposed to be used nowadays.

4.14 - ZSTD compression
        New compression algorithm ZSTD, supposedly better ratio/speed performance.

4.14 - improved degraded mount
        Allow degraded mount based on the chunk constraints, not device number
        constraints. E.g. when one device is missing but the remaining one holds
        all *single* chunks.

4.14 - *deprecated user transaction ioctl*
        BTRFS_IOC_TRANS_START and BTRFS_IOC_TRANS_END, no known users, tricky
        to use; scheduled to be removed in 4.17

4.14 - refine SSD optimizations
        The mount option *ssd* does not make any assumptions about block layout
        or management by the device anymore, leaving only the speedups based on
        low seek cost active.  This could avoid some corner cases leading to
        excessive fragmentation.
        `The story so far <https://git.kernel.org/linus/583b723151794e2ff1691f1510b4e43710293875>`__.

4.15 - overlayfs
        Overlayfs can now use btrfs as the lower filesystem.

4.15 - *ref-verify*
        Debugging functionality to verify extent references. New mount option
        *ref-verify*, must be built with CONFIG_BTRFS_FS_REF_VERIFY.

4.15 - ZLIB level
        Allow setting the ZLIB compression level via mount option, e.g.
        *compress=zlib:9*. The levels match the default ZLIB compression
        levels. The default is 3.

4.15 - v2 of LOGICAL_INO ioctl
        An enhanced version of the ioctl that translates a logical extent
        offset to inode numbers ("who owns this block"). V1 performs badly for
        certain use cases and V2 addresses this.
        `Read more <https://git.kernel.org/linus/d24a67b2d997c860a42516076f3315c2ad2d2884>`__.

4.15 - compression heuristics
        Apply a few heuristics to the data before they're compressed to decide
        if it's likely to gain any space savings. The methods: frequency
        sampling, repeated pattern detection, Shannon entropy calculation.
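
The entropy step can be illustrated in a few lines of Python. This is a sketch of the textbook Shannon entropy formula only, not the kernel's implementation; the function name is hypothetical and no decision threshold is implied.

```python
import math
from collections import Counter


def shannon_entropy_bits(sample: bytes) -> float:
    """Shannon entropy of a byte string in bits per byte (0.0 to 8.0)."""
    if not sample:
        return 0.0
    total = len(sample)
    counts = Counter(sample)
    # Sum of p * log2(1/p) over all byte values that occur in the sample.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


print(shannon_entropy_bits(b"\x00" * 4096))       # a single repeated byte
print(shannon_entropy_bits(bytes(range(256))))    # every byte value once
```

Low values suggest repetitive data that is likely to compress well, while values close to 8 suggest data that is already compressed or random.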

4.16 - fallocate: zero range
        Mode of the `fallocate <http://man7.org/linux/man-pages/man2/fallocate.2.html>`__
        syscall to zero a file range.

4.17 - *removed user transaction ioctl*
        deprecated in 4.14, see above

4.17 - *rmdir* on subvolumes
        Allow rmdir to delete an empty subvolume.

4.18 - XFLAGS ioctl
        Add support for ioctl FS_IOC_FSSETXATTR/FS_IOC_FSGETXATTR, successor of
        FS_IOC_SETFLAGS/FS_IOC_GETFLAGS ioctl. Currently supports: APPEND,
        IMMUTABLE, NOATIME, NODUMP, SYNC. Note that the naming is confusing:
        although the ioctl is named *xattr*, it does not refer to extended
        attributes. It should be referred to as extended inode flags or
        *xflags*.

4.18 - EXTENT_SAME ioctl / 16MiB chunks
        The range for out-of-band deduplication implemented by the EXTENT_SAME
        ioctl will split the range into 16MiB chunks. Up to now this was the
        overall limit and effectively only the first 16MiB was deduplicated.
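
The splitting described above can be sketched as a small generator. The names ``CHUNK`` and ``split_dedupe_range`` are hypothetical; the code only shows how a long range decomposes into pieces of at most 16MiB and does not call the ioctl.

```python
CHUNK = 16 * 1024 * 1024  # 16MiB, the chunk size named in the text


def split_dedupe_range(offset, length, chunk=CHUNK):
    """Yield (offset, length) pieces no larger than `chunk` that cover the range."""
    end = offset + length
    while offset < end:
        piece = min(chunk, end - offset)
        yield offset, piece
        offset += piece


# A 40MiB request becomes two full 16MiB pieces and one 8MiB remainder.
for start, size in split_dedupe_range(0, 40 * 1024 * 1024):
    print(start, size)
```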

4.18 - GET_SUBVOL_INFO ioctl
        New ioctl to read subvolume information (id, directory name,
        generation, flags, UUIDs, time). This does not require root
        permissions, only the regular access to the subvolume.

4.18 - GET_SUBVOL_ROOTREF ioctl
        New ioctl to enumerate subvolume references of a given subvolume. This
        does not require root permissions, only the regular access to the
        subvolume.

4.18 - INO_LOOKUP_USER ioctl
        New ioctl to look up a path by inode number. This does not require root
        permissions, only the regular access to the subvolume, unlike the
        INO_LOOKUP ioctl.

4.19 - defrag ro/rw
        Allow running defrag on files that are normally accessible for
        read-write, but are currently opened in read-only mode.

5.x
---

5.0 - swapfile
        Swapfiles are supported, with some limitations where the COW design
        does not work well with the swap implementation (nodatacow file, no
        compression, cannot be snapshotted, not possible on multiple devices,
        ...). This is the most restricted but working setup; we'll try to
        improve that in the future.

5.0 - metadata uuid
        An optional incompat feature to assign a new filesystem UUID without
        overwriting all metadata blocks. The new UUID is stored only in the
        superblock, unlike what ``btrfstune -u`` does.

5.1 - FORGET_DEV ioctl
        Unregister devices previously added by the scan ioctl, same effect as
        if the kernel module is reloaded.

5.1 - ZSTD level
        Allow setting the ZSTD compression level via mount option, e.g.
        *compress=zstd:9*. The levels match the default ZSTD compression
        levels. The default is 3, maximum is 15.
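
A minimal sketch of composing and validating such a mount option string follows. The helper ``compress_option`` is hypothetical. The zstd range 1..15 comes from the text above, while the zlib range 1..9 is the usual zlib range and an assumption of this example.

```python
# zstd 1..15 is taken from the text; zlib 1..9 is an assumption of this sketch.
LEVELS = {"zlib": range(1, 10), "zstd": range(1, 16)}


def compress_option(algorithm, level=None):
    """Return a mount option string such as 'compress=zstd:9'."""
    if algorithm not in LEVELS:
        raise ValueError(f"no level support modelled for {algorithm!r}")
    if level is None:
        return f"compress={algorithm}"  # the default level (3) applies
    if level not in LEVELS[algorithm]:
        raise ValueError(f"{algorithm} level {level} is out of range")
    return f"compress={algorithm}:{level}"


print(compress_option("zstd", 9))
print(compress_option("zlib"))
```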

5.2 - pre-write checks
        Verify metadata blocks before submitting them to the devices. This can
        catch consistency problems or bitflips.

5.5 - more checksums
        New checksum algorithms: xxhash (64b), SHA256 (256b), BLAKE2b (256b).

5.5 - RAID1C34
        RAID1 with 3 and 4 copies (over all devices).

5.6 - async discard
        Mode of discard (*mount -o discard=async*) that merges freed extents to
        larger chunks and submits them for discard in a less intrusive way

5.6 - device info in sysfs
        More information about device state can be found in per-filesystem sysfs directory.

5.7 - reflink/clone works on inline files
        Inline files can be reflinked to the tail extent of other files

5.7 - faster balance cancel
        More cancellation points in balance that will shorten the time to stop
        processing once ``btrfs balance cancel`` is called.

5.7 - *removed flag BTRFS_SUBVOL_CREATE_ASYNC*
        Remove support of flag BTRFS_SUBVOL_CREATE_ASYNC from subvolume creation ioctl.

5.7 - v2 of snapshot deletion ioctl
        New ioctl BTRFS_IOC_SNAP_DESTROY_V2, deletion by subvolume id is now possible.

5.9 - mount option *rescue*
        Unified mount option for actions that may help to access a damaged
        filesystem. Now supports: nologreplay, usebackuproot

5.9 - qgroups in sysfs
        The information about qgroup status and relations is exported in */sys/fs/btrfs/UUID/qgroups*

5.9 - FS_INFO ioctl
        Export more information: checksum type, checksum size, generation, metadata_uuid

5.10 - exclusive ops in sysfs
        Export which filesystem exclusive operation is running (balance,
        resize, device add/delete/replace, ...)

5.11 - remove *inode_cache*
        Remove inode number caching feature (mount -o inode_cache)

5.11 - more rescue=
        Additional modes for mount option *rescue=*: ignorebadroots/ibadroots,
        ignoredatacsums/idatacsums. All are exported in sysfs.

5.12 - zoned mode
        Support for zoned devices with special allocation/write mode to
        fixed-size zones. See the Zoned page.

5.13 - supported_sectorsizes in sysfs
        List supported sector sizes in sysfs file /sys/fs/btrfs/features/supported_sectorsizes

5.14 - sysfs scrub bw limit
        Tunable bandwidth limit
        (/sys/fs/btrfs/FSID/devinfo/DEVID/scrub_speed_max) for scrub (and
        device replace) for a given device.
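
The path layout above can be sketched with a small helper. Both function names are hypothetical, and the ``root`` parameter exists only so the code can be pointed at a test directory instead of the real sysfs. The value is returned as raw text because its unit is not stated here.

```python
from pathlib import Path


def scrub_speed_max_path(fsid, devid, root="/sys/fs/btrfs"):
    """Path of the per-device scrub bandwidth limit file."""
    return Path(root) / fsid / "devinfo" / str(devid) / "scrub_speed_max"


def read_scrub_speed_max(fsid, devid, root="/sys/fs/btrfs"):
    """Return the raw text stored in the tunable, stripped of whitespace."""
    return scrub_speed_max_path(fsid, devid, root).read_text().strip()


# FSID is a placeholder for the filesystem UUID, 1 for the device id.
print(scrub_speed_max_path("FSID", 1))
```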

5.14 - sysfs device stats
        The device stats can also be found in /sys/fs/btrfs/FSID/devinfo/DEVID/error_stats.

5.14 - cancellable resize, device delete
        The filesystem resize and device delete operations can be cancelled by
        specifying *cancel* as the device name.

5.14 - property value reset
        Change how an empty value is interpreted. The new behaviour deletes
        the value and resets it to default. This affects *btrfs.compression*,
        where the value *no* sets the NOCOMPRESS bit while an empty value resets
        all compression settings (either compression or NOCOMPRESS bit).

5.15 - fsverity
        fs-verity is a support layer that filesystems can hook into to
        support transparent integrity and authenticity protection of read-only
        files. https://www.kernel.org/doc/html/latest/filesystems/fsverity.html

5.15 - idmapped mount
        Support mount with UID/GID mapped according to another namespace.
        https://lwn.net/Articles/837566/

5.16 - ZNS in zoned
        Zoned namespaces. https://zonedstorage.io/docs/introduction/zns,
        https://lwn.net/Articles/865988/

5.17 - send and relocation
        Send and relocation (balance, device remove, shrink, block group
        reclaim) can now work in parallel.

5.17 - device add vs balance
        It is possible to add a device with paused balance.

        .. note::
           Since kernel 5.17.7 and btrfs-progs 5.17.1

5.17 - *no warning with flushoncommit*
        Mounting with *-o flushoncommit* does not trigger the (harmless)
        warning at each transaction commit

        .. note::
           Also backported to 5.15.27 and 5.16.13

5.18 - zoned and DUP metadata
        DUP metadata works with zoned mode

5.18 - encoded data ioctl
        New ioctls to read and write pre-encoded data (i.e. no transformation
        and directly written as extents), now works for compressed data

5.18 - *removed balance ioctl v1*
        The support for ioctl BTRFS_IOC_BALANCE has been removed, superseded by
        BTRFS_IOC_BALANCE_V2 a long time ago

5.18 - *cross-mount reflink works*
        The VFS limitation on reflinking files across separate subvolume
        mounts of the same filesystem has been removed

5.18 - syslog error messages with filesystem state
        Messages are printed with a one letter tag ("state: X") that denotes in
        which state the filesystem was at this point:

        * A - transaction aborted (permanent)
        * E - filesystem error (permanent)
        * M - remount in progress (transient)
        * R - device replace in progress (transient)
        * C - checksum checks disabled by mount option (rescue=ignoredatacsums)
        * L - log tree replay did not complete due to some error
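
The letter tags listed above can be decoded with a simple lookup table. The names ``STATE_LETTERS`` and ``describe_state`` are hypothetical; the sketch takes only the letters themselves as input and makes no assumption about the layout of the surrounding log line.

```python
STATE_LETTERS = {
    "A": "transaction aborted (permanent)",
    "E": "filesystem error (permanent)",
    "M": "remount in progress (transient)",
    "R": "device replace in progress (transient)",
    "C": "checksum checks disabled by mount option (rescue=ignoredatacsums)",
    "L": "log tree replay did not complete due to some error",
}


def describe_state(letters):
    """Map a string of state letters, e.g. 'EA', to their descriptions."""
    return [STATE_LETTERS.get(ch, f"unknown state letter {ch!r}") for ch in letters]


for line in describe_state("EA"):
    print(line)
```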

5.18 - tree-checker verifies transaction id pre-write
        Metadata buffer to be written gets an extra check if the stored
        transaction number matches the current state of the filesystem.

5.19 - subpage support pages > 4KiB
        Metadata node size is supported regardless of the CPU page size
        (minimum size is 4KiB), data sector size is supported <= page size.
        Additionally subpage also supports RAID56.

5.19 - per-type background threshold for reclaim
        Add sysfs tunable for background reclaim threshold for all block group
        types (data, metadata, system).

5.19 - automatically repair device number mismatch
        Device information is stored in two places, the number in the super
        block and items in the device tree. When these go out of sync, e.g.
        by device removal shortly before unmount, the next mount could fail.
        The b-tree is the authoritative source and can be used to override
        the stale value in the superblock.

5.19 - defrag can convert inline files to regular ones
        The logic has been changed so that inline files are considered for
        defragmentation even if the mount option max_inline would prevent that.
        No defragmentation might happen but the inlined files are not skipped.

5.19 - explicit minimum zone size is 4MiB
        Set the minimum zone size on zoned devices to 4MiB. Zones on real
        devices are much larger; this limit is for emulated devices.

5.19 - sysfs tunable for automatic block group reclaim
        Add possibility to set a threshold to automatically reclaim block groups
        also in non-zoned mode. By default completely empty block groups are
        reclaimed automatically but the threshold can be tuned in
        /sys/fs/btrfs/FSID/allocation/PROFILE/bg_reclaim_threshold .

5.19 - tree-checker verifies metadata block ownership
        Additional check done by tree-checker to verify the relationship
        between a tree block and its tree root owner.

6.x
---

6.0 - send protocol v2
        Send protocol update that adds new commands and extends existing
        functionality to write large data chunks. Compressed (and encrypted)
        extents can be optionally emitted and transferred as-is without the need
        to recompress (or reencrypt) on the receiving side.

6.0 - sysfs exports commit stats
        The file /sys/fs/btrfs/FSID/commit_stats shows number of commits and
        various time related statistics.
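
As a rough sketch of consuming this file, the parser below assumes a layout of one ``name value`` pair per line with integer values. That layout, the field names in the sample, and the numbers are assumptions made for illustration, not measurements or a documented format.

```python
def parse_commit_stats(text):
    """Parse 'name value' lines into a dict of integers, skipping other lines."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[1].isdigit():
            stats[fields[0]] = int(fields[1])
    return stats


# Made-up sample in the assumed layout, not real data.
sample = "commits 12\nlast_commit_ms 3\nmax_commit_ms 40\ntotal_commit_ms 250\n"
stats = parse_commit_stats(sample)
print(stats["total_commit_ms"] / stats["commits"])  # mean commit time in ms
```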

6.0 - sysfs exports chunk sizes
        Chunk size value can be read from
        /sys/fs/btrfs/FSID/allocation/PROFILE/chunk_size .

6.0 - sysfs shows zoned mode among features
        The zoned mode has been supported since 5.10, with functionality
        added over time. Now it's advertised among features.

6.0 - checksum implementation is logged at mount time
        When a filesystem is mounted the implementation backing the checksums
        is logged. The information is also accessible in
        /sys/fs/btrfs/FSID/checksum .

6.1 - sysfs support to temporarily skip exact qgroup accounting
        Allow the user to override qgroup accounting and let it go temporarily
        out of date, e.g. when several subvolumes are deleted and the qgroup
        numbers would need to be updated at some cost each time. A single
        update afterwards can amortize the costs.

6.1 - scrub also repairs superblock
        An improvement to scrub: if the superblock is detected to be
        corrupted, the repair happens immediately. Previously the repair was
        delayed, for performance reasons, until the next transaction commit,
        which would eventually store an updated and correct copy.

6.1 - block group tree
        An incompatible change that has to be enabled at mkfs time. Add a new
        b-tree item that stores information about block groups in a compact way
        that significantly improves mount time that's usually long due to
        fragmentation and scattered b-tree items tracking the individual block
        groups. Requires and also enables the free-space-tree and no-holes
        features.

6.1 - discard stats available in sysfs
        The directory '/sys/fs/btrfs/FSID/discard' exports statistics and
        tunables related to discard.

6.1 - additional qgroup stats in sysfs
        The overall status of qgroups is exported in
        /sys/fs/btrfs/FSID/qgroups/ .

6.1 - check that superblock is unchanged at thaw time
        Do a full check of the super block once a filesystem is thawed. This
        happens, for example, when the system resumes from suspend or
        hibernation. Accidental changes by other operating systems will be
        detected.

6.2 - discard=async on by default
        Devices that support trim/discard will enable the asynchronous discard
        for the whole filesystem.