File: dev-btrees.rst

package info (click to toggle)
btrfs-progs 6.17.1-1
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid
  • size: 20,612 kB
  • sloc: ansic: 127,282; sh: 7,915; python: 1,384; makefile: 900; asm: 296
file content (145 lines) | stat: -rw-r--r-- 4,683 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
Btrees
======

B-trees introduction
--------------------

Btrfs uses a single set of btree manipulation code for all metadata in the
filesystem. For performance or organizational purposes, the trees are broken up
into a few different types, and each type of tree holds different types of
keys. The super block holds pointers to the tree roots of the tree of tree
roots and the chunk tree.


Tree of tree roots (tree_root)
------------------------------

This tree stores pointers to other trees in the filesystem. It holds the
association of subvolume names and their tree, pointer to the default subvolume
and tracks deletion progress of a subvolume.


Chunk Tree
----------

The chunk tree does all of the logical to physical block address mapping
for the filesystem, and it stores information about all of the devices
in the FS. In order to bootstrap lookup in the chunk tree, the super
block also duplicates the chunk items needed to resolve blocks in the
chunk tree.

There are back references from the chunk items to the extent tree that
allocated them. Only a single extent tree can allocate extents out of a
given chunk.

Two types of key are stored in the chunk tree:

-  DEV_ITEM (where the offset field is the devid), which contain information on
   all of the underlying block devices in the filesystem
-  CHUNK_ITEM (where the offset field is the start of the chunk as a
   virtual address), which maps a section of the virtual address space
   (a chunk) into physical storage.


Device Allocation Tree
----------------------

The device allocation tree records which parts of each physical device
have been allocated into chunks. This is a relatively small tree that is
only updated as new chunks are allocated. It stores back references to
the chunk tree that allocated each physical extent on the device.


Extent Tree
-----------

The extent tree records byte ranges that are in use, maintains reference counts
on each extent and records back references to the tree or file that is using
each extent. Logical block groups are created inside the extent tree, and these
reference large logical extents from the chunk tree.

Each block group can only store a specific type of extent. This might include
metadata, or mirrored metadata, or striped data blocks etc.

Currently there is only one extent tree shared by all the other trees.

Keys for the extent tree use the start of the extent as the objectid. A
BLOCK_GROUP_ITEM key will be followed by the EXTENT_ITEM keys for extents within
that block group.


Filesystem tree(s)
------------------

The filesystem trees store files and directories of subvolumes.  There is one
root for each subvolume or snapshot. Extents can be shared among various
filesystem trees. The toplevel subvolume has a special filesystem tree that
always exists.

Keys in FS trees always use the inode number of the filesystem object as the
objectid.

Each object haves one or more of:

-  Inode.
-  Inode ref, indicating what name this object is known as, and in which
   directory.
-  For files, a set of extent information, indicating where on the
   filesystem this file's data is.
-  For directories, two sequences of dir_items, one indexed by a hash of
   the object name, and one indexed by a unique sequential index number.


Checksum tree
-------------

The checksum tree stores block checksums. Every 4k block of data stored
on disk has a checksum associated with it. The "offset" part of the keys
in the checksum tree indicates the start of the checksummed data on
disk. The value stored with the key is a sequence of checksums, for the 4k
blocks starting at the offset.


Data relocation tree
--------------------


Log root tree
-------------


UUID tree
---------

The tree stores correspondence between UUIDs and subvolumes. Used for
quick lookup during send.


Quota tree
----------

The qgroup information about status and qgroup relations are stored in this tree.
The tree exists only when quotas are enabled.


Free space tree
---------------

This tree implements *space_cache=v2*, which is a tree-based tracking of free
space. Successor of the v1 code which used inodes to store the space information.


Block Group Tree
----------------

Separate tree (and feature) that stores only block group items and allows quick lookup
during mount. Otherwise the block group items are scattered in the Extent tree
and cause slow mount due to excessive seeking.


Raid Stripe Tree
----------------

A separate tracking of file extents and block groups that allows more flexible
location of physical offsets while keeping the logical offsets the same. This
is used by zoned mode and raid56.