.\" Copyright (C) 2025 Jens Axboe <axboe@kernel.dk>
.\" SPDX-License-Identifier: LGPL-2.0-or-later
.\"
.TH io_uring_registered_buffers 7 "January 18, 2025" "Linux" "Linux Programmer's Manual"
.SH NAME
io_uring_registered_buffers \- io_uring registered buffers overview
.SH DESCRIPTION
Registered buffers are a performance optimization feature of
.B io_uring
that allows applications to pre-register a set of buffers with the kernel.
When buffers are registered, the kernel pins the memory and creates
long-term mappings, eliminating the overhead of mapping and unmapping
buffer memory for each I/O operation.
.SS Why use registered buffers?
For every I/O operation that transfers data between user space and the
kernel, the kernel must perform several operations on the buffer memory:
.IP \(bu 2
Verify the memory is accessible to the process
.IP \(bu
Pin the pages in memory to prevent them from being swapped out
.IP \(bu
Set up kernel mappings to access the memory
.PP
These operations, while individually fast, add up when performing many
small I/O operations. By registering buffers once upfront, these costs
are paid only once, and subsequent I/O operations can use the pre-mapped
buffers directly.
.PP
Registered buffers are most beneficial for applications that:
.IP \(bu 2
Perform many small I/O operations
.IP \(bu
Reuse the same buffers repeatedly
.IP \(bu
Need the lowest possible per-I/O overhead
.SS Registering buffers
Buffers are registered using
.BR io_uring_register_buffers (3)
or
.BR io_uring_register_buffers_tags (3).
The buffers are described using an array of
.I struct iovec
structures:
.PP
.in +4n
.EX
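/* buf1 and buf2 are previously allocated anonymous buffers;
   ring points to an initialized struct io_uring. */
struct iovec iovecs[2];
iovecs[0].iov_base = buf1;
iovecs[0].iov_len = 4096;
iovecs[1].iov_base = buf2;
iovecs[1].iov_len = 8192;

/* Returns 0 on success, -errno on failure */
ret = io_uring_register_buffers(ring, iovecs, 2);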
.EE
.in
.PP
The buffers must be anonymous memory (allocated via
.BR malloc (3),
.BR mmap (2)
with
.BR MAP_ANONYMOUS ,
or similar). File-backed memory is not supported.

There is a limit of 1 GiB per individual buffer. Huge pages are supported
and the entire huge page will be pinned even if only part of it is used.
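.PP
The size limit and the page-granular pinning described above can be
checked before registration. The sketch below assumes the 1 GiB
per-buffer cap stated on this page and takes the page size as a
parameter (the huge page size would be passed for huge-page-backed
memory). The names
.I reg_buf_len_ok
and
.I reg_buf_pinned_bytes
are hypothetical.
.PP
.in +4n
.EX
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

#define REG_BUF_MAX_LEN ((size_t)1 << 30)   /* 1 GiB per-buffer limit */

/* Nonzero if the iovec has a nonzero length within the 1 GiB limit. */
static int reg_buf_len_ok(const struct iovec *iov)
{
    return iov->iov_len > 0 && iov->iov_len <= REG_BUF_MAX_LEN;
}

/*
 * Bytes pinned for the buffer, assuming whole pages of page_size bytes
 * (a power of two) are pinned for every page the buffer touches.
 */
static size_t reg_buf_pinned_bytes(const struct iovec *iov, size_t page_size)
{
    uintptr_t start, end;

    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    start = (uintptr_t)iov->iov_base & ~(uintptr_t)(page_size - 1);
    end = ((uintptr_t)iov->iov_base + iov->iov_len + page_size - 1)
          & ~(uintptr_t)(page_size - 1);
    return (size_t)(end - start);
}
```
.EE
.in
.PP
For example, a 4096-byte buffer that starts 100 bytes into a 4 KiB page
touches two pages, so 8192 bytes are pinned; the same buffer inside a
single 2 MiB huge page pins the full 2 MiB.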

The buffers are charged against the user's
.B RLIMIT_MEMLOCK
resource limit. On kernel 5.12 and later
.RB ( IORING_FEAT_NATIVE_WORKERS ),
the memory for the rings themselves is covered by cgroup memory
accounting and no longer counts against the memlock limit, but
registered buffers are still charged against it.
.PP
Unless running as root, if buffer registration fails with
.BR ENOMEM ,
the memlock limit may need to be increased. The current limit can be
checked with:
.PP
.in +4n
.EX
ulimit -l
.EE
.in
.PP
The limit can be increased for the current shell session with:
.PP
.in +4n
.EX
ulimit -l unlimited
.EE
.in
.PP
For a permanent change, edit
.I /etc/security/limits.conf
or use
.BR setrlimit (2)
programmatically with
.BR RLIMIT_MEMLOCK .
.SS Using registered buffers
To use a registered buffer in an I/O operation, use the fixed buffer
variants of the prep functions:
.IP \(bu 2
.BR io_uring_prep_read_fixed (3)
instead of
.BR io_uring_prep_read (3)
.IP \(bu
.BR io_uring_prep_write_fixed (3)
instead of
.BR io_uring_prep_write (3)
.IP \(bu
.BR io_uring_prep_readv_fixed (3)
instead of
.BR io_uring_prep_readv (3)
.IP \(bu
.BR io_uring_prep_writev_fixed (3)
instead of
.BR io_uring_prep_writev (3)
.PP
Zero-copy send operations can also use registered buffers:
.IP \(bu 2
.BR io_uring_prep_send_zc (3)
with
.B IORING_RECVSEND_FIXED_BUF
.IP \(bu
.BR io_uring_prep_sendmsg_zc (3)
with
.B IORING_RECVSEND_FIXED_BUF
.PP
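The fixed read and write functions take a
.I buf_index
parameter that specifies which registered buffer to use (0-indexed into
the array passed to
.BR io_uring_register_buffers (3)).
For the zero-copy send operations, the index is set in the
.I buf_index
field of the SQE.
.PP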
The memory range used for the I/O operation must fall within the bounds
of the registered buffer. It is valid to use only a portion of a
registered buffer for an operation.
.PP
.in +4n
.EX
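/* Read into the first 1024 bytes of registered buffer 0 */
sqe = io_uring_get_sqe(ring);
io_uring_prep_read_fixed(sqe, fd, buf1, 1024, offset, 0);

/* Write 2048 bytes from the start of registered buffer 1 */
sqe = io_uring_get_sqe(ring);
io_uring_prep_write_fixed(sqe, fd, buf2, 2048, offset, 1);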
.EE
.in
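.PP
The bounds requirement can be checked in the application before a
request is prepared. The following sketch tests whether an address range
lies entirely inside one registered buffer; the helper name
.I range_in_registered_buffer
is hypothetical, and the kernel performs its own validation regardless.
.PP
.in +4n
.EX
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: nonzero if [addr, addr + len) lies entirely
 * inside the registered buffer described by reg.
 */
static int range_in_registered_buffer(const struct iovec *reg,
                                      const void *addr, size_t len)
{
    uintptr_t base, a;

    assert(reg != NULL);

    base = (uintptr_t)reg->iov_base;
    a = (uintptr_t)addr;

    if (a < base)
        return 0;
    /* Compare against the remaining space so a + len cannot overflow. */
    if (a - base > reg->iov_len)
        return 0;
    return len <= reg->iov_len - (a - base);
}
```
.EE
.in
.PP
The address passed to the fixed prep function is the actual memory
address inside the buffer, not an offset relative to its start, which is
why the check works on addresses.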
.SS Sparse buffer registration
Applications can register a sparse buffer table using
.BR io_uring_register_buffers_sparse (3).
This creates a table with empty slots that can be filled in later using
.BR io_uring_register_buffers_update_tag (3).
This is useful when the full set of buffers is not known at registration
time.
.PP
.in +4n
.EX
/* Create sparse table with 10 slots */
ret = io_uring_register_buffers_sparse(ring, 10);

/* Later, fill in slot 3 */
struct iovec iov = { .iov_base = buf, .iov_len = 4096 };
ret = io_uring_register_buffers_update_tag(ring, 3, &iov, NULL, 1);
.EE
.in
.SS Buffer tagging
When using
.BR io_uring_register_buffers_tags (3)
or
.BR io_uring_register_buffers_update_tag (3),
each buffer can be associated with a tag value. When a buffer is
unregistered (either explicitly or by replacing it), and there are no
more in-flight operations using that buffer, a completion queue entry
is posted with
.I user_data
set to the tag value and all other fields zeroed.
A tag of zero disables this notification for that buffer.
.PP
This allows applications to know when it is safe to free or reuse the
buffer memory.
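.PP
Because the notification arrives on the same completion queue as
ordinary completions, the application has to tell the two apart by
.I user_data
alone. One possible convention, sketched here, reserves the top bit of
.I user_data
for buffer tags and keeps the slot index in the low bits. This encoding
is an assumption made for illustration; io_uring treats tags as opaque
64-bit values, and all names below are hypothetical.
.PP
.in +4n
.EX
```c
#include <assert.h>
#include <stdint.h>

/* Application-chosen convention, not defined by io_uring. */
#define BUF_TAG_FLAG ((uint64_t)1 << 63)

/* Build a nonzero tag for the buffer in the given table slot. */
static uint64_t buf_tag_make(uint32_t slot)
{
    return BUF_TAG_FLAG | slot;
}

/* Nonzero if a CQE user_data value is a buffer-release tag. */
static int buf_tag_check(uint64_t user_data)
{
    return (user_data & BUF_TAG_FLAG) != 0;
}

/* Recover the slot index from a buffer-release tag. */
static uint32_t buf_tag_slot(uint64_t user_data)
{
    assert(buf_tag_check(user_data));
    return (uint32_t)(user_data & ~BUF_TAG_FLAG);
}
```
.EE
.in
.PP
With this scheme, the
.I user_data
of ordinary requests must keep the top bit clear, and the completion
loop can free or recycle the memory for a slot once
.I buf_tag_check
matches.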
.SS Updating registered buffers
Registered buffers can be updated in place using
.BR io_uring_register_buffers_update_tag (3).
This can:
.IP \(bu 2
Replace an existing buffer with a new one
.IP \(bu
Fill in a sparse slot
.IP \(bu
Remove a buffer by passing an iovec with
.I iov_base
set to NULL and
.I iov_len
set to zero
.PP
Updating buffers does not immediately free resources. The old buffer
remains valid until all in-flight operations complete.
.SS Unregistering buffers
Buffers are unregistered using
.BR io_uring_unregister_buffers (3).
This releases all registered buffers. Buffers are also automatically
unregistered when the io_uring instance is destroyed.
.PP
Applications do not need to explicitly unregister buffers before
shutting down the ring. However, page unpinning may happen asynchronously,
so pages may not be immediately available after ring destruction.
.SS Cloning buffers
Registered buffers can be cloned from one ring to another using
.BR io_uring_clone_buffers (3)
or
.BR io_uring_clone_buffers_offset (3).
This allows multiple rings to share the same set of registered buffers
without re-registering them.
.SH NOTES
.IP \(bu 2
Registered buffers provide the most benefit for small, frequent I/O
operations where the per-operation overhead is significant.
.IP \(bu
For large I/O operations, the buffer mapping overhead is small relative
to the actual I/O time, so registered buffers may not provide much benefit.
.IP \(bu
The maximum number of registered buffers is limited by available kernel
memory and, where it applies, the
.B RLIMIT_MEMLOCK
limit.
.IP \(bu
Registered buffers cannot be used with provided buffer rings
.RB ( IOSQE_BUFFER_SELECT ).
These are separate mechanisms for different use cases.
.SH SEE ALSO
.BR io_uring (7),
.BR io_uring_registered_files (7),
.BR setrlimit (2),
.BR io_uring_register_buffers (3),
.BR io_uring_register_buffers_tags (3),
.BR io_uring_register_buffers_sparse (3),
.BR io_uring_register_buffers_update_tag (3),
.BR io_uring_unregister_buffers (3),
.BR io_uring_prep_read_fixed (3),
.BR io_uring_prep_write_fixed (3),
.BR io_uring_prep_send_zc (3),
.BR io_uring_prep_sendmsg_zc (3),
.BR io_uring_clone_buffers (3)