.\" Copyright (C) 2025 Jens Axboe <axboe@kernel.dk>
.\" SPDX-License-Identifier: LGPL-2.0-or-later
.\"
.TH io_uring_registered_buffers 7 "January 18, 2025" "Linux" "Linux Programmer's Manual"
.SH NAME
io_uring_registered_buffers \- io_uring registered buffers overview
.SH DESCRIPTION
Registered buffers are a performance optimization feature of
.B io_uring
that allows applications to pre-register a set of buffers with the kernel.
When buffers are registered, the kernel pins the memory and creates
long-term mappings, eliminating the overhead of mapping and unmapping
buffer memory for each I/O operation.
.SS Why use registered buffers?
For every I/O operation that transfers data between user space and the
kernel, the kernel must perform several operations on the buffer memory:
.IP \(bu 2
Verify the memory is accessible to the process
.IP \(bu
Pin the pages in memory to prevent them from being swapped out
.IP \(bu
Set up kernel mappings to access the memory
.PP
These operations, while individually fast, add up when performing many
small I/O operations. By registering buffers once upfront, these costs
are paid only once, and subsequent I/O operations can use the pre-mapped
buffers directly.
.PP
Registered buffers are most beneficial for applications that:
.IP \(bu 2
Perform many small I/O operations
.IP \(bu
Reuse the same buffers repeatedly
.IP \(bu
Need the lowest possible per-I/O overhead
.SS Registering buffers
Buffers are registered using
.BR io_uring_register_buffers (3)
or
.BR io_uring_register_buffers_tags (3).
The buffers are described using an array of
.I struct iovec
structures:
.PP
.in +4n
.EX
struct iovec iovecs[2];
iovecs[0].iov_base = buf1;
iovecs[0].iov_len = 4096;
iovecs[1].iov_base = buf2;
iovecs[1].iov_len = 8192;
ret = io_uring_register_buffers(ring, iovecs, 2);
.EE
.in
.PP
The buffers must be anonymous memory (allocated via
.BR malloc (3),
.BR mmap (2)
with
.BR MAP_ANONYMOUS ,
or similar). File-backed memory is not supported.
There is a limit of 1 GiB per individual buffer. Huge pages are supported
and the entire huge page will be pinned even if only part of it is used.
The buffers are charged against the user's
.B RLIMIT_MEMLOCK
resource limit on kernels before 5.12. On kernel 5.12 and later with
.B IORING_FEAT_NATIVE_WORKERS
support, cgroup memory accounting is used instead and no memlock limit
applies.
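.PP
As a minimal sketch of preparing buffers that meet these requirements,
the helper below fills an iovec array with anonymous mappings and rejects
lengths above the 1 GiB limit. The names
.I alloc_anon_iovecs
and
.I MAX_REG_BUF_LEN
are hypothetical and not part of liburing.
.PP
.in +4n
.EX
```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/uio.h>

/* 1 GiB per-buffer limit, as stated in this page */
#define MAX_REG_BUF_LEN ((size_t)1 << 30)

/*
 * Hypothetical helper: fill iov[0..nr-1] with anonymous mappings of
 * len bytes each. Returns 0 on success. On failure, unmaps whatever
 * was mapped so far and returns -1.
 */
int alloc_anon_iovecs(struct iovec *iov, unsigned nr, size_t len)
{
    unsigned i;

    if (len == 0 || len > MAX_REG_BUF_LEN)
        return -1;

    for (i = 0; i < nr; i++) {
        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) {
            /* Roll back the mappings created before the failure */
            while (i-- > 0)
                munmap(iov[i].iov_base, iov[i].iov_len);
            return -1;
        }
        iov[i].iov_base = p;
        iov[i].iov_len = len;
    }
    return 0;
}
```
.EE
.in
.PP
The filled array can then be passed to
.BR io_uring_register_buffers (3).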
.PP
If buffer registration fails with
.B ENOMEM
for a process that is not running as root,
the memlock limit may need to be increased. The current limit can be
checked with:
.PP
.in +4n
.EX
ulimit -l
.EE
.in
.PP
The limit can be raised for the current shell session, up to the hard
limit, with:
.PP
.in +4n
.EX
ulimit -l unlimited
.EE
.in
.PP
For a permanent change, edit
.I /etc/security/limits.conf
or use
.BR setrlimit (2)
programmatically with
.BR RLIMIT_MEMLOCK .
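.PP
The programmatic route might look like the following sketch, which raises
the soft limit only as far as the hard limit allows. The helper name
.I ensure_memlock
is hypothetical. Only
.BR getrlimit (2)
and
.BR setrlimit (2)
are real interfaces here.
.PP
.in +4n
.EX
```c
#include <sys/resource.h>

/*
 * Hypothetical helper: make sure the soft RLIMIT_MEMLOCK limit is at
 * least wanted bytes. Returns 0 on success, -1 if the hard limit is
 * too low or a system call fails.
 */
int ensure_memlock(rlim_t wanted)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_MEMLOCK, &rl) < 0)
        return -1;

    /* RLIM_INFINITY compares greater than or equal to any request */
    if (rl.rlim_cur >= wanted)
        return 0;

    /* An unprivileged process cannot exceed the hard limit */
    if (rl.rlim_max < wanted)
        return -1;

    rl.rlim_cur = wanted;
    return setrlimit(RLIMIT_MEMLOCK, &rl);
}
```
.EE
.in
.PP
An application would typically call such a helper with the total size of
the buffers it intends to register, before calling
.BR io_uring_register_buffers (3).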
.SS Using registered buffers
To use a registered buffer in an I/O operation, use the fixed buffer
variants of the prep functions:
.IP \(bu 2
.BR io_uring_prep_read_fixed (3)
instead of
.BR io_uring_prep_read (3)
.IP \(bu
.BR io_uring_prep_write_fixed (3)
instead of
.BR io_uring_prep_write (3)
.IP \(bu
.BR io_uring_prep_readv_fixed (3)
instead of
.BR io_uring_prep_readv (3)
.IP \(bu
.BR io_uring_prep_writev_fixed (3)
instead of
.BR io_uring_prep_writev (3)
.PP
Zero-copy send operations can also use registered buffers:
.IP \(bu 2
.BR io_uring_prep_send_zc (3)
with
.B IORING_RECVSEND_FIXED_BUF
.IP \(bu
.BR io_uring_prep_sendmsg_zc (3)
with
.B IORING_RECVSEND_FIXED_BUF
.PP
These functions take a
.I buf_index
parameter that specifies which registered buffer to use (0-indexed into
the array passed to
.BR io_uring_register_buffers (3)).
The memory range used for the I/O operation must fall within the bounds
of the registered buffer. It is valid to use only a portion of a
registered buffer for an operation.
.PP
.in +4n
.EX
/* Use first 1024 bytes of registered buffer 0 */
io_uring_prep_read_fixed(sqe, fd, buf1, 1024, offset, 0);
/* Use registered buffer 1 */
io_uring_prep_write_fixed(sqe, fd, buf2, 2048, offset, 1);
.EE
.in
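.PP
Because the range used for an operation must fall within the registered
buffer, an application may want to validate ranges before preparing a
request. The check below is a sketch, and the function name
.I range_in_registered_buffer
is hypothetical.
.PP
.in +4n
.EX
```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: return true if [addr, addr + len) lies entirely
 * within the registered buffer described by reg.
 */
bool range_in_registered_buffer(const struct iovec *reg,
                                const void *addr, size_t len)
{
    uintptr_t base = (uintptr_t)reg->iov_base;
    uintptr_t start = (uintptr_t)addr;
    size_t off;

    if (start < base)
        return false;

    /* Compare lengths, not end addresses, to avoid overflow */
    off = start - base;
    return off <= reg->iov_len && len <= reg->iov_len - off;
}
```
.EE
.in
.PP
Here
.I reg
would be the same iovec entry that was passed at registration time for
the chosen
.IR buf_index .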
.SS Sparse buffer registration
Applications can register a sparse buffer table using
.BR io_uring_register_buffers_sparse (3).
This creates a table with empty slots that can be filled in later using
.BR io_uring_register_buffers_update_tag (3).
This is useful when the full set of buffers is not known at registration
time.
.PP
.in +4n
.EX
/* Create sparse table with 10 slots */
ret = io_uring_register_buffers_sparse(ring, 10);
/* Later, fill in slot 3 */
struct iovec iov = { .iov_base = buf, .iov_len = 4096 };
ret = io_uring_register_buffers_update_tag(ring, 3, &iov, NULL, 1);
.EE
.in
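.PP
With a sparse table, the application has to remember which slots are in
use. One possible approach, sketched here, is a user-space mirror of the
table. The structure and function names are hypothetical, and the
convention that a NULL
.I iov_base
marks an empty slot is an assumption of this sketch.
.PP
.in +4n
.EX
```c
#include <stddef.h>
#include <sys/uio.h>

#define NR_SLOTS 10   /* matches the sparse table size used above */

/* Application-side mirror; iov_base == NULL marks an empty slot */
struct slot_table {
    struct iovec slots[NR_SLOTS];
};

/* Return the index of the first empty slot, or -1 if the table is full */
int slot_find_free(const struct slot_table *t)
{
    for (int i = 0; i < NR_SLOTS; i++)
        if (t->slots[i].iov_base == NULL)
            return i;
    return -1;
}

/* Record that slot idx holds buf; call after the kernel update succeeds */
void slot_set(struct slot_table *t, int idx, void *buf, size_t len)
{
    t->slots[idx].iov_base = buf;
    t->slots[idx].iov_len = len;
}
```
.EE
.in
.PP
The index returned by
.I slot_find_free
can be used as the offset argument to
.BR io_uring_register_buffers_update_tag (3).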
.SS Buffer tagging
When using
.BR io_uring_register_buffers_tags (3)
or
.BR io_uring_register_buffers_update_tag (3),
each buffer can be associated with a tag value. When a buffer is
unregistered (either explicitly or by replacing it), and there are no
more in-flight operations using that buffer, a completion queue entry
is posted with
.I user_data
set to the tag value and all other fields zeroed.
This allows applications to know when it is safe to free or reuse the
buffer memory.
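.PP
Tag notifications arrive on the same completion queue as I/O completions,
so the application needs a way to tell them apart by
.IR user_data .
One possible convention, shown as a sketch, reserves the top bit for
buffer tags. The macro and function names are hypothetical, and the
application must then keep that bit clear in the
.I user_data
of ordinary requests.
.PP
.in +4n
.EX
```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed application-defined marker bit for buffer tags */
#define BUF_TAG_FLAG (UINT64_C(1) << 63)

/* Build a tag that encodes the buffer slot index */
static inline uint64_t make_buf_tag(uint32_t slot)
{
    return BUF_TAG_FLAG | slot;
}

/* True if a CQE user_data value is a buffer tag notification */
static inline bool is_buf_tag(uint64_t user_data)
{
    return (user_data & BUF_TAG_FLAG) != 0;
}

/* Recover the slot index from a buffer tag */
static inline uint32_t buf_tag_slot(uint64_t user_data)
{
    return (uint32_t)(user_data & ~BUF_TAG_FLAG);
}
```
.EE
.in
.PP
When a completion with such a tag is seen, the memory behind the decoded
slot can be freed or reused.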
.SS Updating registered buffers
Registered buffers can be updated in place using
.BR io_uring_register_buffers_update_tag (3).
This can:
.IP \(bu 2
Replace an existing buffer with a new one
.IP \(bu
Fill in a sparse slot
.IP \(bu
Remove a buffer by passing an iovec whose
.I iov_base
is NULL and whose
.I iov_len
is zero
.PP
Updating buffers does not immediately free resources. The old buffer
remains valid until all in-flight operations complete.
.SS Unregistering buffers
Buffers are unregistered using
.BR io_uring_unregister_buffers (3).
This releases all registered buffers. Buffers are also automatically
unregistered when the io_uring instance is destroyed.
Applications do not need to explicitly unregister buffers before
shutting down the ring. However, page unpinning may happen asynchronously,
so pages may not be immediately available after ring destruction.
.SS Cloning buffers
Registered buffers can be cloned from one ring to another using
.BR io_uring_clone_buffers (3)
or
.BR io_uring_clone_buffers_offset (3).
This allows multiple rings to share the same set of registered buffers
without re-registering them.
.SH NOTES
.IP \(bu 2
Registered buffers provide the most benefit for small, frequent I/O
operations where the per-operation overhead is significant.
.IP \(bu
For large I/O operations, the buffer mapping overhead is small relative
to the actual I/O time, so registered buffers may not provide much benefit.
.IP \(bu
The maximum number of registered buffers is limited by available kernel
memory and the
.B RLIMIT_MEMLOCK
limit (on older kernels).
.IP \(bu
Registered buffers cannot be used with provided buffer rings
.RB ( IOSQE_BUFFER_SELECT ).
These are separate mechanisms for different use cases.
.SH SEE ALSO
.BR io_uring (7),
.BR io_uring_registered_files (7),
.BR setrlimit (2),
.BR io_uring_register_buffers (3),
.BR io_uring_register_buffers_tags (3),
.BR io_uring_register_buffers_sparse (3),
.BR io_uring_register_buffers_update_tag (3),
.BR io_uring_unregister_buffers (3),
.BR io_uring_prep_read_fixed (3),
.BR io_uring_prep_write_fixed (3),
.BR io_uring_prep_send_zc (3),
.BR io_uring_prep_sendmsg_zc (3),
.BR io_uring_clone_buffers (3)