.\" Automatically generated by Pandoc 2.1.3
.\"
.TH "LIBRPMEM" "7" "2018-10-17" "PMDK - rpmem API version 1.2" "PMDK Programmer's Manual"
.hy
.\" Copyright 2014-2018, Intel Corporation
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\"
.\"     * Redistributions of source code must retain the above copyright
.\"       notice, this list of conditions and the following disclaimer.
.\"
.\"     * Redistributions in binary form must reproduce the above copyright
.\"       notice, this list of conditions and the following disclaimer in
.\"       the documentation and/or other materials provided with the
.\"       distribution.
.\"
.\"     * Neither the name of the copyright holder nor the names of its
.\"       contributors may be used to endorse or promote products derived
.\"       from this software without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
.\" "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
.\" LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
.\" A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
.\" OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
.\" SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
.\" LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
.\" DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
.\" THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
.\" (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
.\" OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
.SH NAME
.PP
\f[B]librpmem\f[] \- remote persistent memory support library
(EXPERIMENTAL)
.SH SYNOPSIS
.IP
.nf
\f[C]
#include\ <librpmem.h>
cc\ ...\ \-lrpmem
\f[]
.fi
.SS Library API versioning:
.IP
.nf
\f[C]
const\ char\ *rpmem_check_version(
\ \ \ \ unsigned\ major_required,
\ \ \ \ unsigned\ minor_required);
\f[]
.fi
.SS Error handling:
.IP
.nf
\f[C]
const\ char\ *rpmem_errormsg(void);
\f[]
.fi
.SS Other library functions:
.PP
A description of other \f[B]librpmem\f[] functions can be found on the
following manual pages:
.IP \[bu] 2
\f[B]rpmem_create\f[](3), \f[B]rpmem_persist\f[](3)
.SH DESCRIPTION
.PP
\f[B]librpmem\f[] provides low\-level support for remote access to
\f[I]persistent memory\f[] (pmem) utilizing RDMA\-capable RNICs.
The library can be used to remotely replicate a memory region over the
RDMA protocol.
It utilizes an appropriate persistency mechanism based on the remote
node's platform capabilities.
\f[B]librpmem\f[] utilizes the \f[B]ssh\f[](1) client to authenticate a
user on the remote node, and for encryption of the connection's
out\-of\-band configuration data.
See \f[B]SSH\f[], below, for details.
.PP
The size of the replicated memory region cannot exceed the maximum
locked\-in\-memory address space limit.
See \f[B]memlock\f[] in \f[B]limits.conf\f[](5) for more details.
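.PP
The limit can be checked before a pool is created.
The following is a minimal sketch, assuming that the soft
\f[B]RLIMIT_MEMLOCK\f[] limit of the calling process (see
\f[B]getrlimit\f[](2)) is the relevant one; \f[I]pool_fits_memlock\f[]
is a hypothetical helper and is not part of \f[B]librpmem\f[].
.IP
.nf
\f[C]
```c
#include <stddef.h>
#include <sys/resource.h>

/*
 * pool_fits_memlock -- hypothetical helper, not part of librpmem
 *
 * Returns 1 if pool_size fits within the soft RLIMIT_MEMLOCK limit of
 * the calling process, 0 if it does not, and -1 if the limit cannot
 * be read.
 */
int
pool_fits_memlock(size_t pool_size)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
		return -1;
	if (rl.rlim_cur == RLIM_INFINITY)
		return 1;	/* no limit configured */
	return (rlim_t)pool_size <= rl.rlim_cur ? 1 : 0;
}
```
\f[]
.fi
.PP
A return value of 0 suggests raising \f[B]memlock\f[] before calling
\f[B]rpmem_create\f[](3) or \f[B]rpmem_open\f[](3).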
.PP
This library is for applications that use remote persistent memory
directly, without the help of any library\-supplied transactions or
memory allocation.
Higher\-level libraries that build on \f[B]libpmem\f[](7) are available
and are recommended for most applications, see:
.IP \[bu] 2
\f[B]libpmemobj\f[](7), a general use persistent memory API, providing
memory allocation and transactional operations on variable\-sized
objects.
.SH TARGET NODE ADDRESS FORMAT
.IP
.nf
\f[C]
[<user>\@]<hostname>[:<port>]
\f[]
.fi
.PP
The target node address is described by the \f[I]hostname\f[] which the
client connects to, with an optional \f[I]user\f[] name.
The user must be able to authenticate to the remote machine without
being prompted for a password or passphrase.
The optional \f[I]port\f[] number is used to establish the SSH
connection.
The default port number is 22.
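.PP
The format above can be illustrated with a small parser.
This is a sketch only: \f[I]parse_target\f[], \f[I]struct
target_addr\f[] and the buffer sizes are assumptions made for the
example and do not reflect how \f[B]librpmem\f[] parses the address.
IPv6 literals, which contain colons, are deliberately not handled.
.IP
.nf
\f[C]
```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEFAULT_SSH_PORT 22

/* hypothetical container for the parsed parts; not a librpmem type */
struct target_addr {
	char user[64];		/* empty string when no user was given */
	char host[256];
	unsigned port;		/* DEFAULT_SSH_PORT when no port was given */
};

/*
 * parse_target -- split "[<user>@]<hostname>[:<port>]" into its parts
 *
 * Returns 0 on success, -1 on a malformed or oversized address.
 */
int
parse_target(const char *addr, struct target_addr *out)
{
	assert(addr != NULL && out != NULL);

	const char *at = strchr(addr, '@');
	const char *host = at ? at + 1 : addr;
	const char *colon = strchr(host, ':');
	size_t ulen = at ? (size_t)(at - addr) : 0;
	size_t hlen = colon ? (size_t)(colon - host) : strlen(host);

	/* reject an empty user before '@' and an empty hostname */
	if ((at && ulen == 0) || hlen == 0)
		return -1;
	if (ulen >= sizeof(out->user) || hlen >= sizeof(out->host))
		return -1;

	memcpy(out->user, addr, ulen);
	out->user[ulen] = 0;
	memcpy(out->host, host, hlen);
	out->host[hlen] = 0;

	out->port = DEFAULT_SSH_PORT;
	if (colon) {
		char *end;
		unsigned long port = strtoul(colon + 1, &end, 10);

		/* the port must be a complete number in 1..65535 */
		if (end == colon + 1 || *end != 0 ||
		    port == 0 || port > 65535)
			return -1;
		out->port = (unsigned)port;
	}
	return 0;
}
```
\f[]
.fi
.PP
With this sketch, "user@node1:2222" yields user "user", host "node1" and
port 2222, while "node1" alone yields an empty user and port 22.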
.SH REMOTE POOL ATTRIBUTES
.PP
The \f[I]rpmem_pool_attr\f[] structure describes a remote pool and is
stored in the remote pool's metadata.
The caller must pass this structure to the \f[B]rpmem_create\f[](3)
function when creating a pool on the remote node.
When a pool is opened with the \f[B]rpmem_open\f[](3) function, the
appropriate fields are read from the pool's metadata and returned to
the caller.
.IP
.nf
\f[C]
#define\ RPMEM_POOL_HDR_SIG_LEN\ \ \ \ 8
#define\ RPMEM_POOL_HDR_UUID_LEN\ \ \ 16
#define\ RPMEM_POOL_USER_FLAGS_LEN\ 16

struct\ rpmem_pool_attr\ {
\ \ \ \ char\ signature[RPMEM_POOL_HDR_SIG_LEN];
\ \ \ \ uint32_t\ major;
\ \ \ \ uint32_t\ compat_features;
\ \ \ \ uint32_t\ incompat_features;
\ \ \ \ uint32_t\ ro_compat_features;
\ \ \ \ unsigned\ char\ poolset_uuid[RPMEM_POOL_HDR_UUID_LEN];
\ \ \ \ unsigned\ char\ uuid[RPMEM_POOL_HDR_UUID_LEN];
\ \ \ \ unsigned\ char\ next_uuid[RPMEM_POOL_HDR_UUID_LEN];
\ \ \ \ unsigned\ char\ prev_uuid[RPMEM_POOL_HDR_UUID_LEN];
\ \ \ \ unsigned\ char\ user_flags[RPMEM_POOL_USER_FLAGS_LEN];
};
\f[]
.fi
.PP
The \f[I]signature\f[] field is an 8\-byte field which describes the
pool's on\-media format.
.PP
The \f[I]major\f[] field is a major version number of the pool's
on\-media format.
.PP
The \f[I]compat_features\f[] field is a mask describing the optional
features of the pool's on\-media format.
.PP
The \f[I]incompat_features\f[] field is a mask describing the required
features of the pool's on\-media format.
.PP
The \f[I]ro_compat_features\f[] field is a mask describing features of
the pool's on\-media format.
If these features are not available, the pool shall be opened in
read\-only mode.
.PP
The \f[I]poolset_uuid\f[] field is the UUID of the pool with which the
remote pool is associated.
.PP
The \f[I]uuid\f[] field is the UUID of the first part of the remote
pool.
This field can be used to connect the remote pool with other pools in a
list.
.PP
The \f[I]next_uuid\f[] and \f[I]prev_uuid\f[] fields are the UUIDs of
the next and previous replicas, respectively.
These fields can be used to connect the remote pool with other pools in
a list.
.PP
The \f[I]user_flags\f[] field is a 16\-byte field of user\-defined
flags.
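.PP
Filling the structure can be sketched as follows.
The definitions are repeated locally only so that the sketch builds
without \f[B]<librpmem.h>\f[]; \f[I]fill_pool_attr\f[] is a hypothetical
helper, and the assumption that a short signature is zero\-padded is
made for this example only.
.IP
.nf
\f[C]
```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* local copy of the definitions shown above (illustration only) */
#define RPMEM_POOL_HDR_SIG_LEN    8
#define RPMEM_POOL_HDR_UUID_LEN   16
#define RPMEM_POOL_USER_FLAGS_LEN 16

struct rpmem_pool_attr {
	char signature[RPMEM_POOL_HDR_SIG_LEN];
	uint32_t major;
	uint32_t compat_features;
	uint32_t incompat_features;
	uint32_t ro_compat_features;
	unsigned char poolset_uuid[RPMEM_POOL_HDR_UUID_LEN];
	unsigned char uuid[RPMEM_POOL_HDR_UUID_LEN];
	unsigned char next_uuid[RPMEM_POOL_HDR_UUID_LEN];
	unsigned char prev_uuid[RPMEM_POOL_HDR_UUID_LEN];
	unsigned char user_flags[RPMEM_POOL_USER_FLAGS_LEN];
};

/*
 * fill_pool_attr -- hypothetical helper, not part of librpmem
 *
 * Zeroes attr, then sets the signature, major version and uuid.
 * The signature field is a fixed 8-byte array, so it is copied with
 * memcpy and is not NUL-terminated when sig is exactly 8 bytes long.
 * Returns 0 on success, -1 if sig does not fit.
 */
int
fill_pool_attr(struct rpmem_pool_attr *attr, const char *sig,
	uint32_t major, const unsigned char *uuid)
{
	assert(attr != NULL && sig != NULL && uuid != NULL);

	size_t len = strlen(sig);

	if (len > RPMEM_POOL_HDR_SIG_LEN)
		return -1;

	memset(attr, 0, sizeof(*attr));
	memcpy(attr->signature, sig, len);
	attr->major = major;
	memcpy(attr->uuid, uuid, RPMEM_POOL_HDR_UUID_LEN);
	return 0;
}
```
\f[]
.fi
.PP
All fields not set explicitly stay zeroed, which matches the
\f[B]memset\f[](3) call used in \f[B]EXAMPLE\f[].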
.SH SSH
.PP
\f[B]librpmem\f[] utilizes the \f[B]ssh\f[](1) client to log in and
execute the \f[B]rpmemd\f[](1) process on the remote node.
By default, \f[B]ssh\f[](1) is executed with the \f[B]\-4\f[] option,
which forces using \f[B]IPv4\f[] addressing.
.PP
For debugging purposes, both the ssh client and the commands executed on
the remote node may be overridden by setting the \f[B]RPMEM_SSH\f[] and
\f[B]RPMEM_CMD\f[] environment variables, respectively.
See \f[B]ENVIRONMENT\f[] for details.
.SH FORK
.PP
The \f[B]ssh\f[](1) client is executed by \f[B]rpmem_open\f[](3) and
\f[B]rpmem_create\f[](3) after forking a child process using
\f[B]fork\f[](2).
The application must take this into account when using \f[B]wait\f[](2)
and \f[B]waitpid\f[](2), which may return the \f[I]PID\f[] of the
\f[B]ssh\f[](1) process executed by \f[B]librpmem\f[].
.PP
If \f[B]fork\f[](2) support is not enabled in \f[B]libibverbs\f[],
\f[B]rpmem_open\f[](3) and \f[B]rpmem_create\f[](3) will fail.
By default, \f[B]fabric\f[](7) initializes \f[B]libibverbs\f[] with
\f[B]fork\f[](2) support by calling the \f[B]ibv_fork_init\f[](3)
function.
See \f[B]fi_verbs\f[](7) for more details.
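.PP
One way to account for the extra child process is to wait for a
specific PID instead of for any child.
The following is a minimal sketch; \f[I]run_and_reap\f[] is a
hypothetical helper, and the child here only exits with a given code in
place of real application work.
.IP
.nf
\f[C]
```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * run_and_reap -- hypothetical helper, not part of librpmem
 *
 * Forks a child that exits with exit_code and waits for that child
 * only. Returns the exit status of the child, or -1 on error.
 */
int
run_and_reap(int exit_code)
{
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(exit_code);	/* child: application work goes here */

	int status;

	/*
	 * Passing an explicit PID means an unrelated child, such as
	 * the ssh process started by librpmem, is never reaped here.
	 */
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```
\f[]
.fi
.PP
Code that must use \f[B]wait\f[](2) can instead compare the returned
PID with the PIDs of its own children and ignore any other value.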
.SH CAVEATS
.PP
\f[B]librpmem\f[] relies on the library destructor being called from the
main thread.
For this reason, all functions that might trigger destruction (e.g.
\f[B]dlclose\f[](3)) should be called in the main thread.
Otherwise some of the resources associated with that thread might not be
cleaned up properly.
.PP
\f[B]librpmem\f[] registers a pool as a single memory region.
Chelsio T4 and T5 hardware cannot handle a memory region greater than
or equal to 8GB due to a hardware bug.
The \f[I]pool_size\f[] value passed to \f[B]rpmem_create\f[](3) and
\f[B]rpmem_open\f[](3) on this hardware must therefore be less than
8GB.
.SH LIBRARY API VERSIONING
.PP
This section describes how the library API is versioned, allowing
applications to work with an evolving API.
.PP
The \f[B]rpmem_check_version\f[]() function is used to see if the
installed \f[B]librpmem\f[] supports the version of the library API
required by an application.
The easiest way to do this is for the application to supply the
compile\-time version information, supplied by defines in
\f[B]<librpmem.h>\f[], like this:
.IP
.nf
\f[C]
reason\ =\ rpmem_check_version(RPMEM_MAJOR_VERSION,
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ RPMEM_MINOR_VERSION);
if\ (reason\ !=\ NULL)\ {
\ \ \ \ /*\ version\ check\ failed,\ reason\ string\ tells\ you\ why\ */
}
\f[]
.fi
.PP
Any mismatch in the major version number is considered a failure, but a
library with a newer minor version number will pass this check since
increasing minor versions imply backwards compatibility.
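.PP
The rule can be sketched as a small function.
This is not the \f[B]librpmem\f[] implementation:
\f[I]check_version_sketch\f[] and its reason strings are made up for
illustration, and rejecting an older minor version is an inference from
the rule above.
.IP
.nf
\f[C]
```c
#include <stddef.h>

/*
 * check_version_sketch -- illustrates the compatibility rule only
 *
 * lib_major and lib_minor stand for the version of the installed
 * library. Returns NULL when the check passes, otherwise a static
 * string giving the reason (the strings are hypothetical).
 */
const char *
check_version_sketch(unsigned lib_major, unsigned lib_minor,
	unsigned major_required, unsigned minor_required)
{
	/* any difference in the major version is a failure */
	if (lib_major != major_required)
		return "major version mismatch";
	/* a newer minor version passes, an older one does not */
	if (lib_minor < minor_required)
		return "minor version too old";
	return NULL;
}
```
\f[]
.fi
.PP
Under this sketch, a library at version 1.2 satisfies an application
that requires 1.0 or 1.2, but not one that requires 1.3 or 2.0.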
.PP
An application can also check specifically for the existence of an
interface by checking for the version where that interface was
introduced.
These versions are documented in this man page as follows: unless
otherwise specified, all interfaces described here are available in
version 1.0 of the library.
Interfaces added after version 1.0 will contain the text \f[I]introduced
in version x.y\f[] in the section of this manual describing the feature.
.PP
When the version check performed by \f[B]rpmem_check_version\f[]() is
successful, the return value is NULL.
Otherwise the return value is a static string describing the reason for
failing the version check.
The string returned by \f[B]rpmem_check_version\f[]() must not be
modified or freed.
.SH ENVIRONMENT
.PP
\f[B]librpmem\f[] can change its default behavior based on the following
environment variables.
These are largely intended for testing and are not normally required.
.IP \[bu] 2
\f[B]RPMEM_SSH\f[]=\f[I]ssh_client\f[]
.PP
Setting this environment variable overrides the default \f[B]ssh\f[](1)
client command name.
.IP \[bu] 2
\f[B]RPMEM_CMD\f[]=\f[I]cmd\f[]
.PP
Setting this environment variable overrides the default command executed
on the remote node using either \f[B]ssh\f[](1) or the alternative
remote shell command specified by \f[B]RPMEM_SSH\f[].
.PP
\f[B]RPMEM_CMD\f[] can contain multiple commands separated by a vertical
bar (\f[C]|\f[]).
Each consecutive command is executed on the remote node in the order
read from a pool set file.
This environment variable is read when the library is initialized, so
\f[B]RPMEM_CMD\f[] must be set prior to application launch (or prior to
\f[B]dlopen\f[](3) if \f[B]librpmem\f[] is being dynamically loaded).
.IP \[bu] 2
\f[B]RPMEM_ENABLE_SOCKETS\f[]=0|1
.PP
Setting this variable to 1 enables the \f[B]fi_sockets\f[](7) provider
for the in\-band RDMA connection.
The \f[I]sockets\f[] provider does not support IPv6.
IPv6 must be disabled system\-wide if
\f[B]RPMEM_ENABLE_SOCKETS\f[] == 1, \f[I]target\f[] == localhost (or
any other loopback interface address), and the \f[B]SSH_CONNECTION\f[]
variable (see \f[B]ssh\f[](1) for more details) contains an IPv6
address after ssh to the loopback interface.
By default the \f[I]sockets\f[] provider is disabled.
.IP \[bu] 2
\f[B]RPMEM_ENABLE_VERBS\f[]=0|1
.PP
Setting this variable to 0 disables the \f[B]fi_verbs\f[](7) provider
for the in\-band RDMA connection.
The \f[I]verbs\f[] provider is enabled by default.
.IP \[bu] 2
\f[B]RPMEM_MAX_NLANES\f[]=\f[I]num\f[]
.PP
Limit the maximum number of lanes to \f[I]num\f[].
See \f[B]LANES\f[], in \f[B]rpmem_create\f[](3), for details.
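.PP
The vertical\-bar syntax of \f[B]RPMEM_CMD\f[] can be illustrated with a
simple splitter.
This is a sketch under assumptions: \f[I]split_cmds\f[] is hypothetical,
and how \f[B]librpmem\f[] itself treats whitespace or empty commands is
not specified here.
.IP
.nf
\f[C]
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * split_cmds -- hypothetical helper, not part of librpmem
 *
 * Splits buf in place on '|' and stores up to max pointers in cmds.
 * Whitespace around the commands is kept as is.
 * Returns the number of commands stored.
 */
size_t
split_cmds(char *buf, char *cmds[], size_t max)
{
	assert(buf != NULL && cmds != NULL);

	size_t n = 0;
	char *p = buf;

	while (n < max) {
		cmds[n++] = p;

		char *bar = strchr(p, '|');

		if (bar == NULL)
			break;
		*bar = 0;	/* terminate the current command */
		p = bar + 1;
	}
	return n;
}
```
\f[]
.fi
.PP
For the value "cmd_a|cmd_b|cmd_c" (placeholder command names), the
sketch returns three commands in the order given.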
.SH DEBUGGING AND ERROR HANDLING
.PP
If an error is detected during the call to a \f[B]librpmem\f[] function,
the application may retrieve an error message describing the reason for
the failure from \f[B]rpmem_errormsg\f[]().
This function returns a pointer to a static buffer containing the last
error message logged for the current thread.
If \f[I]errno\f[] was set, the error message may include a description
of the corresponding error code as returned by \f[B]strerror\f[](3).
The error message buffer is thread\-local; errors encountered in one
thread do not affect its value in other threads.
The buffer is never cleared by any library function; its content is
significant only when the return value of the immediately preceding call
to a \f[B]librpmem\f[] function indicated an error, or if \f[I]errno\f[]
was set.
The application must not modify or free the error message string, but it
may be modified by subsequent calls to other library functions.
.PP
Two versions of \f[B]librpmem\f[] are typically available on a
development system.
The normal version, accessed when a program is linked using the
\f[B]\-lrpmem\f[] option, is optimized for performance.
That version skips checks that impact performance and never logs any
trace information or performs any run\-time assertions.
.PP
A second version of \f[B]librpmem\f[], accessed when a program uses the
libraries under \f[B]/usr/lib/pmdk_debug\f[], contains run\-time
assertions and trace points.
The typical way to access the debug version is to set the environment
variable \f[B]LD_LIBRARY_PATH\f[] to \f[B]/usr/lib/pmdk_debug\f[] or
\f[B]/usr/lib64/pmdk_debug\f[], as appropriate.
Debugging output is controlled using the following environment
variables.
These variables have no effect on the non\-debug version of the library.
.IP \[bu] 2
\f[B]RPMEM_LOG_LEVEL\f[]
.PP
The value of \f[B]RPMEM_LOG_LEVEL\f[] enables trace points in the debug
version of the library, as follows:
.IP \[bu] 2
\f[B]0\f[] \- This is the default level when \f[B]RPMEM_LOG_LEVEL\f[] is
not set.
No log messages are emitted at this level.
.IP \[bu] 2
\f[B]1\f[] \- Additional details on any errors detected are logged (in
addition to returning the \f[I]errno\f[]\-based errors as usual).
The same information may be retrieved using \f[B]rpmem_errormsg\f[]().
.IP \[bu] 2
\f[B]2\f[] \- A trace of basic operations is logged.
.IP \[bu] 2
\f[B]3\f[] \- Enables a very verbose amount of function call tracing in
the library.
.IP \[bu] 2
\f[B]4\f[] \- Enables voluminous and fairly obscure tracing information
that is likely only useful to the \f[B]librpmem\f[] developers.
.PP
Unless \f[B]RPMEM_LOG_FILE\f[] is set, debugging output is written to
\f[I]stderr\f[].
.IP \[bu] 2
\f[B]RPMEM_LOG_FILE\f[]
.PP
Specifies the name of a file where all logging information should be
written.
If the last character in the name is \[lq]\-\[rq], the \f[I]PID\f[] of
the current process will be appended to the file name when the log file
is created.
If \f[B]RPMEM_LOG_FILE\f[] is not set, logging output is written to
\f[I]stderr\f[].
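.PP
The file naming rule for \f[B]RPMEM_LOG_FILE\f[] can be sketched as
follows.
\f[I]log_file_name\f[] is a hypothetical helper written for this
example; it only mirrors the rule described above and is not taken from
the library.
.IP
.nf
\f[C]
```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * log_file_name -- hypothetical helper, not part of librpmem
 *
 * Writes the effective log file name for the given RPMEM_LOG_FILE
 * value into out. Returns 0 on success, -1 if out is too small.
 */
int
log_file_name(const char *name, char *out, size_t len)
{
	assert(name != NULL && out != NULL);

	size_t n = strlen(name);
	int ret;

	/* a trailing "-" requests that the PID be appended */
	if (n > 0 && name[n - 1] == '-')
		ret = snprintf(out, len, "%s%ld", name, (long)getpid());
	else
		ret = snprintf(out, len, "%s", name);

	return (ret < 0 || (size_t)ret >= len) ? -1 : 0;
}
```
\f[]
.fi
.PP
A value ending in "\-" therefore gives each process its own log file,
which helps when several processes share the same environment.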
.SH EXAMPLE
.PP
The following example uses \f[B]librpmem\f[] to create a remote pool on
a given target node, identified by a given pool set name.
The associated local memory pool is zeroed and the data is made
persistent on the remote node.
Upon success the remote pool is closed.
.IP
.nf
\f[C]
#include\ <stdio.h>
#include\ <string.h>

#include\ <librpmem.h>

#define\ POOL_SIZE\ \ \ \ (32\ *\ 1024\ *\ 1024)
#define\ NLANES\ \ \ \ \ \ \ \ 4
unsigned\ char\ pool[POOL_SIZE];

int
main(int\ argc,\ char\ *argv[])
{
\ \ \ \ int\ ret;
\ \ \ \ unsigned\ nlanes\ =\ NLANES;

\ \ \ \ /*\ fill\ pool_attributes\ */
\ \ \ \ struct\ rpmem_pool_attr\ pool_attr;
\ \ \ \ memset(&pool_attr,\ 0,\ sizeof(pool_attr));

\ \ \ \ /*\ create\ a\ remote\ pool\ */
\ \ \ \ RPMEMpool\ *rpp\ =\ rpmem_create("localhost",\ "pool.set",
\ \ \ \ \ \ \ \ \ \ \ \ pool,\ POOL_SIZE,\ &nlanes,\ &pool_attr);
\ \ \ \ if\ (!rpp)\ {
\ \ \ \ \ \ \ \ fprintf(stderr,\ "rpmem_create:\ %s\\n",\ rpmem_errormsg());
\ \ \ \ \ \ \ \ return\ 1;
\ \ \ \ }

\ \ \ \ /*\ store\ data\ on\ local\ pool\ */
\ \ \ \ memset(pool,\ 0,\ POOL_SIZE);

\ \ \ \ /*\ make\ local\ data\ persistent\ on\ remote\ node\ */
\ \ \ \ ret\ =\ rpmem_persist(rpp,\ 0,\ POOL_SIZE,\ 0,\ 0);
\ \ \ \ if\ (ret)\ {
\ \ \ \ \ \ \ \ fprintf(stderr,\ "rpmem_persist:\ %s\\n",\ rpmem_errormsg());
\ \ \ \ \ \ \ \ return\ 1;
\ \ \ \ }

\ \ \ \ /*\ close\ the\ remote\ pool\ */
\ \ \ \ ret\ =\ rpmem_close(rpp);
\ \ \ \ if\ (ret)\ {
\ \ \ \ \ \ \ \ fprintf(stderr,\ "rpmem_close:\ %s\\n",\ rpmem_errormsg());
\ \ \ \ \ \ \ \ return\ 1;
\ \ \ \ }

\ \ \ \ return\ 0;
}
\f[]
.fi
.SH NOTE
.PP
The \f[B]librpmem\f[] API is experimental and may be subject to change
in the future.
However, using the remote replication in \f[B]libpmemobj\f[](7) is safe
and backward compatibility will be preserved.
.SH ACKNOWLEDGEMENTS
.PP
\f[B]librpmem\f[] builds on the persistent memory programming model
recommended by the SNIA NVM Programming Technical Work Group:
<http://snia.org/nvmp>
.SH SEE ALSO
.PP
\f[B]rpmemd\f[](1), \f[B]ssh\f[](1), \f[B]fork\f[](2),
\f[B]dlclose\f[](3), \f[B]dlopen\f[](3), \f[B]ibv_fork_init\f[](3),
\f[B]rpmem_create\f[](3), \f[B]rpmem_open\f[](3),
\f[B]rpmem_persist\f[](3), \f[B]strerror\f[](3),
\f[B]limits.conf\f[](5), \f[B]fabric\f[](7), \f[B]fi_sockets\f[](7),
\f[B]fi_verbs\f[](7), \f[B]libpmem\f[](7), \f[B]libpmemblk\f[](7),
\f[B]libpmemlog\f[](7), \f[B]libpmemobj\f[](7) and
\f[B]<http://pmem.io>\f[]