.\" Copyright 2003,2004 Andi Kleen, SuSE Labs.
.\" and Copyright 2007 Lee Schermerhorn, Hewlett Packard
.\"
.\" Permission is granted to make and distribute verbatim copies of this
.\" manual provided the copyright notice and this permission notice are
.\" preserved on all copies.
.\"
.\" Permission is granted to copy and distribute modified versions of this
.\" manual under the conditions for verbatim copying, provided that the
.\" entire resulting derived work is distributed under the terms of a
.\" permission notice identical to this one.
.\"
.\" Since the Linux kernel and libraries are constantly changing, this
.\" manual page may be incorrect or out-of-date. The author(s) assume no
.\" responsibility for errors or omissions, or for damages resulting from
.\" the use of the information contained herein.
.\"
.\" Formatted or processed versions of this manual, if unaccompanied by
.\" the source, must acknowledge the copyright and authors of this work.
.\"
.\" 2006-02-03, mtk, substantial wording changes and other improvements
.\" 2007-08-27, Lee Schermerhorn <Lee.Schermerhorn@hp.com>
.\" more precise specification of behavior.
.\"
.TH SET_MEMPOLICY 2 2008-08-15 Linux "Linux Programmer's Manual"
.SH NAME
set_mempolicy \- set default NUMA memory policy for a process and its children
.SH SYNOPSIS
.nf
.B "#include <numaif.h>"
.sp
.BI "int set_mempolicy(int " mode ", unsigned long *" nodemask ,
.BI " unsigned long " maxnode );
.sp
Link with \fI\-lnuma\fP.
.fi
.SH DESCRIPTION
.BR set_mempolicy ()
sets the NUMA memory policy of the calling process,
which consists of a policy mode and zero or more nodes,
to the values specified by the
.IR mode ,
.I nodemask
and
.I maxnode
arguments.
.PP
A NUMA machine has different
memory controllers with different distances to specific CPUs.
The memory policy defines from which node memory is allocated for
the process.
.PP
This system call defines the default policy for the process.
The process policy governs allocation of pages in the process's
address space outside of memory ranges
controlled by a more specific policy set by
.BR mbind (2).
The process default policy also controls allocation of any pages for
memory-mapped files mapped using the
.BR mmap (2)
call with the
.B MAP_PRIVATE
flag that the process only reads (loads) from,
and for memory-mapped files mapped using the
.BR mmap (2)
call with the
.B MAP_SHARED
flag, regardless of the access type.
The policy is only applied when a new page is allocated
for the process.
For anonymous memory this is when the page is first
touched by the application.
.PP
The
.I mode
argument must specify one of
.BR MPOL_DEFAULT ,
.BR MPOL_BIND ,
.B MPOL_INTERLEAVE
or
.BR MPOL_PREFERRED .
All modes except
.B MPOL_DEFAULT
require the caller to specify via the
.I nodemask
argument one or more nodes.
.PP
The
.I mode
argument may also include an optional
.IR "mode flag" .
The supported
.I "mode flags"
are:
.TP
.BR MPOL_F_STATIC_NODES " (since Linux 2.6.26)"
A nonempty
.I nodemask
specifies physical node IDs.
Linux does not remap the
.I nodemask
when the process moves to a different cpuset context,
nor when the set of nodes allowed by the process's
current cpuset context changes.
.TP
.BR MPOL_F_RELATIVE_NODES " (since Linux 2.6.26)"
A nonempty
.I nodemask
specifies node IDs that are relative to the set of
node IDs allowed by the process's current cpuset.
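.PP
A mode flag is combined with the policy mode by a bitwise OR in the
.I mode
argument.
The following is a minimal sketch rather than part of the interface:
.IR make_mode ()
is a hypothetical helper, and the constants are assumed to be provided by
.IR <linux/mempolicy.h> .
It refuses to combine the two flags, since that combination is reported as
.B EINVAL
(see ERRORS).
.PP
.nf
```c
#include <linux/mempolicy.h>

/* Hypothetical helper: combine a policy mode with optional mode flags.
 * Returns the combined value, or -1 if both MPOL_F_STATIC_NODES and
 * MPOL_F_RELATIVE_NODES are requested, which the kernel would reject. */
int make_mode(int mode, int flags)
{
    int both = MPOL_F_STATIC_NODES | MPOL_F_RELATIVE_NODES;

    if ((flags & both) == both)
        return -1;
    return mode | flags;
}
```
.fi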
.PP
.I nodemask
points to a bit mask of node IDs that contains up to
.I maxnode
bits.
The bit mask size is rounded to the next multiple of
.IR "sizeof(unsigned long)" ,
but the kernel will only use bits up to
.IR maxnode .
A NULL value of
.I nodemask
or a
.I maxnode
value of zero specifies the empty set of nodes.
If the value of
.I maxnode
is zero,
the
.I nodemask
argument is ignored.
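.PP
As an illustration of this layout, the sketch below sizes and fills a mask.
.IR nodemask_words ()
and
.IR nodemask_set ()
are hypothetical helper names, not library functions.
The sketch assumes that node
.I n
is represented by bit
.I n
of the mask, counted from the least significant bit of the first word.
.PP
.nf
```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Number of bits in one unsigned long word of the mask. */
#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: words needed to hold maxnode bits. */
size_t nodemask_words(unsigned long maxnode)
{
    return (maxnode + BITS_PER_ULONG - 1) / BITS_PER_ULONG;
}

/* Hypothetical helper: mark node as a member of the mask. */
void nodemask_set(unsigned long *mask, unsigned long maxnode,
                  unsigned long node)
{
    assert(node < maxnode);   /* bits at or above maxnode are not used */
    mask[node / BITS_PER_ULONG] |= 1UL << (node % BITS_PER_ULONG);
}
```
.fi
.PP
One plausible choice for
.I maxnode
is the capacity of the array in bits, so that every word handed to the
kernel is fully initialized.
.PP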
Where a
.I nodemask
is required, it must contain at least one node that is on-line,
allowed by the process's current cpuset context
(unless the
.B MPOL_F_STATIC_NODES
mode flag is specified),
and contains memory.
If the
.B MPOL_F_STATIC_NODES
flag is set in
.I mode
and a required
.I nodemask
contains no nodes that are allowed by the process's current cpuset context,
the memory policy reverts to
.IR "local allocation" .
This effectively overrides the specified policy until the process's
cpuset context includes one or more of the nodes specified by
.IR nodemask .
.PP
The
.B MPOL_DEFAULT
mode specifies that any nondefault process memory policy be removed,
so that the memory policy "falls back" to the system default policy.
The system default policy is "local allocation",
that is, memory is allocated on the node of the CPU that triggered the allocation.
.I nodemask
must be specified as NULL.
If the "local node" contains no free memory, the system will
attempt to allocate memory from a nearby node.
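.PP
A minimal sketch of returning to the system default policy follows.
.IR reset_mempolicy ()
is a hypothetical helper.
The sketch assumes that the raw system call is invoked through
.BR syscall (2)
so that it builds without libnuma, and that
.I <linux/mempolicy.h>
provides the mode constants; with libnuma installed, the wrapper shown in
the SYNOPSIS takes the same arguments.
.PP
.nf
```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/mempolicy.h>

/* Hypothetical helper: drop any nondefault process policy.
 * Returns 0 on success or a negated errno value on failure. */
int reset_mempolicy(void)
{
    /* MPOL_DEFAULT takes no nodes: NULL nodemask, maxnode of zero. */
    if (syscall(SYS_set_mempolicy, MPOL_DEFAULT, NULL, 0UL) == -1)
        return -errno;
    return 0;
}
```
.fi
.PP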
The
.B MPOL_BIND
mode defines a strict policy that restricts memory allocation to the
nodes specified in
.IR nodemask .
If
.I nodemask
specifies more than one node, page allocations will come from
the node with the lowest numeric node ID first, until that node
contains no free memory.
Allocations will then come from the node with the next highest
node ID specified in
.I nodemask
and so forth, until none of the specified nodes contain free memory.
Pages will not be allocated from any node not specified in the
.IR nodemask .
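.PP
The following sketch restricts the caller to a set of nodes.
.IR bind_to_nodes ()
is a hypothetical helper, limited for brevity to node IDs that fit in a
single
.I unsigned long
word.
It assumes the raw system call is reached through
.BR syscall (2)
and that
.I <linux/mempolicy.h>
supplies
.BR MPOL_BIND ;
the libnuma wrapper from the SYNOPSIS could be used in the same way.
.PP
.nf
```c
#define _GNU_SOURCE
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/mempolicy.h>

/* Hypothetical helper: restrict future page allocations of the caller
 * to the listed nodes. Returns 0 or a negated errno value. */
int bind_to_nodes(const unsigned int *nodes, size_t count)
{
    unsigned long mask = 0;
    unsigned long maxnode = sizeof(mask) * CHAR_BIT;
    size_t i;

    for (i = 0; i < count; i++) {
        if (nodes[i] >= maxnode)
            return -EINVAL;   /* does not fit this single-word mask */
        mask |= 1UL << nodes[i];
    }
    if (syscall(SYS_set_mempolicy, MPOL_BIND, &mask, maxnode) == -1)
        return -errno;
    return 0;
}
```
.fi
.PP
A failure here may simply mean that none of the requested nodes is
on-line or allowed by the current cpuset, so callers would normally
inspect the returned value rather than assume success.
.PP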
.B MPOL_INTERLEAVE
interleaves page allocations across the nodes specified in
.I nodemask
in numeric node ID order.
This optimizes for bandwidth instead of latency
by spreading out pages and memory accesses to those pages across
multiple nodes.
However, accesses to a single page will still be limited to
the memory bandwidth of a single node.
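.PP
The ordering described above can be pictured with a small model.
This is a conceptual sketch only, not the kernel's implementation:
.IR interleave_node ()
is a hypothetical function that assumes a single-word mask and a plain
round-robin walk over the set bits in ascending node ID order.
.PP
.nf
```c
#include <limits.h>

/* Conceptual model only: pick the node for the k-th interleaved page
 * by walking the set bits of a single-word mask in ascending node ID
 * order, round-robin. Returns -1 if the mask is empty. */
int interleave_node(unsigned long mask, unsigned long k)
{
    int nbits = (int)(sizeof(mask) * CHAR_BIT);
    unsigned long members = 0;
    int node;

    for (node = 0; node < nbits; node++)
        if (mask & (1UL << node))
            members++;
    if (members == 0)
        return -1;

    k %= members;             /* position within the node set */
    for (node = 0; node < nbits; node++) {
        if (!(mask & (1UL << node)))
            continue;
        if (k == 0)
            return node;
        k--;
    }
    return -1;                /* not reached */
}
```
.fi
.PP
With nodes 1 and 3 in the mask, this model places successive pages on
nodes 1, 3, 1, 3 and so on.
.PP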
.\" NOTE: the following sentence doesn't make sense in the context
.\" of set_mempolicy() -- no memory area specified.
.\" To be effective the memory area should be fairly large,
.\" at least 1MB or bigger.
.B MPOL_PREFERRED
sets the preferred node for allocation.
The kernel will try to allocate pages from this node first
and fall back to nearby nodes if the preferred node is low on free
memory.
If
.I nodemask
specifies more than one node ID, the first node in the
mask will be selected as the preferred node.
If the
.I nodemask
and
.I maxnode
arguments specify the empty set, then the policy
specifies "local allocation"
(like the system default policy discussed above).
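.PP
Both forms of
.B MPOL_PREFERRED
are sketched below.
.IR prefer_node ()
is a hypothetical helper in which a negative argument stands for the
empty node set; that convention is an assumption of the sketch, not part
of the interface.
The raw system call is reached through
.BR syscall (2),
with constants assumed to come from
.IR <linux/mempolicy.h> .
.PP
.nf
```c
#define _GNU_SOURCE
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/mempolicy.h>

/* Hypothetical helper: prefer one node, or local allocation when
 * node is negative. Returns 0 or a negated errno value. */
int prefer_node(int node)
{
    unsigned long mask = 0;
    unsigned long maxnode = sizeof(mask) * CHAR_BIT;
    long rc;

    if (node < 0) {
        /* Empty node set: the policy means local allocation. */
        rc = syscall(SYS_set_mempolicy, MPOL_PREFERRED, NULL, 0UL);
    } else if ((unsigned long)node >= maxnode) {
        return -EINVAL;   /* does not fit this single-word mask */
    } else {
        mask = 1UL << node;
        rc = syscall(SYS_set_mempolicy, MPOL_PREFERRED, &mask, maxnode);
    }
    return rc == -1 ? -errno : 0;
}
```
.fi
.PP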
The process memory policy is preserved across an
.BR execve (2),
and is inherited by child processes created using
.BR fork (2)
or
.BR clone (2).
.SH RETURN VALUE
On success,
.BR set_mempolicy ()
returns 0;
on error, \-1 is returned and
.I errno
is set to indicate the error.
.SH ERRORS
.TP
.B EFAULT
Part or all of the memory range specified by
.I nodemask
and
.I maxnode
points outside your accessible address space.
.TP
.B EINVAL
.I mode
is invalid.
Or,
.I mode
is
.B MPOL_DEFAULT
and
.I nodemask
is nonempty,
or
.I mode
is
.B MPOL_BIND
or
.B MPOL_INTERLEAVE
and
.I nodemask
is empty.
Or,
.I maxnode
specifies more than a page worth of bits.
Or,
.I nodemask
specifies one or more node IDs that are
greater than the maximum supported node ID.
Or, none of the node IDs specified by
.I nodemask
are on-line and allowed by the process's current cpuset context,
or none of the specified nodes contain memory.
Or, the
.I mode
argument specified both
.B MPOL_F_STATIC_NODES
and
.BR MPOL_F_RELATIVE_NODES .
.TP
.B ENOMEM
Insufficient kernel memory was available.
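.PP
When reporting a failure, a caller might translate
.I errno
into a short hint.
The sketch below is one way to do so;
.IR mempolicy_error_hint ()
is a hypothetical helper, and its messages merely summarize the list
above.
.PP
.nf
```c
#include <errno.h>

/* Hypothetical helper: map an errno value from set_mempolicy() to a
 * short hint, following the ERRORS list of this page. */
const char *mempolicy_error_hint(int err)
{
    switch (err) {
    case EFAULT:
        return "nodemask points outside the accessible address space";
    case EINVAL:
        return "invalid mode, mode flags, nodemask or maxnode";
    case ENOMEM:
        return "insufficient kernel memory";
    default:
        return "error not listed for set_mempolicy()";
    }
}
```
.fi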
.SH VERSIONS
The
.BR set_mempolicy ()
system call was added to the Linux kernel in version 2.6.7.
.SH CONFORMING TO
This system call is Linux-specific.
.SH NOTES
Process policy is not remembered if the page is swapped out.
When such a page is paged back in, it will use the policy of
the process or memory range that is in effect at the time the
page is allocated.
For information on library support, see
.BR numa (7).
.SH SEE ALSO
.BR get_mempolicy (2),
.BR getcpu (2),
.BR mbind (2),
.BR mmap (2),
.BR numa (3),
.BR cpuset (7),
.BR numa (7),
.BR numactl (8)
.SH COLOPHON
This page is part of release 3.27 of the Linux
.I man-pages
project.
A description of the project,
and information about reporting bugs,
can be found at
http://www.kernel.org/doc/man-pages/.