File: hwlocality_memattrs.3

package info (click to toggle)
hwloc 2.4.1%2Bdfsg-1
  • links: PTS, VCS
  • area: main
  • in suites: bullseye
  • size: 22,032 kB
  • sloc: ansic: 58,129; xml: 12,064; sh: 6,822; makefile: 2,200; javascript: 1,623; perl: 380; cpp: 93; php: 8; sed: 4
file content (208 lines) | stat: -rw-r--r-- 12,497 bytes parent folder | download | duplicates (2)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
.TH "hwlocality_memattrs" 3 "Thu Feb 11 2021" "Version 2.4.1" "Hardware Locality (hwloc)" \" -*- nroff -*-
.ad l
.nh
.SH NAME
hwlocality_memattrs \- Comparing memory node attributes for finding where to allocate on
.SH SYNOPSIS
.br
.PP
.SS "Data Structures"

.in +1c
.ti -1c
.RI "struct \fBhwloc_location\fP"
.br
.in -1c
.SS "Typedefs"

.in +1c
.ti -1c
.RI "typedef unsigned \fBhwloc_memattr_id_t\fP"
.br
.in -1c
.SS "Enumerations"

.in +1c
.ti -1c
.RI "enum \fBhwloc_memattr_id_e\fP { \fBHWLOC_MEMATTR_ID_CAPACITY\fP = 0, \fBHWLOC_MEMATTR_ID_LOCALITY\fP = 1, \fBHWLOC_MEMATTR_ID_BANDWIDTH\fP = 2, \fBHWLOC_MEMATTR_ID_LATENCY\fP = 3 }"
.br
.ti -1c
.RI "enum \fBhwloc_location_type_e\fP { \fBHWLOC_LOCATION_TYPE_CPUSET\fP, \fBHWLOC_LOCATION_TYPE_OBJECT\fP }"
.br
.ti -1c
.RI "enum \fBhwloc_local_numanode_flag_e\fP { \fBHWLOC_LOCAL_NUMANODE_FLAG_LARGER_LOCALITY\fP, \fBHWLOC_LOCAL_NUMANODE_FLAG_SMALLER_LOCALITY\fP, \fBHWLOC_LOCAL_NUMANODE_FLAG_ALL\fP }"
.br
.in -1c
.SS "Functions"

.in +1c
.ti -1c
.RI "int \fBhwloc_memattr_get_by_name\fP (\fBhwloc_topology_t\fP topology, const char *name, \fBhwloc_memattr_id_t\fP *id)"
.br
.ti -1c
.RI "int \fBhwloc_get_local_numanode_objs\fP (\fBhwloc_topology_t\fP topology, struct \fBhwloc_location\fP *location, unsigned *nr, \fBhwloc_obj_t\fP *nodes, unsigned long flags)"
.br
.ti -1c
.RI "int \fBhwloc_memattr_get_value\fP (\fBhwloc_topology_t\fP topology, \fBhwloc_memattr_id_t\fP attribute, \fBhwloc_obj_t\fP target_node, struct \fBhwloc_location\fP *initiator, unsigned long flags, hwloc_uint64_t *value)"
.br
.ti -1c
.RI "int \fBhwloc_memattr_get_best_target\fP (\fBhwloc_topology_t\fP topology, \fBhwloc_memattr_id_t\fP attribute, struct \fBhwloc_location\fP *initiator, unsigned long flags, \fBhwloc_obj_t\fP *best_target, hwloc_uint64_t *value)"
.br
.ti -1c
.RI "int \fBhwloc_memattr_get_best_initiator\fP (\fBhwloc_topology_t\fP topology, \fBhwloc_memattr_id_t\fP attribute, \fBhwloc_obj_t\fP target, unsigned long flags, struct \fBhwloc_location\fP *best_initiator, hwloc_uint64_t *value)"
.br
.in -1c
.SH "Detailed Description"
.PP 
Platforms with heterogeneous memory require ways to decide whether a buffer should be allocated on 'fast' memory (such as HBM), 'normal' memory (DDR) or even 'slow' but large-capacity memory (non-volatile memory)\&. These memory nodes are called 'Targets' while the CPU accessing them is called the 'Initiator'\&. Access performance depends on their locality (NUMA platforms) as well as the intrinsic performance of the targets (heterogeneous platforms)\&.
.PP
The following attributes describe the performance of memory accesses from an Initiator to a memory Target, for instance their latency or bandwidth\&. Initiators performing these memory accesses are usually some PUs or Cores (described as a CPU set)\&. Hence a Core may choose where to allocate a memory buffer by comparing the attributes of different target memory nodes nearby\&.
.PP
There are also some attributes that are system-wide\&. Their value does not depend on a specific initiator performing an access\&. The memory node Capacity is an example of such attribute without initiator\&.
.PP
One way to use this API is to start with a cpuset describing the Cores where a program is bound\&. The best target NUMA node for allocating memory in this program on these Cores may be obtained by passing this cpuset as an initiator to \fBhwloc_memattr_get_best_target()\fP with the relevant memory attribute\&. For instance, if the code is latency limited, use the Latency attribute\&.
.PP
A more flexible approach consists in getting the list of local NUMA nodes by passing this cpuset to \fBhwloc_get_local_numanode_objs()\fP\&. Attribute values for these nodes, if any, may then be obtained with \fBhwloc_memattr_get_value()\fP and manually compared with the desired criteria\&.
.PP
\fBNote\fP
.RS 4
The API also supports specific objects as initiator, but it is currently not used internally by hwloc\&. Users may for instance use it to provide custom performance values for host memory accesses performed by GPUs\&.
.PP
The interface actually also accepts targets that are not NUMA nodes\&. 
.RE
.PP

.SH "Typedef Documentation"
.PP 
.SS "typedef unsigned \fBhwloc_memattr_id_t\fP"

.PP
A memory attribute identifier\&. May be either one of \fBhwloc_memattr_id_e\fP or a new id returned by \fBhwloc_memattr_register()\fP\&. 
.SH "Enumeration Type Documentation"
.PP 
.SS "enum \fBhwloc_local_numanode_flag_e\fP"

.PP
Flags for selecting target NUMA nodes\&. 
.PP
\fBEnumerator\fP
.in +1c
.TP
\fB\fIHWLOC_LOCAL_NUMANODE_FLAG_LARGER_LOCALITY \fP\fP
Select NUMA nodes whose locality is larger than the given cpuset\&. For instance, if a single PU (or its cpuset) is given in \fCinitiator\fP, select all nodes close to the package that contains this PU\&. 
.TP
\fB\fIHWLOC_LOCAL_NUMANODE_FLAG_SMALLER_LOCALITY \fP\fP
Select NUMA nodes whose locality is smaller than the given cpuset\&. For instance, if a package (or its cpuset) is given in \fCinitiator\fP, also select nodes that are attached to only a half of that package\&. 
.TP
\fB\fIHWLOC_LOCAL_NUMANODE_FLAG_ALL \fP\fP
Select all NUMA nodes in the topology\&. The initiator \fCinitiator\fP is ignored\&. 
.SS "enum \fBhwloc_location_type_e\fP"

.PP
Type of location\&. 
.PP
\fBEnumerator\fP
.in +1c
.TP
\fB\fIHWLOC_LOCATION_TYPE_CPUSET \fP\fP
Location is given as a cpuset, in the location cpuset union field\&. 
.TP
\fB\fIHWLOC_LOCATION_TYPE_OBJECT \fP\fP
Location is given as an object, in the location object union field\&. 
.SS "enum \fBhwloc_memattr_id_e\fP"

.PP
Memory node attributes\&. 
.PP
\fBEnumerator\fP
.in +1c
.TP
\fB\fIHWLOC_MEMATTR_ID_CAPACITY \fP\fP
'Capacity'\&. The capacity is returned in bytes (local_memory attribute in objects)\&. Best capacity nodes are nodes with \fBhigher capacity\fP\&.
.PP
No initiator is involved when looking at this attribute\&. The corresponding attribute flags are \fBHWLOC_MEMATTR_FLAG_HIGHER_FIRST\fP\&. 
.TP
\fB\fIHWLOC_MEMATTR_ID_LOCALITY \fP\fP
'Locality'\&. The locality is returned as the number of PUs in that locality (e\&.g\&. the weight of its cpuset)\&. Best locality nodes are nodes with \fBsmaller locality\fP (nodes that are local to very few PUs)\&. Poor locality nodes are nodes with larger locality (nodes that are local to the entire machine)\&.
.PP
No initiator is involved when looking at this attribute\&. The corresponding attribute flags are \fBHWLOC_MEMATTR_FLAG_HIGHER_FIRST\fP\&. 
.TP
\fB\fIHWLOC_MEMATTR_ID_BANDWIDTH \fP\fP
'Bandwidth'\&. The bandwidth is returned in MiB/s, as seen from the given initiator location\&. Best bandwidth nodes are nodes with \fBhigher bandwidth\fP\&. The corresponding attribute flags are \fBHWLOC_MEMATTR_FLAG_HIGHER_FIRST\fP and \fBHWLOC_MEMATTR_FLAG_NEED_INITIATOR\fP\&. 
.TP
\fB\fIHWLOC_MEMATTR_ID_LATENCY \fP\fP
'Latency'\&. The latency is returned as nanoseconds, as seen from the given initiator location\&. Best latency nodes are nodes with \fBsmaller latency\fP\&. The corresponding attribute flags are \fBHWLOC_MEMATTR_FLAG_LOWER_FIRST\fP and \fBHWLOC_MEMATTR_FLAG_NEED_INITIATOR\fP\&. 
.SH "Function Documentation"
.PP 
.SS "int hwloc_get_local_numanode_objs (\fBhwloc_topology_t\fP topology, struct \fBhwloc_location\fP * location, unsigned * nr, \fBhwloc_obj_t\fP * nodes, unsigned long flags)"

.PP
Return an array of local NUMA nodes\&. By default only select the NUMA nodes whose locality is exactly the given \fClocation\fP\&. More nodes may be selected if additional flags are given as a OR'ed set of \fBhwloc_local_numanode_flag_e\fP\&.
.PP
If \fClocation\fP is given as an explicit object, its CPU set is used to find NUMA nodes with the corresponding locality\&. If the object does not have a CPU set (e\&.g\&. I/O object), the CPU parent (where the I/O object is attached) is used\&.
.PP
On input, \fCnr\fP points to the number of nodes that may be stored in the \fCnodes\fP array\&. On output, \fCnr\fP will be changed to the number of stored nodes, or the number of nodes that would have been stored if there were enough room\&.
.PP
\fBNote\fP
.RS 4
Some of these NUMA nodes may not have any memory attribute values and hence not be reported as actual targets in other functions\&.
.PP
The number of NUMA nodes in the topology (obtained by \fBhwloc_bitmap_weight()\fP on the root object nodeset) may be used to allocate the \fCnodes\fP array\&.
.PP
When an object CPU set is given as locality, for instance a Package, and when flags contain both \fBHWLOC_LOCAL_NUMANODE_FLAG_LARGER_LOCALITY\fP and \fBHWLOC_LOCAL_NUMANODE_FLAG_SMALLER_LOCALITY\fP, the returned array corresponds to the nodeset of that object\&. 
.RE
.PP

.SS "int hwloc_memattr_get_best_initiator (\fBhwloc_topology_t\fP topology, \fBhwloc_memattr_id_t\fP attribute, \fBhwloc_obj_t\fP target, unsigned long flags, struct \fBhwloc_location\fP * best_initiator, hwloc_uint64_t * value)"

.PP
Return the best initiator for the given attribute and target NUMA node\&. If the attribute does not relate to a specific initiator (it does not have the flag \fBHWLOC_MEMATTR_FLAG_NEED_INITIATOR\fP), \fC-1\fP is returned and \fCerrno\fP is set to \fCEINVAL\fP\&.
.PP
If \fCvalue\fP is non \fCNULL\fP, the corresponding value is returned there\&.
.PP
If multiple initiators have the same attribute values, only one is returned (and there is no way to clarify how that one is chosen)\&. Applications that want to detect initiators with identical/similar values, or that want to look at values for multiple attributes, should rather get all values using \fBhwloc_memattr_get_value()\fP and manually select the initiator they consider the best\&.
.PP
The returned initiator should not be modified or freed, it belongs to the topology\&.
.PP
\fCflags\fP must be \fC0\fP for now\&.
.PP
If there are no matching initiators, \fC-1\fP is returned with \fCerrno\fP set to \fCENOENT\fP; 
.SS "int hwloc_memattr_get_best_target (\fBhwloc_topology_t\fP topology, \fBhwloc_memattr_id_t\fP attribute, struct \fBhwloc_location\fP * initiator, unsigned long flags, \fBhwloc_obj_t\fP * best_target, hwloc_uint64_t * value)"

.PP
Return the best target NUMA node for the given attribute and initiator\&. If the attribute does not relate to a specific initiator (it does not have the flag \fBHWLOC_MEMATTR_FLAG_NEED_INITIATOR\fP), location \fCinitiator\fP is ignored and may be \fCNULL\fP\&.
.PP
If \fCvalue\fP is non \fCNULL\fP, the corresponding value is returned there\&.
.PP
If multiple targets have the same attribute values, only one is returned (and there is no way to clarify how that one is chosen)\&. Applications that want to detect targets with identical/similar values, or that want to look at values for multiple attributes, should rather get all values using \fBhwloc_memattr_get_value()\fP and manually select the target they consider the best\&.
.PP
\fCflags\fP must be \fC0\fP for now\&.
.PP
If there are no matching targets, \fC-1\fP is returned with \fCerrno\fP set to \fCENOENT\fP;
.PP
\fBNote\fP
.RS 4
The initiator \fCinitiator\fP should be of type \fBHWLOC_LOCATION_TYPE_CPUSET\fP when refering to accesses performed by CPU cores\&. \fBHWLOC_LOCATION_TYPE_OBJECT\fP is currently unused internally by hwloc, but users may for instance use it to provide custom information about host memory accesses performed by GPUs\&. 
.RE
.PP

.SS "int hwloc_memattr_get_by_name (\fBhwloc_topology_t\fP topology, const char * name, \fBhwloc_memattr_id_t\fP * id)"

.PP
Return the identifier of the memory attribute with the given name\&. 
.SS "int hwloc_memattr_get_value (\fBhwloc_topology_t\fP topology, \fBhwloc_memattr_id_t\fP attribute, \fBhwloc_obj_t\fP target_node, struct \fBhwloc_location\fP * initiator, unsigned long flags, hwloc_uint64_t * value)"

.PP
Return an attribute value for a specific target NUMA node\&. If the attribute does not relate to a specific initiator (it does not have the flag \fBHWLOC_MEMATTR_FLAG_NEED_INITIATOR\fP), location \fCinitiator\fP is ignored and may be \fCNULL\fP\&.
.PP
\fCflags\fP must be \fC0\fP for now\&.
.PP
\fBNote\fP
.RS 4
The initiator \fCinitiator\fP should be of type \fBHWLOC_LOCATION_TYPE_CPUSET\fP when refering to accesses performed by CPU cores\&. \fBHWLOC_LOCATION_TYPE_OBJECT\fP is currently unused internally by hwloc, but users may for instance use it to provide custom information about host memory accesses performed by GPUs\&. 
.RE
.PP

.SH "Author"
.PP 
Generated automatically by Doxygen for Hardware Locality (hwloc) from the source code\&.