<!--#include virtual="header.txt"-->
<h1><a name="top">Slurm Plugin API</a></h1>
<h2 id="overview">Overview<a class="slurm_link" href="#overview"></a></h2>
<p>A Slurm plugin is a dynamically linked code object which is loaded explicitly
at run time by the Slurm libraries. A plugin provides a customized implementation
of a well-defined API connected to tasks such as authentication, interconnect
fabric, and task scheduling.</p>
<h2 id="identification">Identification
<a class="slurm_link" href="#identification"></a></h2>
<p>A Slurm plugin identifies itself by a short character string formatted similarly
to a MIME type: <i>&lt;major&gt;/&lt;minor&gt;</i>. The major type identifies
which API the plugin implements. The minor type uniquely distinguishes a plugin
from other plugins that implement that same API, by such means as the intended
platform or the internal algorithm. For example, a plugin to interface to the
Maui scheduler would give its type as "sched/maui". It would implement
the Slurm Scheduler API.</p>
<h2 id="versioning">Versioning
<a class="slurm_link" href="#versioning"></a>
</h2>
<p>Slurm plugin version numbers comprise a major, minor and micro revision number.
If the major and/or minor revision number changes, this indicates major changes
to the Slurm functionality including changes to APIs, command options, and
plugins.
These plugin changes may include new functions and/or function arguments.
If only the micro revision number changes, this is indicative of bug fixes
and possibly minor enhancements which should not adversely impact users.
In all cases, rebuilding and installing all Slurm plugins is recommended
at upgrade time.
Not all compute nodes in a cluster need be updated at the same time, but
all Slurm APIs, commands, plugins, etc. on a compute node should represent
the same version of Slurm.</p>
<h2 id="data_objects">Data Objects
<a class="slurm_link" href="#data_objects"></a></h2>
<p>A plugin must define and export the following symbols:</p>
<ul>
<li><span class="commandline">char plugin_type[]</span><br>
A unique, short, formatted string to identify the plugin's purpose as
described above. A "null" plugin (i.e., one that implements the desired
API as stubs) should have a minor type of "none".</li>
<li><span class="commandline">char plugin_name[]</span><br>
A free-form string that identifies the plugin in human-readable terms,
such as "Kerberos authentication." Slurm will use this string to identify
the plugin to end users.</li>
<li><span class="commandline">const uint32_t plugin_version</span><br>
Identifies the version of Slurm used to build this plugin.
Any attempt to load the plugin from a different version of Slurm will result
in an error.
The micro version is not considered for <a href="spank.html">SPANK</a> plugins.
</li></ul>
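<p>As an illustration, the three symbols could be declared as in the minimal
sketch below. The plugin type and name are hypothetical, and the
<span class="commandline">SLURM_VERSION_NUMBER</span> placeholder is an
assumption made only to keep the sketch self-contained; a real plugin would
normally take that value from the Slurm headers it is built against.</p>

```c
#include <stdint.h>

/*
 * Assumption: a real build gets SLURM_VERSION_NUMBER from the Slurm
 * headers. The zero placeholder below is NOT a valid version; it only
 * lets this sketch compile on its own.
 */
#ifndef SLURM_VERSION_NUMBER
#define SLURM_VERSION_NUMBER 0x000000u
#endif

/* Hypothetical "null" plugin: major type "sched", minor type "none". */
const char plugin_type[] = "sched/none";

/* Free-form, human-readable description shown to end users. */
const char plugin_name[] = "Example null scheduler plugin";

/* Version of Slurm this plugin was built for. */
const uint32_t plugin_version = SLURM_VERSION_NUMBER;
```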
<h2 id="api">API Functions in All Plugins
<a class="slurm_link" href="#api"></a>
</h2>
<p class="commandline">int init (void);</p>
<p style="margin-left:.2in"><b>Description</b>: If present, this function is called
just after the plugin is loaded. This allows the plugin to perform any global
initialization prior to any actual API calls.</p>
<p style="margin-left:.2in"><b>Arguments</b>: None.</p>
<p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if the plugin's initialization
was successful. Any other return value indicates to Slurm that the plugin should
be unloaded and not used.</p>
<p class="commandline">void fini (void);</p>
<p style="margin-left:.2in"><b>Description</b>: If present, this function is called
just before the plugin is unloaded. This allows the plugin to do any finalization
after the last plugin-specific API call is made.</p>
<p style="margin-left:.2in"><b>Arguments</b>: None.</p>
<p style="margin-left:.2in"><b>Returns</b>: None.</p>
<p><b>Note</b>: These init and fini functions are not the same as those
described in the <span class="commandline">dlopen (3)</span> system library.
The C run-time system co-opts those symbols for its own initialization.
The system <span class="commandline">_init()</span> is called before the Slurm
<span class="commandline">init()</span>, and the Slurm
<span class="commandline">fini()</span> is called before the system's
<span class="commandline">_fini()</span>.</p>
<p>The functions need not appear. The plugin may provide either
<span class="commandline">init()</span> or <span class="commandline">fini()</span> or both.</p>
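<p>A minimal sketch of the two functions follows. The
<span class="commandline">scratch_buf</span> state is hypothetical, and the
<span class="commandline">SLURM_SUCCESS</span> and
<span class="commandline">SLURM_ERROR</span> stand-ins are assumptions that
replace the definitions a real plugin would get from the Slurm headers.</p>

```c
#include <stdlib.h>

/* Assumption: stand-ins for the values defined in the Slurm headers. */
#ifndef SLURM_SUCCESS
#define SLURM_SUCCESS 0
#define SLURM_ERROR  -1
#endif

/* Hypothetical plugin-global state set up once per load. */
static char *scratch_buf = NULL;

/* Called just after the plugin is loaded. */
int init(void)
{
	scratch_buf = calloc(1, 4096);
	if (!scratch_buf)
		return SLURM_ERROR; /* non-success: plugin is unloaded */
	return SLURM_SUCCESS;
}

/* Called just before the plugin is unloaded. */
void fini(void)
{
	free(scratch_buf);
	scratch_buf = NULL;
}
```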
<h2 id="thread_safety">Thread Safety
<a class="slurm_link" href="#thread_safety"></a>
</h2>
<p>Slurm is a multithreaded application. The Slurm plugin library may exercise
the plugin functions in a re-entrant fashion. It is the responsibility of the
plugin author to provide the necessary mutual exclusion and synchronization
in order to avoid the pitfalls of re-entrant code.</p>
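<p>One common way to provide that mutual exclusion is a plugin-private mutex
around shared state. The sketch below assumes POSIX threads; the function name
and the counter are hypothetical and do not belong to any Slurm API.</p>

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical shared state guarded by a plugin-private mutex. */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static uint64_t call_count = 0;

/*
 * Hypothetical plugin function that may be entered by several threads
 * at once. All access to call_count happens while holding state_lock.
 */
uint64_t example_record_call(void)
{
	uint64_t seen;

	pthread_mutex_lock(&state_lock);
	seen = ++call_count;
	pthread_mutex_unlock(&state_lock);

	return seen;
}
```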
<h2 id="run_time">Run-time Support
<a class="slurm_link" href="#run_time"></a>
</h2>
<p>The standard system libraries are available to the plugin. The Slurm libraries
are also available and plugin authors are encouraged to make use of them rather
than develop their own substitutes. Plugins should use the Slurm log to print
error messages.</p>
<p>The plugin author is responsible for specifying any specific non-standard libraries
needed for correct operation. Plugins will not load if their dependent libraries
are not available, so it is the installer's job to make sure the specified libraries
are available.</p>
<h2 id="performance">Performance
<a class="slurm_link" href="#performance"></a>
</h2>
<p>All plugin functions are expected to execute very quickly. If any function
entails delays (e.g. transactions with other systems), it should be written to
utilize a thread for that functionality. This thread may be created by the
<span class="commandline">init()</span> function and deleted by the
<span class="commandline">fini()</span> function. See <b>plugins/sched/backfill</b>
for an example of how to do this.</p>
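<p>The pattern can be sketched as follows, assuming POSIX threads. This is
not the code used by the backfill plugin; the worker body,
<span class="commandline">do_slow_work()</span>, and the one-second interval
are hypothetical, and the status macros are stand-ins for the Slurm
definitions.</p>

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Assumption: stand-ins for the values defined in the Slurm headers. */
#ifndef SLURM_SUCCESS
#define SLURM_SUCCESS 0
#define SLURM_ERROR  -1
#endif

static pthread_t worker_tid;
static pthread_mutex_t worker_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t worker_cond = PTHREAD_COND_INITIALIZER;
static bool worker_stop = false;
static bool worker_running = false;

/* Hypothetical slow operation, e.g. a transaction with another system. */
static void do_slow_work(void)
{
}

static void *worker(void *arg)
{
	(void) arg;

	pthread_mutex_lock(&worker_lock);
	while (!worker_stop) {
		struct timespec wake;

		/* Run the slow operation without holding the lock. */
		pthread_mutex_unlock(&worker_lock);
		do_slow_work();
		pthread_mutex_lock(&worker_lock);

		/* Sleep up to one second, or until fini() asks us to stop. */
		if (!worker_stop) {
			clock_gettime(CLOCK_REALTIME, &wake);
			wake.tv_sec += 1;
			pthread_cond_timedwait(&worker_cond, &worker_lock,
					       &wake);
		}
	}
	pthread_mutex_unlock(&worker_lock);
	return NULL;
}

/* Start the background thread when the plugin is loaded. */
int init(void)
{
	worker_stop = false;
	if (pthread_create(&worker_tid, NULL, worker, NULL) != 0)
		return SLURM_ERROR;
	worker_running = true;
	return SLURM_SUCCESS;
}

/* Ask the thread to stop and wait for it before the plugin is unloaded. */
void fini(void)
{
	if (!worker_running)
		return;
	pthread_mutex_lock(&worker_lock);
	worker_stop = true;
	pthread_cond_signal(&worker_cond);
	pthread_mutex_unlock(&worker_lock);
	pthread_join(worker_tid, NULL);
	worker_running = false;
}
```

<p>Joining the thread in <span class="commandline">fini()</span> ensures that
no plugin code is still running when the plugin is unloaded.</p>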
<h2 id="structure">Data Structure Consistency
<a class="slurm_link" href="#structure"></a>
</h2>
<p>
In certain situations Slurm iterates over the elements of a data structure
using a counter, for example when walking an environment variable array.
To avoid buffer overflows and other undesired behavior, a plugin that
modifies such elements must also update the associated counters.
Other situations may require other types of changes.
</p>
<p>
The following advice indicates which structures have arrays with associated
counters that must be maintained when modifying data, plus other important
information to take into consideration when manipulating these structures.
This list is not exhaustive because the code changes constantly,
but it is a starting point and a basic guideline for the most common
situations.
Complete structure information can be seen in the <i>slurm/slurm.h.in</i>
file.
</p>
<h3 id="slurm_job_info_t">slurm_job_info_t (job_info_t) Data Structure
<a class="slurm_link" href="#slurm_job_info_t"></a>
</h3>
<pre>
uint32_t env_size;
char **environment;
uint32_t spank_job_env_size;
char **spank_job_env;
uint32_t gres_detail_cnt;
char **gres_detail_str;
</pre>
<p>
These pairs of array pointers and element counters must be kept in sync in
order to avoid subsequent buffer overflows: if you update an array, you must
also update the related counter.
</p>
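<p>
A minimal sketch of keeping such a pair in sync is shown below. The
<span class="commandline">struct env_pair</span> type and the helper are
hypothetical stand-ins, not part of Slurm. The sketch uses the standard C
allocator and keeps a trailing NULL entry as a defensive assumption; a real
plugin should use whatever allocator and termination convention the owning
Slurm code expects.
</p>

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an array/counter pair such as
 * environment/env_size. */
struct env_pair {
	uint32_t env_size;
	char **environment;
};

/*
 * Append a copy of "entry" ("NAME=value") and update the counter in the
 * same step, so the array and its count never disagree.
 * Returns 0 on success, -1 on allocation failure (pair left unchanged).
 */
static int env_pair_append(struct env_pair *p, const char *entry)
{
	size_t len = strlen(entry) + 1;
	char *copy = malloc(len);
	char **grown;

	if (!copy)
		return -1;
	memcpy(copy, entry, len);

	/* One slot for the new entry, one for the trailing NULL. */
	grown = realloc(p->environment,
			((size_t) p->env_size + 2) * sizeof(char *));
	if (!grown) {
		free(copy);
		return -1;
	}
	grown[p->env_size] = copy;
	grown[p->env_size + 1] = NULL;
	p->environment = grown;
	p->env_size++;
	return 0;
}
```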
<pre>
char *nodes;
int32_t *node_inx;
int32_t *req_node_inx;
char *req_nodes;
</pre>
<p>
<i>node_inx</i> and <i>req_node_inx</i> represent lists of index pairs for
ranges of nodes defined in the <i>nodes</i> and <i>req_nodes</i> fields
respectively. In each case, the index array and the corresponding node string
must describe the same number of nodes.
</p>
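<p>
The sketch below counts the nodes described by such an index-pair array so
the result can be compared with the node string. It assumes that each pair is
an inclusive <i>start, end</i> range and that the array ends with a -1
sentinel; verify both assumptions against <i>slurm/slurm.h.in</i> for your
version. The helper name is hypothetical.
</p>

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Count the nodes described by an index-pair array such as node_inx.
 * Assumptions: pairs are inclusive [start, end] ranges and the array is
 * terminated by a negative value (-1).
 */
static uint32_t index_pairs_node_count(const int32_t *inx)
{
	uint32_t total = 0;

	if (!inx)
		return 0;
	for (size_t i = 0; inx[i] >= 0; i += 2) {
		/* Stop on a malformed pair (missing or reversed end). */
		if (inx[i + 1] < inx[i])
			break;
		total += (uint32_t) (inx[i + 1] - inx[i] + 1);
	}
	return total;
}
```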
<pre>
uint32_t het_job_id;
char *het_job_id_set;
</pre>
<p>
The <i>het_job_id</i> field should be the first element of the
<i>het_job_id_set</i> array.
</p>
<h3 id="job_step_info_t">job_step_info_t Data Structure
<a class="slurm_link" href="#job_step_info_t"></a>
</h3>
<pre>
char *nodes;
int32_t *node_inx;
</pre>
<p>
<i>node_inx</i> represents a list of index pairs for ranges of nodes defined in
<i>nodes</i>. Both variables must match the node count.
</p>
<h3 id="priority_factors_object_t">priority_factors_object_t Data Structure
<a class="slurm_link" href="#priority_factors_object_t"></a>
</h3>
<pre>
uint32_t tres_cnt;
char **tres_names;
double *tres_weights;
</pre>
<p>
The <i>tres_cnt</i> value must match the number of TRES configured on the system; otherwise
iteration over the <i>tres_names</i> or <i>tres_weights</i> arrays can cause
buffer overflows.
</p>
<h3 id="job_step_pids_t">job_step_pids_t Data Structure
<a class="slurm_link" href="#job_step_pids_t"></a>
</h3>
<pre>
uint32_t pid_cnt;
uint32_t *pid;
</pre>
<p>
Array <i>pid</i> represents the list of Process IDs for the job step, and
<i>pid_cnt</i> is the counter that must match the size of the array.
</p>
<h3 id="slurm_step_layout_t">slurm_step_layout_t Data Structure
<a class="slurm_link" href="#slurm_step_layout_t"></a>
</h3>
<pre>
uint32_t node_cnt;
char *node_list;
</pre>
<p>
The number of nodes named in <i>node_list</i> must match <i>node_cnt</i>.
</p>
<pre>
uint16_t *tasks;
uint32_t node_cnt;
uint32_t task_cnt;
</pre>
<p>
In the <i>tasks</i> array, each element is the number of tasks assigned
to the corresponding node, so its size must match <i>node_cnt</i>. Moreover,
<i>task_cnt</i> represents the sum of tasks registered in <i>tasks</i>.
</p>
<pre>
uint32_t **tids;
</pre>
<p>
<i>tids</i> is an array of length <i>node_cnt</i> of task ID arrays. Each
subarray is designated by the corresponding value in the <i>tasks</i> array,
so <i>tasks</i>, <i>tids</i> and <i>task_cnt</i> must be set to match this
layout.
</p>
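<p>
These relationships can be expressed as a consistency check, sketched below.
The <span class="commandline">struct layout_view</span> type is a hypothetical
stand-in that copies only the fields discussed here; it is not the Slurm
structure itself.
</p>

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields of slurm_step_layout_t above. */
struct layout_view {
	uint32_t node_cnt;
	uint32_t task_cnt;
	uint16_t *tasks; /* tasks[i]: number of tasks on node i */
	uint32_t **tids; /* tids[i]: task IDs on node i, tasks[i] entries */
};

/*
 * Return true if every node with tasks has a task ID array and the
 * per-node counts add up to task_cnt.
 */
static bool layout_is_consistent(const struct layout_view *l)
{
	uint64_t sum = 0;

	for (uint32_t i = 0; i < l->node_cnt; i++) {
		if (l->tasks[i] > 0 && !l->tids[i])
			return false;
		sum += l->tasks[i];
	}
	return sum == l->task_cnt;
}
```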
<h3 id="slurm_step_launch_params_t">slurm_step_launch_params_t Data Structure
<a class="slurm_link" href="#slurm_step_launch_params_t"></a>
</h3>
<pre>
uint32_t envc;
char **env;
</pre>
<p>
When modifying the environment variables in the <i>env</i> array, you must
also modify the <i>envc</i> counter accordingly to prevent buffer overflows
in subsequent loops over that array.
</p>
<pre>
uint32_t het_job_nnodes;
uint32_t het_job_ntasks;
uint16_t *het_job_task_cnts;
uint32_t **het_job_tids;
uint32_t *het_job_node_list;
</pre>
<p>
These <i>het_job_*</i> variables must match the current heterogeneous
job configuration.
<br>
For example, if for whatever reason you are reducing the number of tasks for
a node in a heterogeneous job, you should at least remove that task ID from
<i>het_job_tids</i>, decrement <i>het_job_ntasks</i> and
<i>het_job_task_cnts</i>, and possibly decrement the number of nodes of the
heterogeneous job in <i>het_job_nnodes</i> and <i>het_job_node_list</i>.
</p>
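<p>
The task-removal case described above might look like the following sketch.
The <span class="commandline">struct het_view</span> type and the helper are
hypothetical stand-ins for illustration only, and the sketch does not cover
shrinking the node count or node list.
</p>

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for some het_job_* fields listed above. */
struct het_view {
	uint32_t het_job_nnodes;
	uint32_t het_job_ntasks;
	uint16_t *het_job_task_cnts; /* per-node task counts */
	uint32_t **het_job_tids;     /* per-node task ID arrays */
};

/*
 * Remove task "tid" from node "node", keeping the per-node count and
 * the total task count in step with the task ID array.
 * Returns true if the task was found and removed.
 */
static bool het_remove_task(struct het_view *h, uint32_t node, uint32_t tid)
{
	uint16_t cnt;
	uint32_t *ids;

	if (node >= h->het_job_nnodes)
		return false;
	cnt = h->het_job_task_cnts[node];
	ids = h->het_job_tids[node];

	for (uint16_t i = 0; i < cnt; i++) {
		if (ids[i] != tid)
			continue;
		/* Close the gap left by the removed task ID. */
		for (uint16_t j = i; j + 1 < cnt; j++)
			ids[j] = ids[j + 1];
		h->het_job_task_cnts[node]--;
		h->het_job_ntasks--;
		return true;
	}
	return false;
}
```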
<pre>
char **spank_job_env;
uint32_t spank_job_env_size;
</pre>
<p>
When modifying the <i>spank_job_env</i> structure, the
<i>spank_job_env_size</i> field must be updated to prevent buffer overflows
in subsequent loops over that array.
</p>
<h3 id="node_info_t">node_info_t Data Structure
<a class="slurm_link" href="#node_info_t"></a>
</h3>
<pre>
char *features;
char *features_act;
</pre>
<p>
In a system containing Intel KNL processors the <i>features_act</i> field is
set by the plugin to match the currently running modes on the node. On other
systems the <i>features_act</i> is not usually used.
If you program such a plugin you must ensure that <i>features_act</i> contains
a subset of <i>features</i>.
</p>
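<p>
The subset requirement can be checked with a helper like the sketch below. It
assumes both fields hold comma-separated feature names; the function names are
hypothetical.
</p>

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if name[0..len) appears as a whole comma-separated token in list. */
static bool list_has_token(const char *list, const char *name, size_t len)
{
	while (list && *list) {
		size_t tok = strcspn(list, ",");

		if (tok == len && !strncmp(list, name, len))
			return true;
		list += tok;
		if (*list == ',')
			list++;
	}
	return false;
}

/*
 * True if every name in features_act also appears in features.
 * An empty or NULL features_act is trivially a subset.
 */
static bool features_act_is_subset(const char *features,
				   const char *features_act)
{
	while (features_act && *features_act) {
		size_t tok = strcspn(features_act, ",");

		if (tok > 0 && !list_has_token(features, features_act, tok))
			return false;
		features_act += tok;
		if (*features_act == ',')
			features_act++;
	}
	return true;
}
```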
<pre>
char *reason;
time_t reason_time;
uint32_t reason_uid;
</pre>
<p>
If <i>reason</i> is modified then <i>reason_time</i> and <i>reason_uid</i>
should be updated.
</p>
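<p>
A helper that updates all three fields together avoids leaving them out of
step, as in the sketch below. The
<span class="commandline">struct node_reason</span> type is a hypothetical
stand-in, the standard C allocator is used only for illustration, and
recording the UID of the calling process is an assumption about which user
should be attributed.
</p>

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical stand-in for the reason fields of node_info_t. */
struct node_reason {
	char *reason;
	time_t reason_time;
	uint32_t reason_uid;
};

/*
 * Replace the reason text and refresh the time and UID in one step.
 * Returns 0 on success, -1 on allocation failure (fields unchanged).
 */
static int node_reason_set(struct node_reason *n, const char *text)
{
	size_t len = strlen(text) + 1;
	char *copy = malloc(len);

	if (!copy)
		return -1;
	memcpy(copy, text, len);

	free(n->reason);
	n->reason = copy;
	n->reason_time = time(NULL);
	n->reason_uid = (uint32_t) getuid(); /* assumption: caller's UID */
	return 0;
}
```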
<h3 id="reserve_info_t">reserve_info_t Data Structure
<a class="slurm_link" href="#reserve_info_t"></a>
</h3>
<pre>
int32_t *node_inx;
uint32_t node_cnt;
</pre>
<p>
<i>node_inx</i> represents a list of index pairs for ranges of nodes associated
with the reservation and its count must equal <i>node_cnt</i>.
</p>
<h3 id="partition_info_t">partition_info_t Data Structure
<a class="slurm_link" href="#partition_info_t"></a>
</h3>
<p>
No special advice.
</p>
<h3 id="slurm_step_layout_req_t">slurm_step_layout_req_t Data Structure
<a class="slurm_link" href="#slurm_step_layout_req_t"></a>
</h3>
<p>
No special advice.
</p>
<h3 id="slurm_step_ctx_params_t">slurm_step_ctx_params_t Data Structure
<a class="slurm_link" href="#slurm_step_ctx_params_t"></a>
</h3>
<p>
No special advice.
</p>
<p style="text-align:center;">Last modified 25 August 2022</p>
<!--#include virtual="footer.txt"-->