<!--#include virtual="header.txt"-->

<h1>REST API Details</h1>

<p>Slurm provides a <a href="https://restfulapi.net/">REST API</a> through the
slurmrestd daemon, using <a href="jwt.html">JSON Web Tokens</a> for
authentication. This daemon allows clients to communicate with Slurm over a
REST API, in addition to the command line interface (CLI) and the C API.
</p>

<p>See also:
<ul>
<li><a href="rest_quickstart.html">REST API Quick Start Guide</a>
<ul>
<li><a href="rest_quickstart.html#common_issues">Common Issues</a></li>
</ul></li>
<li><a href="rest_api.html">REST API Methods and Models</a></li>
<li><a href="slurmrestd.html">slurmrestd man page</a></li>
<li><a href="openapi_release_notes.html">OpenAPI Plugin Release Notes</a></li>
<li><a href="rest_clients.html">REST API Client Guide</a></li>
</ul>
</p>

<h2 id="contents">Contents<a class="slurm_link" href="#contents"></a></h2>
<ul>
<li><a href="#stateless">Stateless</a></li>
<li><a href="#run_modes">Run modes</a>
<ul>
<li><a href="#inet">Inet Service Mode</a></li>
<li><a href="#listen">Listening Mode</a></li>
</ul>
</li>
<li><a href="#config">Configuration</a></li>
<li><a href="#plugins">Plugins</a></li>
<li><a href="#high_availability">High Availability</a></li>
<li><a href="#security">Security</a>
<ul>
<li><a href="#jwt">JSON Web Token (JWT) Authentication</a></li>
<li><a href="#local_auth">Local Authentication</a></li>
<li><a href="#auth_proxy">Authenticating Proxy</a></li>
</ul>
</li>
<li><a href="#python-guide">Python Guide</a>
<ul>
<li><a href="#python-setup">Setup</a></li>
<li><a href="#python-usage-overview">Usage Overview</a></li>
<li><a href="#python-job-submission">Job Submission</a></li>
<li><a href="#python-entity-control">Job, Node, and Reservation Control</a></li>
<li><a href="#python-system-management">System Management</a></li>
</ul>
</li>
</ul>

<h2 id="stateless">Stateless<a class="slurm_link" href="#stateless"></a></h2>
<p>Slurmrestd is stateless as it does not cache or save any state between
requests. Each request is handled in a thread and then all of that state is
discarded. Any request to slurmrestd is completely synchronous with the
Slurm controller (slurmctld or slurmdbd) and is only considered complete once
the HTTP response code has been sent to the client. Slurmrestd will hold a
client connection open while processing a request. Slurm database commands are
committed at the end of every request, on the success of all API calls in the
request.</p>
<p>Sites are strongly encouraged to set up a caching proxy between slurmrestd
and clients, so that clients repeating the same queries do not cause
unnecessary load (and lock contention) on the controller.</p>
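<p>The caching idea can be illustrated on the client side with a minimal
sketch. This is not a caching proxy and is not part of Slurm: the
<code>TTLCache</code> class and the <code>fetch</code> callable are hypothetical
names used only for illustration. A response is reused until it is older than a
chosen number of seconds, so repeated identical queries do not reach the
controller.</p>

```python
import time


class TTLCache:
    """Tiny time-based cache: reuse a value until it is older than ttl seconds."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._store = {}  # maps key to (timestamp, value)

    def get(self, key, fetch):
        """Return the cached value for key; call fetch() only if stale or missing."""
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]
        value = fetch()
        self._store[key] = (now, value)
        return value
```

<p>With the generated Python client described later on this page, a call such as
<code>cache.get("partitions", slurm.slurm_v0043_get_partitions)</code> would
query slurmrestd at most once per <code>ttl</code> seconds. A shared proxy
remains preferable, since it also covers clients that do not cooperate.</p>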

<h2 id="run_modes">Run modes<a class="slurm_link" href="#run_modes"></a></h2>
<p>Slurmrestd currently supports two run modes: inet service mode and listening
mode.</p>

<h3 id="inet">Inet Service Mode<a class="slurm_link" href="#inet"></a></h3>
<p>The Slurmrestd daemon acts as an
<a href="https://en.wikipedia.org/wiki/Inetd">
	Inet Service
</a>, treating STDIN and STDOUT as the client. This mode allows clients to use
inetd, xinetd, or systemd socket-activated services and avoids the need to run a
daemon on the host at all times. This mode creates an instance for each client
and does not support reusing the same instance for different clients.</p>
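<p>As a rough sketch of this mode, the following Python code starts one
slurmrestd instance per request and talks to it over STDIN and STDOUT. It
assumes that <code>slurmrestd</code> is in the PATH, that starting it without a
listening address selects inet service mode (check the slurmrestd man page for
your version), and that authentication is already arranged, for example through
the SLURM_JWT environment variable. The function names are hypothetical.</p>

```python
import subprocess


def build_request(path, host="localhost"):
    """Build a minimal HTTP/1.1 GET request as bytes."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Accept: application/json",
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def query_inet(path, binary="slurmrestd"):
    """Start one slurmrestd instance, send the request on STDIN, return raw STDOUT."""
    done = subprocess.run([binary], input=build_request(path),
                          capture_output=True, check=True, timeout=30)
    return done.stdout
```

<p>For example, <code>query_inet("/slurm/v0.0.43/diag")</code> would return the
raw HTTP response, headers included, for the diagnostics endpoint.</p>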

<h3 id="listen">Listening Mode<a class="slurm_link" href="#listen"></a></h3>
<p>The Slurmrestd daemon acts as a full UNIX service and continuously listens
for new TCP connections. Each connection and request are independently
authenticated.</p>

<h2 id="config">Configuration<a class="slurm_link" href="#config"></a></h2>
<p>slurmrestd can be configured with either environment variables or command
line arguments. Please see the <b>doc/man/man8/slurmrestd.8</b> man page and
<a href="rest_quickstart.html#customization">REST API Quick Start Guide</a>
for details.</p>

<h2 id="plugins">Plugins<a class="slurm_link" href="#plugins"></a></h2>
<p>As of Slurm 20.11, the REST API uses plugins for authentication and
generating content. As of Slurm 21.08, the OpenAPI plugins are available
outside of the slurmrestd daemon, and other Slurm commands may provide or
accept the latest version of the OpenAPI formatted output. This functionality
is provided on a per-command basis. Please refer to the
<a href="rest_clients.html#data_parser_lifecycle">Data Parser Lifecycle</a>
documentation for the planned life cycles of versioned endpoints.
These plugins can optionally be listed or selected via command line arguments,
as described in the <a href="slurmrestd.html">slurmrestd</a> documentation.</p>

<h2 id="high_availability">High Availability
<a class="slurm_link" href="#high_availability"></a></h2>

<p>Slurmrestd is agnostic to its deployment in a highly available cluster.
The daemon may be run on multiple nodes but does not provide any coordination
with other instances for load balancing or failover.
If such functionality is desired, a separate load balancer may be deployed.
The load balancer should be able to forward any required authentication
information on to the slurmrestd machines (see <a href="#security">Security</a>
section).</p>

<p>The number of connections allowed by the slurmrestd system(s) should also be
limited so that the slurmctld is not overwhelmed with requests. Pay attention to
the <code>-t &lt;THREAD COUNT&gt;</code> and
<code>--max-connections &lt;count&gt;</code> options to <b>slurmrestd</b>, the
number of nodes deployed, and the specs of the machine running <b>slurmctld</b>.
</p>
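<p>Because slurmrestd instances do not coordinate with each other, any failover
has to happen outside the daemon. The sketch below shows the idea from the
client side only and does not replace a load balancer. The endpoint URLs and
the <code>fetch</code> callable are hypothetical; <code>fetch</code> is assumed
to raise <code>OSError</code> (or a subclass) when an instance cannot be
reached.</p>

```python
def first_available(endpoints, fetch):
    """Try each slurmrestd endpoint in order; return the first successful result."""
    errors = []
    for base_url in endpoints:
        try:
            return fetch(base_url)
        except OSError as exc:  # connection refused, timeout, DNS failure
            errors.append(f"{base_url}: {exc}")
    raise ConnectionError("no slurmrestd endpoint reachable: " + "; ".join(errors))
```

<p>Trying instances strictly in order keeps the sketch short, but it sends all
traffic to the first healthy instance. A real load balancer would also spread
requests across instances and enforce connection limits.</p>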

<h2 id="security">Security<a class="slurm_link" href="#security"></a></h2>
<p>The Slurm REST API is written to provide the necessary functionality for
clients to control Slurm using REST commands. It is <b>not</b> designed to be
directly internet facing. Only unencrypted and uncompressed HTTP communications
are supported. Slurmrestd also has no protection against man-in-the-middle or
replay attacks. Slurmrestd should only be placed in a trusted network where it
communicates with trusted clients.</p>

<p>Any site wishing to expose the Slurm REST API to the internet or outside of
the cluster should at the very least use a proxy to wrap all communications with
TLS v1.3 (or later). You should also add monitoring to reject any client that
repeatedly attempts invalid logins, at either the network perimeter firewall or
the TLS proxy. Any client filtering that can be done via a proxy is
suggested, to keep common internet crawlers from talking to slurmrestd and
wasting system resources or causing higher latency for valid clients.
Sites are recommended to use short-lived JWTs for clients and renew them
often, possibly via a non-Slurm JWT generator, to avoid having to enforce JWT
lifespan limits. It is also suggested that sites use an authenticating proxy
to handle all client authentication against the site's preferred Single Sign
On (SSO) provider instead of Slurm <b>scontrol</b> generated tokens. This will
prevent any unauthenticated client from connecting to slurmrestd.</p>
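<p>When working with short-lived tokens, a client may want to know how long its
current token remains valid before deciding to renew it. The following sketch
reads the standard <code>exp</code> claim from a JWT payload. It assumes the
token carries that claim and deliberately does <b>not</b> verify the signature,
so it is only suitable for renewal scheduling, never for authentication
decisions. The function name is hypothetical.</p>

```python
import base64
import json
import time


def jwt_seconds_remaining(token, now=None):
    """Return seconds until the token's 'exp' claim (signature is not verified)."""
    payload_b64 = token.split(".")[1]
    # base64url payloads omit padding; restore it before decoding
    payload_b64 += "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    current = time.time() if now is None else now
    return claims["exp"] - current
```

<p>A client could request a fresh token once the returned value drops below a
site-chosen threshold, for example 60 seconds.</p>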

<p>The Slurm REST API is an HTTP server, and all of the security precautions
that apply to any web server should be taken. As these precautions are site
specific, it is highly recommended that you work with your site's security
group to ensure all policies are enforced at the proxy before connecting to
slurmrestd.</p>

<p>Slurm tries not to give potential attackers any hints when there are
authentication failures. This results in the client getting this rather terse
message: <code>Authentication failure</code>. When this happens, take a look at
the logs for the relevant Slurm daemon (i.e. <b>slurmdbd</b>, <b>slurmctld</b>,
or <b>slurmd</b>) for information about the actual issue.</p>

<h3 id="jwt">JSON Web Token (JWT) Authentication
<a class="slurm_link" href="#jwt"></a>
</h3>
<p>slurmrestd supports using <a href="jwt.html">JWT to authenticate users</a>
over the REST protocol. The credentials are passed in the following HTTP headers:
<ul>
	<li>User Name Header: X-SLURM-USER-NAME</li>
	<li>JWT Header: X-SLURM-USER-TOKEN</li>
</ul>
SlurmUser or root can provide alternative user names to act as a proxy for the
given user. While using JWT authentication, slurmrestd should be run as a
unique, <b>unprivileged</b> user and group. Slurmrestd should be provided an
invalid SLURM_JWT environment variable at startup to activate JWT authentication.
This allows users to provide their own JWTs when authenticating to
the proxy, and guards against any accidental authorizations.</p>
<p>When using JWT, it is important that <u>AuthAltTypes=auth/jwt</u> be
configured in both your slurm.conf and slurmdbd.conf for slurmrestd.</p>
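<p>A minimal sketch of a client that sets these two headers using only the
Python standard library is shown here. The header names come from the list
above; the function names, the base URL and the timeout are assumptions made
for illustration.</p>

```python
import urllib.request


def slurm_auth_headers(user, token):
    """Headers slurmrestd reads for JWT authentication."""
    return {
        "X-SLURM-USER-NAME": user,
        "X-SLURM-USER-TOKEN": token,
    }


def get_raw(base_url, path, user, token):
    """Send one authenticated GET request and return the raw response body."""
    req = urllib.request.Request(base_url.rstrip("/") + path,
                                 headers=slurm_auth_headers(user, token))
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.read()
```

<p>For example, <code>get_raw("http://localhost:8080", "/slurm/v0.0.43/diag",
"alice", token)</code> would request the diagnostics endpoint as the
hypothetical user <code>alice</code>, with <code>token</code> holding that
user's JWT.</p>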

<h3 id="local_auth">Local Authentication
<a class="slurm_link" href="#local_auth"></a>
</h3>
<p>slurmrestd supports using UNIX domain sockets to have the kernel
authenticate local users. By default, slurmrestd will not start when run as root
or SlurmUser, or when the user's primary group is that of root or SlurmUser.
Slurmrestd must be located in the Munge security domain in order to function
and communicate with Slurm in local authentication mode.
</p>

<h3 id="auth_proxy">Authenticating Proxy
<a class="slurm_link" href="#auth_proxy"></a>
</h3>
<p>There is a wide array of authentication systems that a site could choose
from if <a href="#jwt">JWT authentication</a> alone doesn't meet its
requirements. An authenticating proxy is set up with a JWT assigned to
SlurmUser, which can then be used to proxy for any user on the cluster.
This ability is only allowed for SlurmUser and root; all other
tokens will only work with their locally assigned users.</p>

<p>If using a third-party authenticating proxy, it is expected that it will
provide the correct HTTP headers (<b>X-SLURM-USER-NAME</b> and
<b>X-SLURM-USER-TOKEN</b>) to slurmrestd along with the user's request.</p>

<p>Slurm places no requirements on the authenticating proxy beyond its being
HTTP 1.1 compliant and that it provides the correct HTTP headers to allow
client authentication. Slurm will explicitly trust the HTTP headers provided
and has no way to verify them (beyond the proxy's trusted token
<b>X-SLURM-USER-TOKEN</b>). Any authenticating proxy will need to follow
your site's security policies and ensure that the proxied requests come from
the correct user. These requirements are standard to any authenticated
proxy and are not Slurm specific.</p>
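<p>The header handling expected of an authenticating proxy can be sketched as a
small function. This is an illustration under stated assumptions, not a
production proxy: <code>authenticated_user</code> is assumed to come from the
site's own login system, <code>proxy_token</code> is assumed to be the trusted
token held by the proxy, and the function name is hypothetical.</p>

```python
def proxied_headers(client_headers, authenticated_user, proxy_token):
    """Build the headers an authenticating proxy forwards to slurmrestd.

    Any X-SLURM-* header sent by the client is dropped, so a client cannot
    choose its own identity; the proxy then sets both headers itself.
    """
    forwarded = {
        name: value
        for name, value in client_headers.items()
        if not name.upper().startswith("X-SLURM-")
    }
    forwarded["X-SLURM-USER-NAME"] = authenticated_user
    forwarded["X-SLURM-USER-TOKEN"] = proxy_token
    return forwarded
```

<p>Dropping client-supplied <code>X-SLURM-*</code> headers matters because
slurmrestd trusts whatever user name arrives alongside the proxy's token.</p>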

<p>A working trivial example can be found in an <a
href="https://gitlab.com/SchedMD/training/docker-scale-out/-/tree/master/proxy">
internal tool</a> used for testing and training. It uses
<a href="https://www.php.net/">PHP</a> and
<a href="https://www.nginx.com/">NGINX</a> to provide the authentication logic.
This example should only be used as a basic starting place as it is not suitable
for deployment in a production environment.</p>

<h2 id="python-guide">Python Guide
<a class="slurm_link" href="#python-guide"></a>
</h2>

<p>
OpenAPI tools can be used to generate a Python client to interact with the REST
API. The examples below are for version 0.0.43 of the API, so there will be some
differences with other versions.
</p>

<h3 id="python-setup">Setup
<a class="slurm_link" href="#python-setup"></a>
</h3>

<ol>
<li>Install <a href="https://openapi-generator.tech/docs/installation/">
openapi-generator-cli</a></li>

<li>Compile the client library:
<pre>
slurmrestd --generate-openapi-spec &gt; openapi.json
openapi-generator-cli generate -i openapi.json -g python -o py_api_client
</pre>
</li>

<li>(Optional, though recommended) Initialize and activate a Python virtual
environment.</li>

<li>Install the required packages:
<pre>
cd py_api_client/
pip install -r requirements.txt
</pre>
</li>

<li>Set up the Python script. These initial lines should be used for all
subsequent examples, and assume that the 'SLURM_JWT' environment
variable is set to a valid token:
<pre>
import os
import time
from openapi_client import SlurmApi
from openapi_client import SlurmdbApi
from openapi_client import ApiClient as Client
from openapi_client import Configuration as Config

c = Config()
c.host = "http://localhost:8080/"
c.access_token = os.getenv("SLURM_JWT")
if not c.access_token:
	raise KeyError("No SLURM_JWT set")
slurm = SlurmApi(Client(c))
slurmdb = SlurmdbApi(Client(c))

# Location of 'srun' binary + other relevant binaries in your slurm scripts
environment=['PATH=/bin/:/sbin/:/home/slurm/bin/:/home/slurm/sbin/']
curr_dir = '/tmp'
</pre>
</li>
</ol>

<h3 id="python-usage-overview">Usage Overview
<a class="slurm_link" href="#python-usage-overview"></a>
</h3>

<p>
Once set up, you can use the <code>openapi_client</code> module to access
classes and functions corresponding to the models and methods in the REST API.
See below for examples and note the following naming conventions for converting
between the REST API and the Python client:
</p>

<ul>
<li>API model: <code>v0.0.43_job_desc_msg</code>
<br>Corresponding Python class: <code>V0043JobDescMsg</code>
</li>
<li>API method: <code>POST /slurm/v0.0.43/job/submit</code>
<br>Corresponding Python function: <code>slurm_v0043_post_job_submit()</code>
</li>
</ul>
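<p>The naming conventions above can be expressed as two small helper functions.
This sketch is inferred only from the examples on this page. The generated
client takes its names from the OpenAPI specification, so other names may not
follow the same pattern. In particular, dropping path parameters such as
<code>{job_id}</code> is an assumption, and both function names are
hypothetical.</p>

```python
def model_class_name(model):
    """'v0.0.43_job_desc_msg' becomes 'V0043JobDescMsg'."""
    version, *words = model.split("_")
    return version.replace(".", "").upper() + "".join(w.capitalize() for w in words)


def method_function_name(method):
    """'POST /slurm/v0.0.43/job/submit' becomes 'slurm_v0043_post_job_submit'."""
    verb, path = method.split()
    service, version, *rest = path.strip("/").split("/")
    # Path parameters such as {job_id} are assumed not to appear in the name
    rest = [part for part in rest if not part.startswith("{")]
    return "_".join([service, version.replace(".", ""), verb.lower(), *rest])
```

<p>When in doubt, the definitive names are those in the generated
<code>openapi_client</code> package itself.</p>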

<p>
If you encounter any errors, check the common issues on the
<a href="rest_quickstart.html#common_issues">REST Quickstart</a> page.
</p>

<h3 id="python-job-submission">Job Submission
<a class="slurm_link" href="#python-job-submission"></a>
</h3>

<p>
This example shows how to populate a job submit request and
job description message with desired submission parameters. It also
illustrates how to send a POST request to submit the job.
</p>

<pre>
from openapi_client import V0043JobSubmitReq
from openapi_client import V0043JobDescMsg

# Populate a job submit request and job description message with desired parameters
my_job = V0043JobSubmitReq(
	script='#!/bin/bash\nsrun sleep 300',
	job=V0043JobDescMsg(
		name='rest_test',
		partition='gpu',
		tres_per_job='gres:gpu:amd:4',
		time_limit={"set": True, "number": 5},
		required_nodes=["n2", "n4"],
		tasks=5,
		environment=environment,
		current_working_directory=curr_dir,
	)
)

# Send POST request to submit the job
submit_response = slurm.slurm_v0043_post_job_submit(my_job)
</pre>
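<p>Optional numeric fields such as <code>time_limit</code> are passed as a
mapping with <code>set</code> and <code>number</code> keys, as in the example
above. A small helper can keep that pattern in one place. The helper name is
hypothetical, and representing an unset value as <code>{"set": False}</code> is
an assumption that should be checked against the generated model for your API
version.</p>

```python
def number_field(value=None):
    """Build the set/number mapping used for optional numeric fields."""
    if value is None:
        # Assumption: an unset field is expressed with "set" alone
        return {"set": False}
    return {"set": True, "number": int(value)}
```

<p>With this helper, the job description above could use
<code>time_limit=number_field(5)</code>.</p>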

<h3 id="python-entity-control">Job, Node, and Reservation Control
<a class="slurm_link" href="#python-entity-control"></a>
</h3>

<p>
Jobs, nodes, and reservations can be managed through the Python client in
similar ways. Each entity requires its own imports, and each has similar
functions for viewing, modifying, and deleting. The GET functions and some of
the POST/DELETE functions can also be used in the <b>plural</b> form, for
example <code>slurm_v0043_get_jobs()</code>, to affect more than one entity.
The relevant imports and functions are listed below.
</p>

<ul>
<li><b>Job Control</b>
  <ul>
    <li>Imports: <code>V0043JobSubmitReq</code>,
      <code>V0043JobDescMsg</code></li>
    <li>View: <code>slurm_v0043_get_job()</code></li>
    <li>Add (submit): <code>slurm_v0043_post_job_submit()</code></li>
    <li>Modify: <code>slurm_v0043_post_job()</code></li>
    <li>Delete (cancel): <code>slurm_v0043_delete_job()</code></li>
  </ul>
</li>
<li><b>Node Control</b>
  <ul>
    <li>Imports: <code>V0043UpdateNodeMsg</code></li>
    <li>View: <code>slurm_v0043_get_node()</code></li>
    <li>Add (create): <b>N/A</b></li>
    <li>Modify: <code>slurm_v0043_post_node()</code></li>
    <li>Delete: <code>slurm_v0043_delete_node()</code></li>
  </ul>
</li>
<li><b>Reservation Control</b>
  <ul>
    <li>Imports: <code>V0043ReservationDescMsg</code></li>
    <li>View: <code>slurm_v0043_get_reservation()</code></li>
    <li>Add (create): <code>slurm_v0043_post_reservation()</code></li>
    <li>Modify: <code>slurm_v0043_post_reservation()</code></li>
    <li>Delete: <code>slurm_v0043_delete_reservation()</code></li>
  </ul>
</li>
</ul>

<p>
Here is an example for viewing, deleting, adding, and modifying reservations:
</p>

<pre>
from openapi_client import V0043ReservationDescMsg

# GET request to query reservations
resp = slurm.slurm_v0043_get_reservations()

# Examine output of GET request
if "important_jobs" in [resv.name for resv in resp.reservations]:
	resp = slurm.slurm_v0043_delete_reservation("important_jobs")

# POST request to create a reservation with the desired parameters and flags
slurm.slurm_v0043_post_reservation(
	V0043ReservationDescMsg(
		name="important_jobs",
		duration={"set": True, "number": 15},
		node_list=["n4", "n5"],
		start_time={"set": True, "number": int(time.time())},
		users=["slurm"],
		flags=["IGNORE_JOBS", "MAGNETIC", "DAILY"],
	)
)

# POST request to modify the reservation
slurm.slurm_v0043_post_reservation(
	V0043ReservationDescMsg(
		name="important_jobs",
		duration={"set": True, "number": 20},
	)
)
</pre>

<h3 id="python-system-management">System Management
<a class="slurm_link" href="#python-system-management"></a>
</h3>

<p>
A system reconfigure can be initiated with the function
<code>slurm.slurm_v0043_get_reconfigure()</code>. System information can also be
viewed with the following API functions:
</p>

<ul>
<li><code>slurm.slurm_v0043_get_partitions()</code></li>
<li><code>slurm.slurm_v0043_get_diag()</code></li>
<li><code>slurm.slurm_v0043_get_licenses()</code></li>
</ul>

<p>Here is an example for viewing partition info:</p>

<pre>
# GET request to query partitions
resp = slurm.slurm_v0043_get_partitions()

# Examine request output to filter on a specific partition QOS
qos_parts = [part for part in resp.partitions if 'sample' == part.qos.assigned]

# GET request to query partitions with a specific name
defq = slurm.slurm_v0043_get_partition("defq")

# Examine request output to grab the nodes on a partition
configured_nodes = defq.partitions[0].nodes.configured
</pre>

<hr size=4 width="100%">

<!--#include virtual="footer.txt"-->