1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216
|
.TH "job_container.conf" "5" "Slurm Configuration File" "April 2025" "Slurm Configuration File"
.SH "NAME"
job_container.conf \- Slurm configuration file for job_container/tmpfs plugin
.SH "DESCRIPTION"
\fBjob_container.conf\fP is an ASCII file which defines parameters used by
Slurm's job_container/tmpfs plugin. The plugin reads the
job_container.conf file to find out the configuration settings. Based on them it
constructs a private (or optionally shared) filesystem namespace for the job and
mounts a list of directories (defaults to /tmp and /dev/shm) inside it. This
gives the job a private view of these directories. These paths are mounted
inside the location specified by 'BasePath' in the \fBjob_container.conf\fR
file. When the job completes, the private namespace is unmounted and all
files therein are automatically removed.
To make use of this plugin, 'PrologFlags=Contain' must also be present in
your \fBslurm.conf\fR file, as shown:
.nf
JobContainerType=job_container/tmpfs
PrologFlags=Contain
.fi
The file will always be located in the same directory as the \fBslurm.conf\fR.
.LP
If using the \fBjob_container.conf\fR file to define a namespace available to
nodes the first parameter on the line should be \fBNodeName\fR. If configuring a
namespace without specifying nodes, the first parameter on the line
should be \fBBasePath\fR.
.LP
Parameter names are case insensitive.
Any text following a "#" in the configuration file is treated
as a comment through the end of that line.
Changes to the configuration file take effect upon restart of Slurm daemons.
.LP
The following job_container.conf parameters are defined to control the behavior
of the job_container/tmpfs plugin.
.TP
\fBAutoBasePath\fR
This determines if plugin should create the BasePath directory or not. Set it
to 'true' if directory is not pre\-created before slurm startup. If set to true,
the directory is created with permission 0755. Directory is not deleted during
slurm shutdown. If set to 'false' or not specified, plugin would expect
directory to exist. This option can be used on a global or per\-line basis.
This parameter is optional.
.IP
.TP
\fBBasePath\fR
Specify the \fIPATH\fR that the tmpfs plugin should use as a base to mount the
private directories. This path must be readable and writable by the plugin. The
plugin constructs a directory for each job inside this path, which is then used
for mounting. The \fBBasePath\fR gets mounted as 'private' during slurmd start
and remains mounted until shutdown. The first "%h" within the name is replaced
with the hostname on which the \fBslurmd\fR is running. The first "%n" within
the name is replaced with the Slurm node name on which the \fBslurmd\fR is
running. Set \fIPATH\fR to 'none' to disable the tmpfs plugin on node subsets
when there is a global setting.
\fBNOTE\fR: The \fBBasePath\fR must be unique to each node. If BasePath is on a
shared filesystem, you can use "%h" or "%n" to create node-unique directories.
\fBNOTE\fR: The \fBBasePath\fR parameter cannot be set to any of
the paths specified by \fBDirs\fR. Using these directories will cause conflicts
when trying to mount and unmount the private directories for the job.
.IP
.TP
\fBCloneNSScript\fR
Specify fully qualified pathname of an optional initialization script. This
script is run fter the namespace construction of a job. This script will be
provided the SLURM_NS environment variable to allow the script to join the
newly created namespace and do further setup work. This parameter is optional.
.IP
.TP
\fBCloneNSScript_Wait\fR
The number of seconds to wait for the CloneNSScript to complete before
considering the script failed. The default value is 10 seconds.
.IP
.TP
\fBCloneNSEpilog\fR
Specify fully qualified pathname of an optional epilog script. This script runs
just before the namespace is torn down. This script will be provided the
SLURM_NS environment variable to allow the script to join the soon to be
removed namespace and do any cleanup work. This parameter is optional.
.IP
.TP
\fBCloneNSEpilog_Wait\fR
The number of seconds to wait for the CloneNSEpilog to complete before
considering the script failed. The default value is 10 seconds.
.IP
.TP
\fBDirs\fR
A list of mount points separated with commas to create private mounts for.
This parameter is optional and if not specified it defaults to "/tmp,/dev/shm".
\fBNOTE\fR: /dev/shm has special handling, and instead of a bind mount is always
a fresh tmpfs filesystem.
.IP
.TP
\fBEntireStepInNS\fR
Specifying EntireStepInNS=true will pivot all slurmstepd processes (excluding
the external step, which is tasked with creating the namespace) into the
constructed namespace. This will cause issues if certain paths such as
SlurmdSpoolDir are inaccessible. This parameter is optional.
.IP
.TP
\fBInitScript\fR
Specify fully qualified pathname of an optional initialization script. This
script is run before the namespace construction of a job. It can be used to
make the job join additional namespaces prior to the construction of /tmp
namespace or it can be used for any site\-specific setup. This parameter is
optional.
.IP
.TP
\fBNodeName\fR
A NodeName specification can be used to permit one job_container.conf
file to be used for all compute nodes in a cluster by specifying the node(s)
that each line should apply to.
The NodeName specification can use a Slurm hostlist specification as shown in
the example below. This parameter is optional.
.IP
.TP
\fBShared\fR
Specifying Shared=true will propagate new mounts between the job specific
filesystem namespace and the root filesystem namespace, enable using autofs on
the node. This parameter is optional.
.IP
.SH "NOTES"
.LP
If any parameters in job_container.conf are changed while slurm is running, then
slurmd on the respective nodes will need to be
restarted for changes to take effect (scontrol reconfigure is not sufficient).
Additionally this can be disruptive to
jobs already running on the node. So care must be taken to make sure no jobs
are running if any changes to job_container.conf are deployed.
Restarting slurmd is safe and non\-disruptive to running jobs, as long as
job_container.conf is not changed between restarts in which case above point
applies.
.SH "EXAMPLE"
.TP
\fB/etc/slurm/slurm.conf\fR:
These are the entries required in \fBslurm.conf\fR to activate the
job_container/tmpfs plugin.
.IP
.nf
JobContainerType=job_container/tmpfs
PrologFlags=Contain
.fi
.TP
\fB/etc/slurm/job_container.conf\fR:
The first sample file will define 1 basepath for all nodes and it will be
automatically created.
.nf
AutoBasePath=true
BasePath=/var/nvme/storage
.fi
The second sample file will define 2 basepaths.
The first will only be on largemem[1\-2] and it will be automatically created.
The second will only be on gpu[1\-10], will be expected to exist and will run
an initscript before each job.
.nf
NodeName=largemem[1\-2] AutoBasePath=true BasePath=/var/nvme/storage_a
NodeName=gpu[1\-10] BasePath=/var/nvme/storage_b InitScript=/etc/slurm/init.sh
.fi
The third sample file will Define 1 basepath that will be on all nodes,
automatically created, with /tmp and /var/tmp as private mounts.
.nf
AutoBasePath=true
BasePath=/var/nvme/storage Dirs=/tmp,/var/tmp
.fi
.IP
.SH "COPYING"
Copyright (C) 2021 Regents of the University of California
Produced at Lawrence Berkeley National Laboratory
.br
Copyright (C) 2021\-2022 SchedMD LLC.
.LP
This file is part of Slurm, a resource management program.
For details, see <https://slurm.schedmd.com/>.
.LP
Slurm is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2 of the License, or (at your option)
any later version.
.LP
Slurm is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE. See the GNU General Public License for more
details.
.SH "SEE ALSO"
.LP
\fBslurm.conf\fR(5)
|