.TH slurmctld "8" "Slurm Daemon" "August 2022" "Slurm Daemon"

.SH "NAME"
slurmctld \- The central management daemon of Slurm.
.SH "SYNOPSIS"
\fBslurmctld\fR [\fIOPTIONS\fR...]
.SH "DESCRIPTION"
\fBslurmctld\fR is the central management daemon of Slurm. It monitors
all other Slurm daemons and resources, accepts work (jobs), and allocates
resources to those jobs. Given the critical functionality of \fBslurmctld\fR,
there may be a backup server to assume these functions in the event that
the primary server fails.

.SH "OPTIONS"

.TP
\fB\-c\fR
Clear all previous \fBslurmctld\fR state from its last checkpoint.
With this option, all jobs, including both running and queued, and all
node states, will be deleted.  Without this option, previously running
jobs will be preserved along with node \fIState\fR of DOWN, DRAINED
and DRAINING nodes and the associated \fIReason\fR field for those nodes.
NOTE: This option is rarely appropriate in production because all
jobs will be killed.
.IP

.TP
\fB\-d\fR
Run \fBslurmctld\fR in the background.
.IP

.TP
\fB\-D\fR
Run \fBslurmctld\fR in the foreground with logging copied to stdout.
.IP

.TP
\fB\-f <file>\fR
Read configuration from the specified file. See \fBNOTES\fR below.
.IP

.TP
\fB\-h\fR
Help; print a brief summary of command options.
.IP

.TP
\fB\-i\fR
Ignore errors found while reading in state files on startup.
WARNING: Any data that cannot be recovered from the state files is lost
when this option is used.
.IP

.TP
\fB\-L <file>\fR
Write log messages to the specified file.
.IP

.TP
\fB\-n <value>\fR
Set the daemon's nice value to the specified value, typically a negative number.
.IP

.TP
\fB\-r\fR
Recover partial state from last checkpoint: jobs, plus the DOWN/DRAIN
state of nodes and the associated reason information.  No partition state
is recovered.
This is the default action.
.IP

.TP
\fB\-R\fR
Recover full state from last checkpoint: jobs, node, and partition state.
Without this option, previously running jobs will be preserved along
with node \fIState\fR of DOWN, DRAINED and DRAINING nodes and the associated
\fIReason\fR field for those nodes. No other node or partition state will
be preserved.
.IP

.TP
\fB\-s\fR
Change the working directory of \fBslurmctld\fR to the directory of
\fBSlurmctldLogFile\fR if possible, or to \fBStateSaveLocation\fR otherwise.
If both fail, it falls back to /var/tmp.
.IP

.TP
\fB\-v\fR
Verbose operation. Multiple \fB\-v\fR's increase verbosity.
.IP

.TP
\fB\-V\fR
Print version information and exit.
.IP
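.LP
As a minimal sketch of how these options combine, the following invocation
runs the daemon in the foreground with raised verbosity while reading an
alternate configuration file. The path shown is a hypothetical example, not a
default.
.IP
.nf
slurmctld \-D \-vvv \-f /etc/slurm/slurm\-test.conf
.fi
.LP
Because \fB\-D\fR copies logging to stdout, startup problems are visible
directly in the terminal.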

.SH "ENVIRONMENT VARIABLES"
The following environment variables can be used to override settings
compiled into slurmctld.

.TP 20
\fBSLURM_CONF\fR
The location of the Slurm configuration file. This is overridden by
explicitly naming a configuration file on the command line.
.IP

.TP
\fBSLURM_DEBUG_FLAGS\fR
Specify debug flags for the scheduler to use. See DebugFlags in the
\fBslurm.conf\fR(5) man page for a full list of flags. The environment
variable takes precedence over the setting in \fBslurm.conf\fR.
.IP
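.LP
For illustration, both variables can be set for a single foreground run.
The configuration path is a hypothetical example, and Backfill is assumed
here to be one of the DebugFlags values listed in \fBslurm.conf\fR(5).
.IP
.nf
SLURM_CONF=/etc/slurm/slurm\-test.conf SLURM_DEBUG_FLAGS=Backfill slurmctld \-D
.fi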

.SH "CORE FILE LOCATION"
If slurmctld is started with the \fB\-D\fR option then the core file will be
written to the current working directory.
Otherwise, if \fBSlurmctldLogFile\fR is a fully qualified path name (starting
with a slash), the core file will be written to the same directory as the
log file, provided SlurmUser has write permission on the directory.
Otherwise the core file will be written to the \fBStateSaveLocation\fR,
or "/var/tmp/" as a last resort. If SlurmUser lacks write permission on all
of these directories, no core file will be produced.
The command "scontrol abort" can be used to abort the slurmctld daemon and
generate a core file.

.SH "SIGNALS"

.TP
\fBSIGTERM SIGINT\fR
\fBslurmctld\fR will shut down cleanly, saving its current state to the state
save directory.
.IP

.TP
\fBSIGABRT\fR
\fBslurmctld\fR will shut down cleanly, saving its current state, and perform a
core dump.
.IP

.TP
\fBSIGHUP\fR
Reload the Slurm configuration files, similar to 'scontrol reconfigure'.
.IP

.TP
\fBSIGUSR2\fR
Reread the log level from the configuration files, and then reopen the log
file.  This
should be used when setting up \fBlogrotate\fR(8).
.IP

.TP
\fBSIGCHLD SIGUSR1 SIGTSTP SIGXCPU SIGQUIT SIGPIPE SIGALRM\fR
These signals are explicitly ignored.
.IP
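.LP
A \fBlogrotate\fR(8) rule built around \fBSIGUSR2\fR might look like the
following sketch. The log path, the rotation schedule and the use of
\fBpkill\fR(1) are assumptions for illustration; adjust them to match
\fBSlurmctldLogFile\fR and local policy.
.IP
.nf
/var/log/slurm/slurmctld.log {
    weekly
    rotate 4
    missingok
    notifempty
    postrotate
        pkill \-x \-\-signal SIGUSR2 slurmctld
    endscript
}
.fi
.LP
Sending the signal from the postrotate script makes \fBslurmctld\fR reopen
its log file after rotation, so new messages go to the fresh file.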

.SH "NOTES"
It may be useful to experiment with different \fBslurmctld\fR specific
configuration parameters using a distinct configuration file
(e.g. timeouts).  However, this special configuration file will not be
used by the \fBslurmd\fR daemon or the Slurm programs, unless you
specifically tell each of them to use it. If you want to change
communication ports, the location of the temporary file system, or
other parameters used by other Slurm components, change the common
configuration file, \fBslurm.conf\fR.

.SH "COPYING"
Copyright (C) 2002\-2007 The Regents of the University of California.
Copyright (C) 2008\-2010 Lawrence Livermore National Security.
Copyright (C) 2010\-2022 SchedMD LLC.
Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
CODE\-OCEC\-09\-009. All rights reserved.
.LP
This file is part of Slurm, a resource management program.
For details, see <https://slurm.schedmd.com/>.
.LP
Slurm is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2 of the License, or (at your option)
any later version.
.LP
Slurm is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more
details.

.SH "SEE ALSO"
\fBslurm.conf\fR(5), \fBslurmd\fR(8)