*condor_submit_dag*
=====================

Manage and queue jobs within a specified DAG for execution on remote
machines
:index:`condor_submit_dag<single: condor_submit_dag; HTCondor commands>`\ :index:`condor_submit_dag command`

Synopsis
--------

**condor_submit_dag** [**-help | -version** ]

**condor_submit_dag** [**-no_submit** ] [**-verbose** ]
[**-force** ] [**-dagman** *DagmanExecutable*]
[**-maxidle** *NumberOfProcs*] [**-maxjobs** *NumberOfClusters*]
[**-maxpre** *NumberOfPreScripts*] [**-maxpost** *NumberOfPostScripts*]
[**-notification** *value*] [**-r** *schedd_name*]
[**-debug** *level*] [**-usedagdir** ]
[**-outfile_dir** *directory*] [**-config** *ConfigFileName*]
[**-insert_sub_file** *FileName*] [**-append** *Command*]
[**-batch-name** *batch_name*] [**-autorescue** *0|1*]
[**-dorescuefrom** *number*] [**-load_save** *filename*]
[**-allowversionmismatch** ]
[**-no_recurse** ] [**-do_recurse** ] [**-update_submit** ]
[**-import_env** ] [**-include_env** *Variables*] [**-insert_env** *Key=Value*]
[**-DumpRescue** ] [**-valgrind** ] [**-DontAlwaysRunPost** ] [**-AlwaysRunPost** ]
[**-priority** *number*] [**-SubmitMethod** *value*]
[**-schedd-daemon-ad-file** *FileName*]
[**-schedd-address-file** *FileName*] [**-suppress_notification** ]
[**-dont_suppress_notification** ] [**-DoRecovery** ]
*DAGInputFile1* [*DAGInputFile2 ... DAGInputFileN* ]

Description
-----------

*condor_submit_dag* is the program for submitting a DAG (directed
acyclic graph) of jobs for execution under HTCondor. The program
enforces the job dependencies defined in one or more *DAGInputFile*\s.
Each *DAGInputFile* contains commands to direct the submission of jobs
implied by the nodes of a DAG to HTCondor. Extensive documentation is in
the HTCondor User Manual section on DAGMan.
:index:`in DAGs<single: in DAGs; email notification>`
:index:`e-mail in DAGs<single: e-mail in DAGs; notification>`
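
As a rough illustration of what a *DAGInputFile* contains, a four-node
"diamond" DAG might be described as in the sketch below. The node names
and the submit description file names (``A.sub`` and so on) are
hypothetical; see the DAGMan section of the manual for the full command
syntax.

.. code-block:: text

    JOB A A.sub
    JOB B B.sub
    JOB C C.sub
    JOB D D.sub
    PARENT A CHILD B C
    PARENT B C CHILD D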

Some options may be specified on the command line, in the
configuration, or in a node job's submit description file. Precedence is
given to command line options or configuration over settings from a
submit description file. An example is e-mail notifications. When
configuration variable :macro:`DAGMAN_SUPPRESS_NOTIFICATION` is set to its
default value of ``True``, and a node job's submit description file contains

.. code-block:: condor-submit

      notification = Complete

e-mail will not be sent upon completion, as the value of
:macro:`DAGMAN_SUPPRESS_NOTIFICATION` is enforced.
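
Conversely, as a sketch that assumes the node jobs of a hypothetical
``diamond.dag`` set **notification** in their submit description files,
the suppression can be lifted for a single submission from the command
line:

.. code-block:: console

    $ condor_submit_dag -dont_suppress_notification diamond.dag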

Options
-------

 **-help**
    Display usage information and exit.
 **-version**
    Display version information and exit.
 **-no_submit**
    Produce the HTCondor submit description file for DAGMan, but do not
    submit DAGMan as an HTCondor job.
 **-verbose**
    Cause *condor_submit_dag* to give verbose error messages.
 **-force**
    Require *condor_submit_dag* to overwrite the files that it
    produces, if the files already exist. Note that ``dagman.out`` will
    be appended to, not overwritten. If rescue files exist then
    DAGMan will run the original DAG and rename the rescue files.
    Any old-style rescue files will be deleted.
 **-dagman** *DagmanExecutable*
    Allows the specification of an alternate *condor_dagman* executable
    to be used instead of the one found in the user's path. This must be
    a fully qualified path.
 **-maxidle** *NumberOfProcs*
    Sets the maximum number of idle procs allowed before
    *condor_dagman* stops submitting more node jobs. If this option is
    omitted then the number of idle procs is limited by the configuration
    variable :macro:`DAGMAN_MAX_JOBS_IDLE` which defaults to 1000.
    To disable this limit, set *NumberOfProcs* to 0. The *NumberOfProcs*
    limit can be exceeded if a node's job has a queue command with more
    than one proc to queue. For example, ``queue 500`` will submit all
    500 procs even if *NumberOfProcs* is ``250``. In this case DAGMan
    will wait for the number of idle procs to fall below 250 before
    submitting more jobs to the *condor_schedd*.
 **-maxjobs** *NumberOfClusters*
    Sets the maximum number of clusters within the DAG that will be
    submitted to HTCondor at one time. Each cluster is associated with
    one node job no matter how many individual procs are in the cluster.
    *NumberOfClusters* is a non-negative integer. If this option is
    omitted then the number of clusters is limited by the configuration
    variable :macro:`DAGMAN_MAX_JOBS_SUBMITTED` which defaults to 0 (unlimited).
 **-maxpre** *NumberOfPreScripts*
    Sets the maximum number of PRE scripts within the DAG that may be
    running at one time. *NumberOfPreScripts* is a non-negative integer.
    If this option is omitted, the number of PRE scripts is limited by
    the configuration variable :macro:`DAGMAN_MAX_PRE_SCRIPTS`
    which defaults to 20.
 **-maxpost** *NumberOfPostScripts*
    Sets the maximum number of POST scripts within the DAG that may be
    running at one time. *NumberOfPostScripts* is a non-negative
    integer. If this option is omitted, the number of POST scripts is
    limited by the configuration variable :macro:`DAGMAN_MAX_POST_SCRIPTS`
    which defaults to 20.
 **-notification** *value*
    Sets the e-mail notification for DAGMan itself. This information
    will be used within the HTCondor submit description file for DAGMan.
    This file is produced by *condor_submit_dag*. See the description
    of **notification** :index:`notification<single: notification; submit commands>`
    within *condor_submit* manual page for a specification of *value*.
 **-r** *schedd_name*
    Submit *condor_dagman* to a *condor_schedd* on a remote machine.
    It is assumed that any necessary files will be present on the
    remote machine via some method like a shared filesystem between the
    local and remote machines. As with *condor_submit*'s **-remote**
    option, the user also needs the correct permissions to submit
    remotely. If other options are desired, including
    transfer of other input files, consider using the **-no_submit**
    option and modifying the resulting submit file for specific needs
    before using *condor_submit* on the produced DAGMan job submit file.
 **-debug** *level*
    Passes the *level* of debugging output desired to
    *condor_dagman*. *level* is an integer, with values of 0-7
    inclusive, where 7 is the most verbose output. See the
    *condor_dagman* manual page for detailed descriptions of these
    values. If not specified, no **-debug** value is passed to
    *condor_dagman*.
 **-usedagdir**
    This optional argument causes *condor_dagman* to run each specified
    DAG as if *condor_submit_dag* had been run in the directory
    containing that DAG file. This option is most useful when running
    multiple DAGs in a single *condor_dagman*. Note that the
    **-usedagdir** flag must not be used when running an old-style
    Rescue DAG.
 **-outfile_dir** *directory*
    Specifies the directory in which the ``.dagman.out`` file will be
    written. The *directory* may be specified relative to the current
    working directory as *condor_submit_dag* is executed, or specified
    with an absolute path. Without this option, the ``.dagman.out`` file
    is placed in the same directory as the first DAG input file listed
    on the command line.
 **-config** *ConfigFileName*
    Specifies a configuration file to be used for this DAGMan run. This
    configuration will apply to all DAGs submitted via DAGMan. Note
    that only one custom configuration file can be specified for a DAGMan
    workflow, so using this option in conjunction with a DAG that uses
    the ``CONFIG`` command will cause a failure.
 **-insert_sub_file** *FileName*
    Specifies a file to insert into the ``.condor.sub`` file created by
    *condor_submit_dag*. The specified file must contain only legal
    submit file commands. Only one file can be inserted. The specified
    file will override the file set by the configuration variable
    :macro:`DAGMAN_INSERT_SUB_FILE`. The specified file is inserted
    into the ``.condor.sub`` file before the queue command and
    any commands specified with the **-append** option.
 **-append** *Command*
    Specifies a command to append to the ``.condor.sub`` file created by
    *condor_submit_dag*. The specified command is appended to the
    ``.condor.sub`` file immediately before the queue command and after
    any commands added via **-insert_sub_file** or :macro:`DAGMAN_INSERT_SUB_FILE`.
    Multiple commands are specified by using the **-append** option
    multiple times. Commands with spaces in them must be enclosed in
    double quotes.
 **-batch-name** *batch_name*
    Set the batch name for this DAG/workflow. The batch name is
    displayed by *condor_q*. If omitted, DAGMan will set the batch
    name to ``DagFile+ClusterId`` where *DagFile* is the name of
    the primary DAG submitted to DAGMan and *ClusterId* is the DAGMan
    proper job's :ad-attr:`ClusterId`. The batch name is set in all jobs
    submitted by DAGMan and propagated down into sub-DAGs. Note:
    set the batch name to ' ' (space) to avoid overriding batch
    names specified in node job submit files.
 **-autorescue** *0|1*
    Whether to automatically run the newest rescue DAG for the given DAG
    file, if one exists (0 = ``false``, 1 = ``true``).
 **-dorescuefrom** *number*
    Forces *condor_dagman* to run the specified rescue DAG number for
    the given DAG. A value of 0 is the same as not specifying this
    option. Specifying a non-existent rescue DAG is a fatal error.
 **-load_save** *filename*
    Specify a file with saved DAG progress to re-run the DAG from. If
    given a path, DAGMan will attempt to read the file at that path.
    Otherwise, DAGMan will check for the file in the DAG's
    ``save_files`` sub-directory.
 **-allowversionmismatch**
    This optional argument causes *condor_dagman* to allow a version
    mismatch between *condor_dagman* itself and the ``.condor.sub``
    file produced by *condor_submit_dag* (or, in other words, between
    *condor_submit_dag* and *condor_dagman*). WARNING! This option
    should be used only if absolutely necessary. Allowing version
    mismatches can cause subtle problems when running DAGs.
 **-no_recurse**
    This optional argument causes *condor_submit_dag* to not run
    itself recursively on nested DAGs (this is now the default; this
    flag has been kept mainly for backwards compatibility).
 **-do_recurse**
    This optional argument causes *condor_submit_dag* to run itself
    recursively on nested DAGs to pre-produce their ``.condor.sub``
    files. DAG nodes specified with the **SUBDAG EXTERNAL** keyword
    or with submit file names ending in ``.condor.sub`` are considered
    nested DAGs. This flag is useful when the configuration variable
    :macro:`DAGMAN_GENERATE_SUBDAG_SUBMITS` is ``False`` (Not default).
 **-update_submit**
    This optional argument causes an existing ``.condor.sub`` file to
    not be treated as an error; rather, the ``.condor.sub`` file will be
    overwritten, but the existing values of **-maxjobs**, **-maxidle**,
    **-maxpre**, and **-maxpost** will be preserved.
 **-import_env**
    This optional argument causes *condor_submit_dag* to import the
    current environment into the **environment** command of the
    ``.condor.sub`` file it generates.
 **-include_env** *Variables*
     This optional argument takes a comma-separated list of environment
     variables to add to the ``.condor.sub`` file's ``getenv`` environment
     filter, which causes matching environment variables that are found
     to be added to the DAGMan manager job's **environment**.
 **-insert_env** *Key=Value*
     This optional argument takes a delimited string of *Key=Value* pairs
     to explicitly set in the ``.condor.sub`` file's :ad-attr:`Environment` macro.
     The base delimiter is a semicolon, which can be overridden by setting
     the first character in the string to a valid delimiting character.
     If multiple **-insert_env** flags contain the same *Key*, then the last
     occurrence's *Value* will be set in the DAGMan job's **environment**.
 **-DumpRescue**
    This optional argument tells *condor_dagman* to immediately dump a
    rescue DAG and then exit, as opposed to actually running the DAG.
    This feature is mainly intended for testing. The Rescue DAG file is
    produced whether or not there are parse errors reading the original
    DAG input file. The name of the file differs if there was a parse
    error.
 **-valgrind**
    This optional argument causes the submit description file generated
    for the submission of *condor_dagman* to be modified. The
    executable becomes *valgrind* run on *condor_dagman*, with a
    specific set of arguments intended for testing *condor_dagman*.
    Note that this argument is intended for testing purposes only. Using
    the **-valgrind** option without the necessary *valgrind* software
    installed will cause the DAG to fail. If the DAG does run, it will
    run much more slowly than usual.
 **-DontAlwaysRunPost**
    This option causes the submit description file generated for the
    submission of *condor_dagman* to be modified. It causes
    *condor_dagman* to not run the POST script of a node if the PRE
    script fails.
 **-AlwaysRunPost**
    This option causes the submit description file generated for the
    submission of *condor_dagman* to be modified. It causes
    *condor_dagman* to always run the POST script of a node, even if
    the PRE script fails.
 **-priority** *number*
    Sets the minimum job priority of node jobs submitted and running
    under the *condor_dagman* job submitted by this
    *condor_submit_dag* command.
 **-schedd-daemon-ad-file** *FileName*
    Specifies a full path to a daemon ad file dropped by a
    *condor_schedd*. This allows submission to a specific
    scheduler if several are available without repeatedly querying the
    *condor_collector*. The value for this argument defaults to the
    configuration attribute :macro:`SCHEDD_DAEMON_AD_FILE`.
 **-schedd-address-file** *FileName*
    Specifies a full path to an address file dropped by a
    *condor_schedd*. This allows submission to a specific
    scheduler if several are available without repeatedly querying the
    *condor_collector*. The value for this argument defaults to the
    configuration attribute :macro:`SCHEDD_ADDRESS_FILE`.
 **-suppress_notification**
    Causes jobs submitted by *condor_dagman* to not send email
    notification for events. The same effect can be achieved by setting
    configuration variable :macro:`DAGMAN_SUPPRESS_NOTIFICATION` to ``True``. This
    command line option is independent of the **-notification** command
    line option, which controls notification for the *condor_dagman*
    job itself.
 **-dont_suppress_notification**
    Causes jobs submitted by *condor_dagman* to defer to content within
    the submit description file when deciding to send email notification
    for events. The same effect can be achieved by setting configuration
    variable :macro:`DAGMAN_SUPPRESS_NOTIFICATION` to ``False``. This
    command line flag is independent of the **-notification** command
    line option, which controls notification for the *condor_dagman*
    job itself. If both **-dont_suppress_notification** and
    **-suppress_notification** are specified on the same command
    line, the last argument is used.
 **-DoRecovery**
    Causes *condor_dagman* to start in recovery mode. This means that
    DAGMan reads the relevant ``.nodes.log`` file to restore its previous
    state of node completions and failures to continue running.
 **-SubmitMethod** *value*
    This optional argument takes an enumerated value representing the
    method by which *condor_dagman* will submit managed jobs for execution.
    Enumeration values are as follows:

    -  **0** : Run :tool:`condor_submit`
    -  **1** : Directly submit job to local *condor_schedd* queue
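
As an illustration of how **-no_submit** and **-append** can be
combined, the following sketch generates the submit description file
for DAGMan, leaves it available for inspection or editing, and then
submits it by hand. The file name ``diamond.dag`` and the custom
attribute ``WorkflowTag`` are hypothetical, and the generated file name
assumes the usual convention of adding ``.condor.sub`` to the DAG file
name.

.. code-block:: console

    $ condor_submit_dag -no_submit -append "My.WorkflowTag = 7" diamond.dag
    $ condor_submit diamond.dag.condor.sub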

Exit Status
-----------

*condor_submit_dag* will exit with a status value of 0 (zero) upon
success, and it will exit with the value 1 (one) upon failure.

Examples
--------

To run a single DAG:

.. code-block:: console

    $ condor_submit_dag diamond.dag

To run a DAG when it has already been run and the output files exist:

.. code-block:: console

    $ condor_submit_dag -force diamond.dag
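
To rerun a DAG from a particular rescue DAG rather than the newest one,
here rescue DAG number 2 (assumed to exist from an earlier failed run):

.. code-block:: console

    $ condor_submit_dag -dorescuefrom 2 diamond.dag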

To run a DAG, limiting the number of idle node jobs in the DAG to a
maximum of five:

.. code-block:: console

    $ condor_submit_dag -maxidle 5 diamond.dag
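
To run a DAG, additionally limiting the number of node job clusters
submitted at one time to 20 (both limits are illustrative values, not
recommendations):

.. code-block:: console

    $ condor_submit_dag -maxjobs 20 -maxidle 5 diamond.dag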

To run a DAG, limiting the number of concurrent PRE scripts to 10 and
the number of concurrent POST scripts to five:

.. code-block:: console

    $ condor_submit_dag -maxpre 10 -maxpost 5 diamond.dag

To run two DAGs, each of which is set up to run in its own directory:

.. code-block:: console

    $ condor_submit_dag -usedagdir dag1/diamond1.dag dag2/diamond2.dag
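
To run a DAG with a custom batch name and two variables set in the
DAGMan job's environment (the batch name and the variable names are
hypothetical, and the default semicolon delimiter is assumed):

.. code-block:: console

    $ condor_submit_dag -batch-name nightly -insert_env "STAGE=prod;REGION=eu" diamond.dag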