*condor_qsub*
==============

Queue jobs that use PBS/SGE-style submission
:index:`condor_qsub<single: condor_qsub; HTCondor commands>`\ :index:`condor_qsub command`

Synopsis
--------

**condor_qsub** [--**version**]

**condor_qsub** [**Specific options** ] [**Directory options** ]
[**Environmental options** ] [**File options** ] [**Notification
options** ] [**Resource options** ] [**Status options** ]
[**Submission options** ] *commandfile*

Description
-----------

*condor_qsub* submits an HTCondor job. This job is specified in a
PBS/Torque style or an SGE style. *condor_qsub* permits the submission
of dependent jobs without the need to specify the full dependency graph
at submission time. This approach is neither as efficient as
HTCondor's DAGMan nor as functional as SGE's *qsub* or *qalter*.
*condor_qsub* serves as a minimal translator to be able to use software
originally written to interact with PBS, Torque, and SGE in an HTCondor
pool.

*condor_qsub* attempts to behave like *qsub*. Less than half of the
*qsub* functionality is implemented. Option descriptions describe the
differences between the behavior of *qsub* and *condor_qsub*. *qsub*
options not listed here are not supported. Some concepts present in PBS
and SGE do not apply to HTCondor, and so these options are not
implemented.

For a full listing of *qsub* options, please see:

 POSIX
    `http://pubs.opengroup.org/onlinepubs/9699919799/utilities/qsub.html <http://pubs.opengroup.org/onlinepubs/9699919799/utilities/qsub.html>`_
 SGE
    `http://gridscheduler.sourceforge.net/htmlman/htmlman1/qsub.html <http://gridscheduler.sourceforge.net/htmlman/htmlman1/qsub.html>`_
 PBS/Torque
    `http://docs.adaptivecomputing.com/torque/4-1-3/Content/topics/commands/qsub.htm <http://docs.adaptivecomputing.com/torque/4-1-3/Content/topics/commands/qsub.htm>`_

*condor_qsub* accepts either command line options or the single file,
*commandfile*, that contains all of the commands.

*condor_qsub* does the opposite of job submission within the **grid**
universe **batch** grid type, which takes HTCondor jobs submitted with
HTCondor syntax and submits them to PBS, SGE, or LSF.

Options
-------

 **-a** *date_time*
    (Submission option) Specify a deferred execution date and time. The
    PBS/Torque syntax of *date_time* is a string in the form
    *[[[[CC]YY]MM]DD]hhmm[.SS]*. The portions of this string which are
    optional are *CC*, *YY*, *MM*, *DD*, and *SS*. For SGE, *MM* and
    *DD* are not optional. For PBS, *MM* and *DD* are optional.
    *condor_qsub* follows the PBS style.
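
As an illustration of the *date_time* format, the following Python sketch parses a string of the form *[[[[CC]YY]MM]DD]hhmm[.SS]*. It is not the *condor_qsub* implementation. The helper name ``parse_date_time`` is hypothetical, and the sketch assumes that omitted fields are taken from a caller-supplied reference time; it does not roll a time that has already passed forward to the next day or month.

```python
import re
from datetime import datetime

# Hypothetical helper, not part of condor_qsub.
# Up to four optional two-digit fields (CC, YY, MM, DD), then hhmm and
# an optional .SS suffix.
_DATE_TIME = re.compile(r"^(\d{0,8})(\d{2})(\d{2})(?:\.(\d{2}))?$")


def parse_date_time(text, now):
    """Parse [[[[CC]YY]MM]DD]hhmm[.SS], taking omitted fields from `now`."""
    match = _DATE_TIME.match(text)
    if match is None or len(match.group(1)) % 2:
        raise ValueError(f"not a valid date_time: {text!r}")
    prefix, hh, mm, ss = match.groups()
    # Split the optional prefix into two-digit fields; the rightmost is DD.
    fields = [prefix[i:i + 2] for i in range(0, len(prefix), 2)]
    day = int(fields.pop()) if fields else now.day
    month = int(fields.pop()) if fields else now.month
    # Assumption: a missing year or century defaults to that of `now`.
    yy = int(fields.pop()) if fields else now.year % 100
    cc = int(fields.pop()) if fields else now.year // 100
    return datetime(cc * 100 + yy, month, day, int(hh), int(mm), int(ss or 0))
```

With a reference time of 14 March 2025, ``parse_date_time("1530", now)`` yields 15:30 on that same day, while ``"201530"`` selects 15:30 on the 20th of the same month.
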
 **-A** *account_string*
    (Status option) Uses group accounting where the string
    *account_string* is the accounting group associated with this job.
    Unlike SGE, there is no default group of ``"sge"``.
 **-b** *y|n*
    (Submission option) Using the SGE definition of its *-b* option, a
    value of *y* causes *condor_qsub* to not parse the file for
    additional *condor_qsub* commands. The default value is *n*. If the
    command line argument **-f** *filename* is also specified, it
    negates a value of *y*.
 **-condor-keep-files**
    (Specific option) Directs HTCondor to not remove temporary files
    generated by *condor_qsub*, such as HTCondor submit files and
    sentinel jobs. These temporary files may be important for debugging.
 **-cwd**
    (Directory option) Specifies the initial directory in which the job
    will run to be the current directory from which the job was
    submitted. This sets
    :subcom:`initialdir[condor_qsub]` for
    *condor_submit*.
 **-d** *path* or **-wd** *path*
    (Directory option) Specifies the initial directory in which the job
    will run to be *path*. This sets
    :subcom:`initialdir[condor_qsub]` for
    *condor_submit*.
 **-e** *filename*
    (File option) Specifies the *condor_submit* command
    :subcom:`error[condor_qsub]`, the file where
    ``stderr`` is written. If not specified, set to the default name of
    ``<commandfile>.e<ClusterId>``, where ``<commandfile>`` is the
    *condor_qsub* argument, and ``<ClusterId>`` is the job attribute
    :ad-attr:`ClusterId` assigned for the job.
 **-f** *qsub_file*
    (Specific option) Parse *qsub_file* to search for and set
    additional *condor_submit* commands. Within the file, commands will
    appear as ``#PBS`` or ``#SGE``. *condor_qsub* will parse the batch
    file listed as *qsub_file*.
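
The scan for embedded commands can be pictured with the following Python sketch. It is only an illustration under the assumption that each command sits on its own line beginning with ``#PBS`` or ``#SGE``; the function name ``directive_lines`` is hypothetical and this is not the parser used by *condor_qsub*.

```python
# Hypothetical sketch, not condor_qsub's parser.
DIRECTIVE_PREFIXES = ("#PBS", "#SGE")


def directive_lines(script_text):
    """Return the option text of every #PBS or #SGE line, in file order."""
    found = []
    for line in script_text.splitlines():
        stripped = line.strip()
        for prefix in DIRECTIVE_PREFIXES:
            if stripped.startswith(prefix):
                # Keep only what follows the prefix, e.g. "-l mem=1Gb".
                found.append(stripped[len(prefix):].strip())
                break
    return found
```

Lines that do not start with one of the two prefixes, such as the shebang line or the commands of the job itself, are ignored by this sketch.
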
 **-h**
    (Status option) Places the submitted job directly into the hold state.
 **-help**
    (Specific option) Print usage information and exit.
 **-hold_jid** *<jid>*
    (Status option) Submits a job in the hold state. This job is
    released only when a previously submitted job, identified by its
    cluster ID as *<jid>*, exits successfully. Successful completion is
    defined as not exiting with exit code 100. *condor_qsub* implements
    this SGE feature with three jobs. The first job is the
    previously submitted job. The second job is the newly submitted one
    that is waiting for the first to finish successfully. The third job
    is what SGE calls a sentinel job; this is an HTCondor local universe
    job that watches the history for the first job's exit code. This
    third job will exit once it has seen the exit code and, for a
    successful termination of the first job, run *condor_release* on
    the second job. If the first job is an array job, the second job
    will only be released after all individual jobs of the first job
    have completed.
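
The release decision made by the sentinel job can be sketched in Python as follows. This is a simplified illustration, not the actual sentinel code. The function name ``should_release`` and the dictionary layout are hypothetical, and the sketch assumes that, for an array job, every individual job must have finished with an exit code other than 100.

```python
# Hypothetical sketch of the sentinel's decision, not condor_qsub's code.
FAILURE_EXIT_CODE = 100  # only this exit code counts as unsuccessful


def should_release(exit_codes):
    """Decide whether the held job may be released.

    `exit_codes` maps each individual job of the first cluster to its
    exit code, or to None while that job has not completed.
    """
    if any(code is None for code in exit_codes.values()):
        # An array job is finished only when all of its jobs are.
        return False
    # Assumption: a single exit code of 100 blocks the release.
    return all(code != FAILURE_EXIT_CODE for code in exit_codes.values())
```

Note that under this definition a first job exiting with code 1 still counts as successful, because only exit code 100 is treated as failure.
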
 **-i** *[hostname:]filename*
    (File option) Specifies the *condor_submit* command
    :subcom:`input[condor_qsub]`, the file from
    which ``stdin`` is read.
 **-j** *characters*
    (File option) Acceptable characters for this option are ``e``,
    ``o``, and ``n``. The only sequence that is relevant is ``eo``; it
    specifies that both standard output and standard error are to be
    sent to the same file. The file will be the one specified by the
    **-o** option, if both the **-o** and **-e** options exist. The file
    will be the one specified by the **-e** option, if only the **-e**
    option is provided. If neither the **-o** nor the **-e** options are
    provided, the file will be the default used for the **-o** option.
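
The precedence rules for the merged file can be summarized with a short Python sketch. The function name ``merged_output_file`` and its parameters are hypothetical. The case in which only **-o** is given is not spelled out above; the sketch assumes that the **-o** file is used.

```python
# Hypothetical sketch of the -j precedence rules, not condor_qsub's code.
def merged_output_file(join, out_file, err_file, default_out):
    """Return the file receiving both streams, or None if not merged."""
    if join != "eo":
        return None  # only the sequence "eo" is relevant
    if out_file is not None:
        # -o wins when both are given (only -o given: assumption).
        return out_file
    if err_file is not None:
        return err_file  # only -e was provided
    return default_out  # neither given: the default used for -o
```
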
 **-l** *resource_spec*
    (Resource option) Specifies requirements for the job, such as the
    amount of RAM and the number of CPUs. Only PBS-style resource
    requests are supported. *resource_spec* is a comma separated list
    of key/value pairs. Each pair is of the form
    ``resource_name=value``. ``resource_name`` and ``value`` may be:

    +-------------------+----------------------------------+------------------------------------------+
    | ``resource_name`` | ``value``                        | Description                              |
    +-------------------+----------------------------------+------------------------------------------+
    | arch              | string                           | Sets :ad-attr:`Arch` machine attribute.  |
    |                   |                                  | Enclose in double quotes.                |
    +-------------------+----------------------------------+------------------------------------------+
    | file              | size                             | Disk space requested.                    |
    +-------------------+----------------------------------+------------------------------------------+
    | host              | string                           | Host machine on which the job must run.  |
    +-------------------+----------------------------------+------------------------------------------+
    | mem               | size                             | Amount of memory requested.              |
    +-------------------+----------------------------------+------------------------------------------+
    | nodes             | ``{<node_count> | <hostname>}    | Number and/or properties of nodes to be  |
    |                   | [:ppn=<ppn>] [:gpus=<gpu>]       | used.                                    |
    |                   | [:<property> [:<property>] ...]  |                                          |
    |                   | [+ ...]``                        |                                          |
    +-------------------+----------------------------------+------------------------------------------+
    | opsys             | string                           | Sets :ad-attr:`OpSys` machine attribute. |
    |                   |                                  | Enclose in double quotes.                |
    +-------------------+----------------------------------+------------------------------------------+
    | procs             | integer                          | Number of CPUs requested.                |
    +-------------------+----------------------------------+------------------------------------------+

    For examples of ``nodes`` values, please see
    `http://docs.adaptivecomputing.com/torque/4-1-3/Content/topics/2-jobs/requestingRes.htm#qsub <http://docs.adaptivecomputing.com/torque/4-1-3/Content/topics/2-jobs/requestingRes.htm#qsub>`_

    A size value is an integer number of bytes, following the
    PBS/Torque default. Append ``Kb``, ``Mb``, ``Gb``, or ``Tb`` to
    specify the value in larger power-of-two units.
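
To make the *resource_spec* format concrete, the following Python sketch splits a comma separated list into ``resource_name=value`` pairs and converts a size value to bytes. It is an illustration only, not the *condor_qsub* parser. The function names are hypothetical, and matching the suffixes without regard to case is an assumption.

```python
# Hypothetical sketch, not condor_qsub's parser.
_SUFFIXES = {"kb": 2**10, "mb": 2**20, "gb": 2**30, "tb": 2**40}


def parse_resource_spec(spec):
    """Split 'name=value,name=value' into a dict of strings."""
    resources = {}
    for pair in spec.split(","):
        # Split at the first '=' only, so 'nodes=2:ppn=4' stays intact.
        name, _, value = pair.partition("=")
        resources[name.strip()] = value.strip()
    return resources


def size_to_bytes(value):
    """Convert a size such as '512' or '2Gb' to a number of bytes."""
    suffix = value[-2:].lower()  # assumption: suffix case is ignored
    if suffix in _SUFFIXES:
        return int(value[:-2]) * _SUFFIXES[suffix]
    return int(value)  # no suffix: the value is already in bytes
```

For example, ``parse_resource_spec("mem=2Gb,procs=4")`` gives ``{"mem": "2Gb", "procs": "4"}``, and ``size_to_bytes("2Gb")`` gives 2147483648.
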
 **-m** *a|e|n*
    (Notification option) Identify when HTCondor sends notification
    e-mail. If *a*, send e-mail when the job terminates abnormally. If
    *e*, send e-mail when the job terminates. If *n*, never send e-mail.
 **-M** *e-mail_address*
    (Notification option) Sets the destination address for HTCondor
    e-mail.
 **-o** *filename*
    (File option) Specifies the *condor_submit* command
    :subcom:`output[condor_qsub]`, the file where
    ``stdout`` is written. If not specified, set to the default name of
    ``<commandfile>.o<ClusterId>``, where ``<commandfile>`` is the
    *condor_qsub* argument, and ``<ClusterId>`` is the job attribute
    :ad-attr:`ClusterId` assigned for the job.
 **-p** *integer*
    (Status option) Sets the
    :subcom:`priority[condor_qsub]` submit
    command for the job, with 0 being the default. Jobs with higher
    numerical priority will run before jobs with lower numerical
    priority.
 **-print**
    (Specific option) Send to ``stdout`` the contents of the HTCondor
    submit description file that *condor_qsub* generates.
 **-r** *y|n*
    (Status option) The default value of *y* keeps the default
    HTCondor policy, in which a job that does not complete is
    placed back in the queue to be run again. When *n*, the job
    is restricted to run only if the job ClassAd attribute
    :ad-attr:`NumJobStarts` is currently 0. This identifies the job as not
    re-runnable, limiting it to start once.
 **-S** *shell*
    (Submission option) Specifies the path and executable name of a
    shell. Alters the HTCondor submit description file produced, such
    that the executable becomes a wrapper script. Within the submit
    description file will be ``executable = <shell>`` and
    ``arguments = <commandfile>``.
 **-t** *start [-stop:step]*
    (Submission option) Queues a set of nearly identical jobs. The
    SGE-style syntax is supported. *start*, *stop*, and *step* are all
    integers. *start* is the starting index of the jobs, *stop* is the
    ending index (inclusive) of the jobs, and *step* is the step size
    through the indices. Note that using more than one processor or node
    in a job will not work with this option.
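
The indices selected by a range can be enumerated with the following Python sketch. The function name ``expand_task_range`` is hypothetical, and the sketch assumes the SGE convention that *stop* and *step* may be omitted, with *step* defaulting to 1.

```python
# Hypothetical sketch, not condor_qsub's code.
def expand_task_range(spec):
    """Expand 'start[-stop[:step]]' into the list of job indices."""
    start_text, _, rest = spec.partition("-")
    start = int(start_text)
    if not rest:
        return [start]  # a lone start index selects a single job
    stop_text, _, step_text = rest.partition(":")
    step = int(step_text) if step_text else 1  # assumption: step defaults to 1
    # The stop index is inclusive, hence the + 1.
    return list(range(start, int(stop_text) + 1, step))
```

For example, ``expand_task_range("1-10:3")`` gives ``[1, 4, 7, 10]``.
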
 **-test**
    (Specific option) With the intention of testing a potential job
    submission, parse files and commands to generate error output.
    Produces, but then removes, the HTCondor submit description file.
    Never submits the job, even if no errors are encountered.
 **-v** *variable list*
    (Environmental option) Used to set the submit command
    :subcom:`environment[condor_qsub]` for
    the job. *variable list* uses the syntax defined for that submit command.
    Note that the syntax needed is specialized to deal with quote marks
    and white space characters.
 **-V**
    (Environmental option) Sets ``getenv = True`` in the submit
    description file.
 **-W** *attr_name=attr_value[,attr_name=attr_value...]*
    (File option) PBS/Torque supports a number of attributes. However,
    *condor_qsub* only supports the names *stagein* and *stageout* for
    *attr_name*. The format of *attr_value* for *stagein* and
    *stageout* is ``local_file@hostname:remote_file[,...]``, which
    *condor_qsub* strips to ``remote_file[,...]``. HTCondor's file transfer mechanism is
    then used if needed.
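
The stripping of a *stagein* or *stageout* value can be sketched in Python as follows. The function name ``strip_stage_spec`` is hypothetical, the sketch assumes well-formed input, and it is not the code used by *condor_qsub*.

```python
# Hypothetical sketch, not condor_qsub's code.
def strip_stage_spec(attr_value):
    """Reduce 'local_file@hostname:remote_file[,...]' to 'remote_file[,...]'."""
    remote_files = []
    for item in attr_value.split(","):
        # Drop everything up to and including the '@'.
        _, _, host_and_remote = item.partition("@")
        # Drop the hostname; what follows the first ':' is the remote file.
        _, _, remote_file = host_and_remote.partition(":")
        remote_files.append(remote_file)
    return ",".join(remote_files)
```

For example, ``strip_stage_spec("in.dat@node1:/data/in.dat")`` gives ``"/data/in.dat"``.
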
 **-version**
    (Specific option) Print version information for the *condor_qsub*
    program and exit. Note that *condor_qsub* has its own version
    numbers which are separate from those of HTCondor.

Exit Status
-----------

*condor_qsub* will exit with a status value of 0 (zero) upon success,
and it will exit with the value 1 (one) upon failure to submit a job.