
*condor_wait*
==============

Wait for jobs to finish :index:`condor_wait<single: condor_wait; HTCondor commands>`
:index:`condor_wait command`

Synopsis
--------

**condor_wait** [**-help | -version** ]

**condor_wait** [**-debug** ] [**-status** ] [**-echo** ]
[**-wait** *seconds*] [**-num** *number-of-jobs*] *log-file*
[**job ID** ]

Description
-----------

*condor_wait* watches a job event log file (created with the **log**
command within a submit description file) and returns when one or more
jobs from the log have completed or aborted.

Because *condor_wait* expects to find at least one job submitted event
in the log file, at least one job must have been successfully submitted
with *condor_submit* before *condor_wait* is executed.

*condor_wait* waits indefinitely for jobs to finish unless a time limit
is specified with the **-wait** option.

Options
-------

 **-help**
    Display usage information.
 **-version**
    Display version information.
 **-debug**
    Show extra debugging information.
 **-status**
    Show job start and terminate information.
 **-echo**
    Print the events out to ``stdout``.
 **-wait** *seconds*
    Wait no more than the integer number of *seconds*. The default is
    unlimited time.
 **-num** *number-of-jobs*
    Wait for the integer *number-of-jobs* jobs to end. The default is
    all jobs in the log file.
 *log-file*
    The name of the job event log file to watch.
 **job ID**
    A specific job or set of jobs to watch.
    :index:`use in<single: use in; job ID>`\ If the **job ID** is only the job
    ClassAd attribute :ad-attr:`ClusterId`, then *condor_wait* waits for all
    jobs with the given :ad-attr:`ClusterId`. If the **job ID** is a pair of
    the job ClassAd attributes, given by :ad-attr:`ClusterId`.\ :ad-attr:`ProcId`,
    then *condor_wait* waits for the specific job with this **job ID**.
    If this option is not specified, all jobs that exist in the log file
    when *condor_wait* is invoked will be watched.

General Remarks
---------------

*condor_wait* is an inexpensive way to test for, or wait for, the
completion of a job or a whole cluster. It is useful when a process
outside of HTCondor must synchronize with a job or set of jobs.

It can also be used to wait for the completion of a limited subset of
jobs, via the **-num** option.
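
As a minimal sketch of this kind of synchronization, the shell function
below runs a follow-up command only after *condor_wait* reports that the
watched jobs have ended. The function name ``run_after_jobs`` and its
argument layout are illustrative assumptions, not part of HTCondor.

.. code-block:: shell

    # Sketch: run a follow-up command once the jobs in a log have ended.
    # Usage: run_after_jobs LOGFILE COMMAND [ARGS...]
    # (run_after_jobs is a hypothetical helper, not an HTCondor tool.)
    run_after_jobs() {
        log=$1
        shift
        # condor_wait blocks until the jobs in the log have completed
        # or aborted; the follow-up runs only if it exits with 0.
        condor_wait "$log" && "$@"
    }

Because *condor_wait* also exits with 0 when jobs have been aborted, the
follow-up command should not assume that every job ran successfully.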

Examples
--------

.. code-block:: console

    $ condor_wait logfile

This command waits for all jobs that exist in ``logfile`` to complete.

.. code-block:: console

    $ condor_wait logfile 40

This command waits for all jobs that exist in ``logfile`` with a job
ClassAd attribute :ad-attr:`ClusterId` of 40 to complete.

.. code-block:: console

    $ condor_wait -num 2 logfile

This command waits for any two jobs that exist in ``logfile`` to
complete.

.. code-block:: console

    $ condor_wait logfile 40.1

This command waits for job 40.1 that exists in ``logfile`` to complete.

.. code-block:: console

    $ condor_wait -wait 3600 logfile 40.1

This command waits for job 40.1 to complete by watching ``logfile``, but
it will not wait more than one hour (3600 seconds).

Exit Status
-----------

*condor_wait* exits with 0 if and only if the specified job or jobs
have completed or aborted. *condor_wait* exits with 1 if an
unrecoverable error occurs (such as a missing log file), if the job does
not exist in the log file, or if the user-specified waiting time has
expired.
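
A script can branch on this exit status. The following is a sketch only:
the function name ``wait_and_report`` and the messages it prints are
illustrative assumptions, and all arguments are passed through to
*condor_wait* unchanged.

.. code-block:: shell

    # Sketch: report the outcome of condor_wait based on its exit status.
    # Usage: wait_and_report [condor_wait options] LOGFILE [JOB-ID]
    # (wait_and_report is a hypothetical helper, not an HTCondor tool.)
    wait_and_report() {
        if condor_wait "$@"; then
            # Status 0: the watched jobs completed or aborted.
            echo "jobs finished"
            return 0
        else
            # Non-zero status: an error, an unknown job, or an expired
            # -wait limit. The status alone does not say which one.
            echo "jobs not finished (error, unknown job, or timeout)"
            return 1
        fi
    }

For example, ``wait_and_report -wait 3600 logfile 40.1`` would print the
second message if job 40.1 has not ended within one hour.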