File: condor_shadow-exit-codes.rst

*condor_shadow* Exit Codes
===========================

:index:`of condor_shadow<single: of condor_shadow; exit codes>`

When a *condor_shadow* daemon exits, its exit code is recorded in the
*condor_schedd* log, and it identifies why the job exited. The log
entry has the form

.. code-block:: text

    Shadow pid XXXXX for job XX.X exited with status YYY

where ``YYY`` is the exit code, or

.. code-block:: text

    Shadow pid XXXXX for job XX.X reports job exit reason 100.

where the exit code is 100. The following table lists the exit codes:

+---------+------------------------------------+--------------------------------------------------------------+
| Value   | Error Name                         | Description                                                  |
+---------+------------------------------------+--------------------------------------------------------------+
| 4       | JOB_EXCEPTION                      | the job exited with an exception                             |
+---------+------------------------------------+--------------------------------------------------------------+
| 44      | DPRINTF_ERROR                      | there was a fatal error with dprintf()                       |
+---------+------------------------------------+--------------------------------------------------------------+
| 100     | JOB_EXITED                         | the job exited (not killed)                                  |
+---------+------------------------------------+--------------------------------------------------------------+
| 101     | JOB_CKPTED                         | no longer used                                               |
+---------+------------------------------------+--------------------------------------------------------------+
| 102     | JOB_KILLED                         | the job was killed                                           |
+---------+------------------------------------+--------------------------------------------------------------+
| 103     | JOB_COREDUMPED                     | the job was killed and a core file was produced              |
+---------+------------------------------------+--------------------------------------------------------------+
| 105     | JOB_NO_MEM                         | not enough memory to start the *condor_shadow*               |
+---------+------------------------------------+--------------------------------------------------------------+
| 106     | JOB_SHADOW_USAGE                   | incorrect arguments to *condor_shadow*                       |
+---------+------------------------------------+--------------------------------------------------------------+
| 107     | JOB_NOT_CKPTED                     | no longer used                                               |
+---------+------------------------------------+--------------------------------------------------------------+
| 107     | JOB_SHOULD_REQUEUE                 | same number as JOB_NOT_CKPTED,                               |
|         |                                    | to achieve the same behavior.                                |
|         |                                    | This exit code implies that we want                          |
|         |                                    | the job to be put back in the job queue                      |
|         |                                    | and run again.                                               |
+---------+------------------------------------+--------------------------------------------------------------+
| 108     | JOB_NOT_STARTED                    | cannot connect to the *condor_startd* or request refused     |
+---------+------------------------------------+--------------------------------------------------------------+
| 109     | JOB_BAD_STATUS                     | job status != RUNNING on start up                            |
+---------+------------------------------------+--------------------------------------------------------------+
| 110     | JOB_EXEC_FAILED                    | exec failed for some reason other than ENOMEM                |
+---------+------------------------------------+--------------------------------------------------------------+
| 111     | JOB_NO_CKPT_FILE                   | no longer used                                               |
+---------+------------------------------------+--------------------------------------------------------------+
| 112     | JOB_SHOULD_HOLD                    | the job should be put on hold                                |
+---------+------------------------------------+--------------------------------------------------------------+
| 113     | JOB_SHOULD_REMOVE                  | the job should be removed                                    |
+---------+------------------------------------+--------------------------------------------------------------+
| 114     | JOB_MISSED_DEFERRAL_TIME           | the job goes on hold, because it did not run within the      |
|         |                                    | specified window of time                                     |
+---------+------------------------------------+--------------------------------------------------------------+
| 115     | JOB_EXITED_AND_CLAIM_CLOSING       | the job exited (not killed) but the *condor_startd*          |
|         |                                    | is not accepting any more jobs on this claim                 |
+---------+------------------------------------+--------------------------------------------------------------+
| 116     | JOB_RECONNECT_FAILED               | the *condor_shadow* was started in reconnect mode, and yet   |
|         |                                    | failed to reconnect to the starter                           |
+---------+------------------------------------+--------------------------------------------------------------+
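
When scanning a *condor_schedd* log by hand, it can help to translate
the numeric code into its name automatically. The following is a minimal
Python sketch, not part of HTCondor: the function name
``parse_shadow_exit`` and the regular expressions are illustrative
assumptions modelled only on the two message forms quoted above, and
they assume the job ID is written as ``cluster.proc`` in digits. Real
log lines may carry timestamps or other prefixes, so the patterns
search anywhere in the line.

.. code-block:: python

    import re

    # Exit codes copied from the table above.
    SHADOW_EXIT_CODES = {
        4: "JOB_EXCEPTION",
        44: "DPRINTF_ERROR",
        100: "JOB_EXITED",
        101: "JOB_CKPTED",
        102: "JOB_KILLED",
        103: "JOB_COREDUMPED",
        105: "JOB_NO_MEM",
        106: "JOB_SHADOW_USAGE",
        107: "JOB_SHOULD_REQUEUE",  # same value as the unused JOB_NOT_CKPTED
        108: "JOB_NOT_STARTED",
        109: "JOB_BAD_STATUS",
        110: "JOB_EXEC_FAILED",
        111: "JOB_NO_CKPT_FILE",
        112: "JOB_SHOULD_HOLD",
        113: "JOB_SHOULD_REMOVE",
        114: "JOB_MISSED_DEFERRAL_TIME",
        115: "JOB_EXITED_AND_CLAIM_CLOSING",
        116: "JOB_RECONNECT_FAILED",
    }

    # Assumed patterns for the two message forms shown in this section.
    _PATTERNS = (
        re.compile(r"Shadow pid (\d+) for job (\d+\.\d+) exited with status (\d+)"),
        re.compile(r"Shadow pid (\d+) for job (\d+\.\d+) reports job exit reason (\d+)"),
    )


    def parse_shadow_exit(line):
        """Return (job_id, exit_code, name) for a matching line, else None."""
        for pattern in _PATTERNS:
            match = pattern.search(line)
            if match:
                code = int(match.group(3))
                # Codes missing from the table are reported as UNKNOWN.
                return match.group(2), code, SHADOW_EXIT_CODES.get(code, "UNKNOWN")
        return None

For example, ``parse_shadow_exit("Shadow pid 12345 for job 12.0 exited
with status 107")`` returns ``("12.0", 107, "JOB_SHOULD_REQUEUE")``,
while a line that matches neither form returns ``None``.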