*condor_annex*
===============

Add cloud resources to the pool.
:index:`condor_annex<single: condor_annex; HTCondor commands>`\ :index:`condor_annex command`

Synopsis
--------

**condor_annex** **-help**

**condor_annex** [**-aws-region** *<region>*] **-setup** *[FROM
INSTANCE|[/full/path/to/access/key/file
[/full/path/to/secret/key/file]]]*

**condor_annex** [**-aws-on-demand** ] **-annex-name** *<name of the
annex>* **-count** *<integer number of instances>*
[**-aws-on-demand-\*** ] [**common options** ]

**condor_annex** [**-aws-spot-fleet** ] **-annex-name** *<name of
the annex>* **-slots** *<integer weight>* [**-aws-spot-fleet-\*** ]
[**common options** ]

**condor_annex** **-annex-name** *<name of the annex>*
**-duration** *hours*

**condor_annex** [**-annex-name** *<name of the annex>*] **-status**
[**-classad** ]

**condor_annex** **-check-setup**

**condor_annex** <condor_annex options> **status** <condor_status
options>

Description
-----------

*condor_annex* adds cloud resources to the pool. ("The pool" is
determined in the usual manner for HTCondor daemons and tools.) This
version supports only Amazon Web Services ('AWS'). To add "on-demand"
instances, use the third form listed above; to add "spot" instances, use
the fourth. For an explanation of terms, consult either the HTCondor
manual in the :doc:`/cloud-computing/index` chapter or
the AWS documentation.

Using *condor_annex* with AWS requires a one-time setup procedure
performed by invoking *condor_annex* with the **-setup** flag (the
second form listed above). You may check if this procedure has been
performed with the **-check-setup** flag (the seventh form listed
above). If you use the setup flag on an instance whose role gives it
sufficient privileges, you may, instead of specifying your API keys,
pass ``FROM INSTANCE`` to **-setup** to ask *condor_annex* to use the
instance's role credentials.
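
As a minimal sketch of the setup forms described above, the one-time
setup and the subsequent check might look like the following. The
key-file paths are hypothetical placeholders, not defaults.

.. code-block:: console

      $ condor_annex -setup /home/user/accessKeyFile /home/user/secretKeyFile
      $ condor_annex -check-setup

On an instance whose role grants sufficient privileges, the key files
would be replaced by ``FROM INSTANCE``:

.. code-block:: console

      $ condor_annex -setup FROM INSTANCE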

To reset the lease on an existing annex, invoke *condor_annex* with
only the **-annex-name** option and **-duration** flag (the fifth form
listed above).
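
For instance, assuming an existing annex named 'MyFirstAnnex' (a name
used here only for illustration), resetting its lease to an hour and a
half might look like this. Note that the value is given in decimal
hours.

.. code-block:: console

      $ condor_annex -annex-name MyFirstAnnex -duration 1.5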

To determine which of the instances previously requested for a
particular annex are not currently in the pool, invoke *condor_annex*
with the **-status** flag and the **-annex-name** option (the sixth form
listed above). The output of this command is intended to be
human-readable; specifying the **-classad** flag will produce the same
information in ClassAd format. If you omit **-annex-name**, information
for all annexes will be returned.
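
The variations described above could be invoked as follows, again
assuming an illustrative annex named 'MyFirstAnnex': first the
human-readable report for one annex, then the same report in ClassAd
format, then the report for all annexes.

.. code-block:: console

      $ condor_annex -annex-name MyFirstAnnex -status
      $ condor_annex -annex-name MyFirstAnnex -status -classad
      $ condor_annex -status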

Starting in 8.7.3, you may instead invoke *condor_annex* with
**status** as a command argument (the eighth form listed above). This
will cause *condor_annex* to use *condor_status* to present annex
instance data. Arguments and options on the command line after
**status** will be passed unmodified to *condor_status*, but not all
arguments and options will behave as expected. (See below.)
*condor_annex* will construct an ad for each annex instance and pass
that information to *condor_status*; *condor_status* will (unless you
specify otherwise using its command line) query the collector for more
information about the instances. Information from the collector will be
presented as usual; instances which did not have ads in the collector
will be presented last, in their own table. These instances cannot be
presented in the usual way because the annex instance ads generated by
*condor_annex* do not (and cannot) have the same information in them
as ads generated by a *condor_startd* running in the instance. See the
:doc:`/man-pages/condor_status` manual page for details about the "merge" mode
of *condor_status* used by this command argument. Note that both *condor_annex*
and *condor_status* have **-annex-name** options; if you're interested in a
particular annex, put this flag on the command line before the **status**
command argument to avoid confusing results.
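
A sketch of the ordering recommended above, with **-annex-name** placed
before the **status** command argument so that *condor_annex*, rather
than *condor_status*, interprets it (the annex name is illustrative):

.. code-block:: console

      $ condor_annex -annex-name MyFirstAnnex status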

Common options are listed first, followed by options specific to AWS,
followed by options specific to AWS' on-demand instances, followed by
options specific to AWS' spot instances, followed by options intended
for use by experts.

Options
-------

 **-help**
    Print a usage reminder.
 **-setup** *[FROM INSTANCE|[/full/path/to/access/key/file [/full/path/to/secret/key/file]]]*
    Do the first-time setup.
 **-duration** *hours*
    Set the maximum lease duration in decimal *hours*. After this amount
    of time, all instances will be terminated, regardless of their
    idleness. Defaults to 50 minutes.
 **-idle** *hours*
    Set the maximum idle duration in decimal *hours*. An instance idle
    for longer than this duration will terminate itself. Defaults to 15
    minutes.
 **-yes**
    Start the annex automatically without a yes/no confirmation prompt.
 **-tag** *name* *value*
    Add a tag named *name* with value *value* to each instance in the
    requested annex.  Only works at annex creation.  This option may be
    specified more than once.
 **-config-dir** */full/path/to/directory*
    Copy the contents of */full/path/to/directory* to each instance's
    configuration directory.
 **-owner** *owner[, owner]\**
    Configure the annex so that only *owner* may start jobs there. By
    default, configure the annex so that only the user running
    *condor_annex* may start jobs there.
 **-no-owner**
    Configure the annex so that anyone in the pool may use the annex.
 **-aws-region** *region*
    Specify the region in which to create the annex.
 **-aws-user-data** *user-data*
    Set the instance user data to *user-data*.
 **-aws-user-data-file** */full/path/to/file*
    Set the instance user data to the contents of the file
    */full/path/to/file*.
 **-aws-default-user-data** *user-data*
    Set the instance user data to *user-data*, if it's not already set.
    Only applies to spot fleet requests.
 **-aws-default-user-data-file** */full/path/to/file*
    Set the instance user data to the contents of the file
    */full/path/to/file*, if it's not already set. Only applies to spot
    fleet requests.
 **-aws-on-demand-instance-type** *instance-type*
    This annex will request instances of type *instance-type*. The
    default for v8.7.1 is 'm4.large'.
 **-aws-on-demand-ami-id** *ami-id*
    This annex will start instances of the AMI *ami-id*. The default for
    v8.7.1 is 'ami-35b13223', a GPU-compatible Amazon Linux image with
    HTCondor pre-installed.
 **-aws-on-demand-security-group-ids** *group-id[,group-id]*
    This annex will start instances with the listed security group IDs.
    The default is the security group created by **-setup**.
 **-aws-on-demand-key-name** *key-name*
    This annex will start instances with the key pair named *key-name*.
    The default is the key pair created by **-setup**.
 **-aws-spot-fleet-config-file** */full/path/to/file*
    Use the JSON blob in */full/path/to/file* for the spot fleet
    request.
 **-aws-access-key-file** */full/path/to/access-key-file*
    Experts only.
 **-aws-secret-key-file** */full/path/to/secret-key-file*
    Experts only.
 **-aws-ec2-url** *https://ec2.<region>.amazonaws.com*
    Experts only.
 **-aws-events-url** *https://events.<region>.amazonaws.com*
    Experts only.
 **-aws-lambda-url** *https://lambda.<region>.amazonaws.com*
    Experts only.
 **-aws-s3-url** *https://s3.<region>.amazonaws.com*
    Experts only.
 **-aws-spot-fleet-lease-function-arn** *sfr-lease-function-arn*
    Developers only.
 **-aws-on-demand-lease-function-arn** *odi-lease-function-arn*
    Developers only.
 **-aws-on-demand-instance-profile-arn** *instance-profile-arn*
    Developers only.
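
Because **-duration** and **-idle** take decimal hours while their
defaults are stated in minutes, it can help to convert explicitly. The
following is only a sketch: the variable names and the annex name are
illustrative, and the command is printed rather than run so that
nothing is launched.

.. code-block:: shell

      # Convert a lease length in minutes to decimal hours for -duration.
      minutes=90
      hours=$(awk -v m="$minutes" 'BEGIN { printf "%.2f", m / 60 }')
      # Print the resulting command instead of executing it.
      echo "condor_annex -annex-name MyFirstAnnex -duration $hours"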

General Remarks
---------------

Currently, only AWS is supported. The AMI configured by setup runs
HTCondor v8.6.10 on Amazon Linux 2016.09, and the default instance type
is "m4.large". The default AMI has the appropriate drivers for AWS' GPU
instance types.

Examples
--------

To start an on-demand annex named 'MyFirstAnnex' with one instance, using
the default AMI and instance type, run

.. code-block:: console

      $ condor_annex -count 1 -annex-name MyFirstAnnex

You will be asked to confirm that the defaults are what you want.

As of 2017-04-17, the following example will cost a minimum of $90.

To start an on-demand annex with 100 GPUs that job owners 'big' and
'little' may use (be sure to include yourself!), run

.. code-block:: console

      $ condor_annex -count 100 -annex-name MySecondAnnex \
        -aws-on-demand-instance-type p2.xlarge -owner "big, little"
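
As a further sketch combining several of the common options documented
above (the annex name, tag, and durations are illustrative values
only), the following would start one on-demand instance without the
confirmation prompt, tag it, and limit it to eight hours of lease and
one hour of idleness:

.. code-block:: console

      $ condor_annex -count 1 -annex-name MyThirdAnnex \
        -duration 8 -idle 1 -tag project demo -yes

A spot annex uses the fourth form of the synopsis. This sketch assumes
that a spot fleet request has already been written as JSON to the
hypothetical path shown:

.. code-block:: console

      $ condor_annex -annex-name MySpotAnnex -slots 10 \
        -aws-spot-fleet-config-file /full/path/to/spot-fleet-config.json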

Exit Status
-----------

*condor_annex* will exit with a status value of 0 (zero) on success.