File: clusterUtils.rst

package info (click to toggle)
toil 9.1.2-1
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid
  • size: 13,908 kB
  • sloc: python: 58,029; makefile: 313; sh: 168
file content (213 lines) | stat: -rw-r--r-- 9,919 bytes parent folder | download | duplicates (2)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
.. _clusterUtils:

Toil Cluster Utilities
----------------------

In addition to the generic :ref:`utils`, there are several utilities used for starting and managing a Toil cluster using the AWS or GCE provisioners. They are installed
via the ``[aws]`` or ``[google]`` extra. For installation details see :ref:`installProvisioner`.

The ``toil`` cluster subcommands are:

    ``destroy-cluster`` --- For autoscaling.  Terminates the specified cluster and associated resources.

    ``launch-cluster`` --- For autoscaling.  This is used to launch a toil leader instance with the specified provisioner.

    ``rsync-cluster`` --- For autoscaling.  Used to transfer files to a cluster launched with ``toil launch-cluster``.

    ``ssh-cluster`` --- SSHs into the toil appliance container running on the leader of the cluster.

For information on a specific utility, run it with the ``--help`` option::

    toil launch-cluster --help

The cluster utilities can be used for :ref:`runningGCE` and :ref:`runningAWS`.

.. tip::

   By default, all of the cluster utilities expect to be running on AWS. To run with Google
   you will need to specify the ``--provisioner gce`` option for each utility.

.. note::

   Boto must be `configured`_ with AWS credentials before using cluster utilities.

   :ref:`runningGCE` contains instructions for

.. _configured: http://boto3.readthedocs.io/en/latest/guide/quickstart.html#configuration

.. _launchCluster:

Launch-Cluster Command
~~~~~~~~~~~~~~~~~~~~~~

Running ``toil launch-cluster`` starts up a leader for a cluster. Workers can be
added to the initial cluster by specifying the ``-w`` option.  An example would be ::

    $ toil launch-cluster my-cluster \
          --leaderNodeType t2.small -z us-west-2a \
          --keyPairName your-AWS-key-pair-name \
          --nodeTypes m3.large,t2.micro -w 1,4

Options are listed below.  These can also be displayed by running ::

    $ toil launch-cluster --help

launch-cluster's main positional argument is the clusterName.  This is simply the name of your cluster.  If it does not
exist yet, Toil will create it for you.

**Launch-Cluster Options**

  --help                -h also accepted.  Displays this help menu.
  --tempDirRoot TEMPDIRROOT
                        Path to the temporary directory where all temp
                        files are created, by default uses the current working
                        directory as the base.
  --version             Display version.
  --provisioner CLOUDPROVIDER
                        -p CLOUDPROVIDER also accepted.  The provisioner for
                        cluster auto-scaling.  Both AWS and GCE are
                        currently supported.
  --zone ZONE           -z ZONE also accepted.  The availability zone of the leader. This
                        parameter can also be set via the TOIL_AWS_ZONE or TOIL_GCE_ZONE
                        environment variables, or by the ec2_region_name
                        parameter in your .boto file if using AWS, or derived from the
                        instance metadata if using this utility on an existing
                        EC2 instance.
  --leaderNodeType LEADERNODETYPE
                        Non-preemptable node type to use for the cluster
                        leader.
  --keyPairName KEYPAIRNAME
                        The name of the AWS or ssh key pair to include on the
                        instance.
  --owner OWNER
                        The owner tag for all instances. If not given, the value in
                        TOIL_OWNER_TAG will be used, or else the value of
                        ``--keyPairName``.
  --boto BOTOPATH       The path to the boto credentials directory. This is
                        transferred to all nodes in order to access the AWS
                        jobStore from non-AWS instances.
  --tag KEYVALUE
                        KEYVALUE is specified as KEY=VALUE. -t KEY=VALUE also
                        accepted.  Tags are added to the AWS cluster for this
                        node and all of its children.
                        Tags are of the form: ``-t key1=value1`` ``--tag key2=value2``.
                        Multiple tags are allowed and each tag needs its own
                        flag. By default the cluster is tagged with:
                        { "Name": clusterName, "Owner": IAM username }.
  --vpcSubnet VPCSUBNET
                        VPC subnet ID to launch cluster leader in. Uses default
                        subnet if not specified. This subnet needs to have auto
                        assign IPs turned on.
  --nodeTypes NODETYPES
                        Comma-separated list of node types to create while
                        launching the leader. The syntax for each node type
                        depends on the provisioner used. For the AWS
                        provisioner this is the name of an EC2 instance type
                        followed by a colon and the price in dollars to bid for
                        a spot instance, for example 'c3.8xlarge:0.42'. Must
                        also provide the ``--workers`` argument to specify how
                        many workers of each node type to create.
  --workers WORKERS
                        -w WORKERS also accepted.  Comma-separated list of the
                        number of workers of each node type to launch alongside
                        the leader when the cluster is created. This can be
                        useful if running toil without auto-scaling but with
                        need of more hardware support.
  --leaderStorage LEADERSTORAGE
                        Specify the size (in gigabytes) of the root volume for
                        the leader instance. This is an EBS volume.
  --nodeStorage NODESTORAGE
                        Specify the size (in gigabytes) of the root volume for
                        any worker instances created when using the -w flag.
                        This is an EBS volume.
  --nodeStorageOverrides NODESTORAGEOVERRIDES
                        Comma-separated list of nodeType:nodeStorage that are used
                        to override the default value from ``--nodeStorage`` for the
                        specified nodeType(s). This is useful for heterogeneous jobs
                        where some tasks require much more disk than others.
  --allowFuse BOOL
                        Whether to allow FUSE mounts for faster runtimes with Singularity.
                        Note: This will result in the Toil container running as privileged.
                        For Kubernetes, pods will be asked to run as privileged. If this is not
                        allowed, Singularity containers will use sandbox directories instead.

**Logging Options**

  --logOff              Same as ``--logCritical``.
  --logCritical         Turn on logging at level CRITICAL and above. (default
                        is INFO)
  --logError            Turn on logging at level ERROR and above. (default is
                        INFO)
  --logWarning          Turn on logging at level WARNING and above. (default
                        is INFO)
  --logInfo             Turn on logging at level INFO and above. (default is
                        INFO)
  --logDebug            Turn on logging at level DEBUG and above. (default is
                        INFO)
  --logDebug            Turn on logging at level TRACE and above. (default is
                        INFO)
  --logLevel LOGLEVEL   Log at given level (may be either OFF (or CRITICAL),
                        ERROR, WARN (or WARNING), INFO, DEBUG, or TRACE).
                        (default is INFO)
  --logFile LOGFILE     File to log in.
  --rotatingLogging     Turn on rotating logging, which prevents log files
                        getting too big.

.. _sshCluster:

Ssh-Cluster Command
~~~~~~~~~~~~~~~~~~~

Toil provides the ability to ssh into the leader of the cluster. This
can be done as follows::

    $ toil ssh-cluster CLUSTER-NAME-HERE

This will open a shell on the Toil leader and is used to start an
:ref:`Autoscaling` run. Issues with docker prevent using ``screen`` and ``tmux``
when sshing the cluster (The shell doesn't know that it is a TTY which prevents
it from allocating a new screen session). This can be worked around via ::

    $ script
    $ screen

Simply running ``screen`` within ``script`` will get things working properly again.

Finally, you can execute remote commands with the following syntax::

    $ toil ssh-cluster CLUSTER-NAME-HERE remoteCommand

It is not advised that you run your Toil workflow using remote execution like this
unless a tool like `nohup <https://linux.die.net/man/1/nohup>`_ is used to ensure the
process does not die if the SSH connection is interrupted.

For an example usage, see :ref:`Autoscaling`.

.. _rsyncCluster:

Rsync-Cluster Command
~~~~~~~~~~~~~~~~~~~~~

The most frequent use case for the ``rsync-cluster`` utility is deploying your
workflow code to the Toil leader. Note that the syntax is the same as traditional
`rsync <https://linux.die.net/man/1/rsync>`_ with the exception of the hostname before
the colon. This is not needed in ``toil rsync-cluster`` since the hostname is automatically
determined by Toil.

Here is an example of its usage::

    $ toil rsync-cluster CLUSTER-NAME-HERE \
       ~/localFile :/remoteDestination

.. _destroyCluster:

Destroy-Cluster Command
~~~~~~~~~~~~~~~~~~~~~~~

The ``destroy-cluster`` command is the advised way to get rid of any Toil cluster
launched using the :ref:`launchCluster` command. It ensures that all attached nodes, volumes,
security groups, etc. are deleted. If a node or cluster is shut down using Amazon's online portal
residual resources may still be in use in the background. To delete a cluster run ::

    $ toil destroy-cluster CLUSTER-NAME-HERE