.. _deploying-cli:

Command Line
============

This is the most fundamental way to deploy Dask on multiple machines.  In
production environments this process is often automated by some other resource
manager, so it is rare that people need to follow these instructions
explicitly.  Instead, these instructions are useful for understanding what
*cluster managers* and other automated tooling are doing under the hood, and
for deploying onto platforms that have no automated tools today.

A ``dask.distributed`` network consists of one ``dask scheduler`` process and
several ``dask worker`` processes that connect to that scheduler.  These are
normal Python processes that can be executed from the command line.  We launch
the ``dask scheduler`` executable in one process and the ``dask worker``
executable in several processes, possibly on different machines.

To accomplish this, launch ``dask scheduler`` on one node::

   $ dask scheduler
   Scheduler at:   tcp://192.0.0.100:8786

Then, launch ``dask worker`` on the rest of the nodes, providing the address of
the node that hosts ``dask scheduler``::

   $ dask worker tcp://192.0.0.100:8786
   Start worker at:  tcp://192.0.0.1:12345
   Registered to:    tcp://192.0.0.100:8786

   $ dask worker tcp://192.0.0.100:8786
   Start worker at:  tcp://192.0.0.2:40483
   Registered to:    tcp://192.0.0.100:8786

   $ dask worker tcp://192.0.0.100:8786
   Start worker at:  tcp://192.0.0.3:27372
   Registered to:    tcp://192.0.0.100:8786

The workers connect to the scheduler, which then sets up a long-running network
connection back to each worker.  The workers learn the location of other
workers from the scheduler.


Handling Ports
--------------

The scheduler and workers both need to accept TCP connections on an open port.
By default, the scheduler binds to port ``8786`` and the worker binds to a
random open port.  If you are behind a firewall then you may have to open
particular ports or tell Dask to listen on particular ports with the ``--port``
and ``--worker-port`` keywords (the worker's nanny process listens on its own
port, which can be set with ``--nanny-port``)::

   dask scheduler --port 8000
   dask worker --worker-port 8001 --nanny-port 8002 tcp://192.0.0.100:8000
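
If a single fixed port is too restrictive, ``--worker-port`` and
``--nanny-port`` also accept a port range in recent releases (check
``dask worker --help`` for your installed version).  A minimal sketch, assuming
your firewall allows the range 3000-3053::

   dask worker --worker-port 3000:3026 --nanny-port 3027:3053 tcp://192.0.0.100:8000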


Nanny Processes
---------------

Dask workers are run within a nanny process that monitors the worker process
and restarts it if necessary.
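
If another system is already responsible for restarting failed workers, the
nanny can be disabled with the ``--no-nanny`` flag.  Note that some
functionality, such as restarting workers from a client, relies on the nanny::

   dask worker --no-nanny tcp://192.0.0.100:8786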


Diagnostic Web Servers
----------------------

Additionally, Dask schedulers and workers host interactive diagnostic web
servers using `Bokeh <https://docs.bokeh.org>`_.  These are optional, but
generally useful to users.  The diagnostic server on the scheduler is
particularly valuable, and is served on port ``8787`` by default (configurable
with the ``--dashboard-address`` keyword).
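
For example, you can move both dashboards to fixed ports of your choosing (the
addresses below are illustrative)::

   dask scheduler --dashboard-address :9000
   dask worker --dashboard-address :9001 tcp://192.0.0.100:8786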

For more information about relevant ports, please take a look at the available
:ref:`command line options <worker-scheduler-cli-options>`.

Automated Tools
---------------

There are various mechanisms to deploy these executables on a cluster, ranging
from manually SSH-ing into all of the machines to more automated systems like
SGE/SLURM/Torque or Yarn/Mesos.  Additionally, cluster SSH tools exist to send
the same commands to many machines.  We recommend searching online for "cluster
ssh" or "cssh".


.. _worker-scheduler-cli-options:

CLI Options
-----------

.. note::

   The command line documentation here may differ depending on your installed
   version. We recommend referring to the output of ``dask scheduler --help``
   and ``dask worker --help``.

.. click:: distributed.cli.dask_scheduler:main
   :prog: dask scheduler
   :show-nested:

.. click:: distributed.cli.dask_worker:main
   :prog: dask worker
   :show-nested: