Python API (advanced)
=====================

In some rare cases, experts may want to create ``Scheduler`` and ``Worker``
objects explicitly in Python.  This is often necessary when building tools
that automatically deploy Dask in custom settings.

In most cases, however, it is sufficient to rely on the :doc:`Dask command line
interface <cli>`.

Scheduler
---------

To start the Scheduler, provide the listening port (defaults to 8786) and,
optionally, a Tornado IOLoop (defaults to ``IOLoop.current()``):

.. code-block:: python

   from distributed import Scheduler
   from tornado.ioloop import IOLoop
   from threading import Thread  # used by the threaded variant below

   s = Scheduler()
   s.start('tcp://:8786')   # Listen on TCP port 8786

   loop = IOLoop.current()
   loop.start()             # Blocks until the loop is stopped

Alternatively, you may want the IOLoop and scheduler to run in a separate
thread.  In this case, you would replace the ``loop.start()`` call with the
following:

.. code-block:: python

   t = Thread(target=loop.start, daemon=True)
   t.start()
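
The background-thread pattern can be tried without a running cluster.  The
following is a minimal sketch that uses the standard library's ``asyncio``
loop as a stand-in for the Tornado IOLoop; the helper names
``start_background_loop``, ``stop_background_loop`` and ``ping`` are
hypothetical and not part of Dask.  It shows the main thread handing work to
the loop and shutting it down cleanly.

.. code-block:: python

   import asyncio
   from threading import Thread


   def start_background_loop():
       """Start an event loop in a daemon thread; return (loop, thread)."""
       loop = asyncio.new_event_loop()

       def run():
           asyncio.set_event_loop(loop)
           loop.run_forever()

       thread = Thread(target=run, daemon=True)
       thread.start()
       return loop, thread


   def stop_background_loop(loop, thread, timeout=5):
       """Ask the loop to stop from another thread, then wait for the thread."""
       loop.call_soon_threadsafe(loop.stop)
       thread.join(timeout)
       loop.close()


   async def ping():
       # Placeholder for real work running on the loop
       return "pong"


   loop, thread = start_background_loop()

   # The main thread stays free and can submit coroutines to the loop
   future = asyncio.run_coroutine_threadsafe(ping(), loop)
   result = future.result(timeout=5)

   stop_background_loop(loop, thread)

With a Tornado loop, the analogous thread-safe way to schedule a call from
another thread is ``loop.add_callback``, for example
``loop.add_callback(loop.stop)``.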


Worker
------

On other nodes, start worker processes that point to the URL of the scheduler.

.. code-block:: python

   from distributed import Worker
   from tornado.ioloop import IOLoop
   from threading import Thread

   w = Worker('tcp://127.0.0.1:8786')
   w.start()  # Listen on a randomly assigned port

   loop = IOLoop.current()
   loop.start()
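
A deployment tool typically needs to build the scheduler URL and may want to
confirm that the scheduler port accepts connections before it starts workers.
The following is a minimal sketch of that idea using only the standard
library; ``scheduler_url`` and ``wait_for_port`` are hypothetical helpers, not
part of Dask, and the demonstration connects to a throwaway local listener
rather than a real scheduler.

.. code-block:: python

   import socket
   import time


   def scheduler_url(host, port=8786):
       """Build a TCP address such as 'tcp://127.0.0.1:8786'."""
       return "tcp://{}:{}".format(host, port)


   def wait_for_port(host, port, timeout=5.0, interval=0.1):
       """Return True once host:port accepts a TCP connection, else False."""
       deadline = time.monotonic() + timeout
       while True:
           try:
               with socket.create_connection((host, port), timeout=interval):
                   return True
           except OSError:
               if time.monotonic() >= deadline:
                   return False
               time.sleep(interval)


   # Demonstration against a temporary listener on a free local port
   listener = socket.socket()
   listener.bind(("127.0.0.1", 0))
   listener.listen(1)
   demo_port = listener.getsockname()[1]

   reachable = wait_for_port("127.0.0.1", demo_port, timeout=2.0)
   listener.close()

   url = scheduler_url("127.0.0.1")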

Alternatively, replace ``Worker`` with ``Nanny`` if you want each worker to run
in a separate process managed by a local nanny process.  The nanny can restart
the worker if it fails and provides some additional monitoring.  This is also
useful when coordinating many workers that should live in different processes
in order to avoid the GIL_.

.. _GIL: https://docs.python.org/3/glossary.html#term-gil
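
The restart behaviour described above is a form of process supervision.  The
following is a minimal sketch of that general pattern using only the standard
library.  It is not how ``Nanny`` is implemented; ``supervise`` and the two
example commands are hypothetical.

.. code-block:: python

   import subprocess
   import sys


   def supervise(command, max_restarts=3):
       """Run ``command`` in a child process, restarting it after a failure.

       Returns ``(final_exit_code, restarts_performed)``.
       """
       restarts = 0
       while True:
           returncode = subprocess.call(command)
           if returncode == 0 or restarts >= max_restarts:
               return returncode, restarts
           restarts += 1


   # Hypothetical child commands standing in for a worker process
   failing = [sys.executable, "-c", "import sys; sys.exit(1)"]
   healthy = [sys.executable, "-c", "pass"]

   failed_result = supervise(failing, max_restarts=2)
   healthy_result = supervise(healthy)

Because the child runs in its own process, it also has its own interpreter and
therefore its own GIL.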