.. _communications:

==============
Communications
==============

Workers, the Scheduler, and Clients communicate by sending each other
Python objects (such as :ref:`protocol` messages or user data).
The communication layer encodes those Python objects and ships them
between the distributed endpoints.  It can select between different
transport implementations, depending on user choice or (possibly)
internal optimizations.

The communication layer lives in the :mod:`distributed.comm` package.


Addresses
=========

Communication addresses are canonically represented as URIs, such as
``tcp://127.0.0.1:1234``.  For compatibility with existing code, if the
URI scheme is omitted, a default scheme of ``tcp`` is assumed (so
``127.0.0.1:456`` is really the same as ``tcp://127.0.0.1:456``).
The default scheme may change in the future.
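
The default-scheme rule can be illustrated with a small standard-library
sketch.  The helpers ``split_address`` and ``join_address`` are hypothetical
names used only for illustration; they are not the implementation of the
functions in :mod:`distributed.comm` and only mirror the behaviour described
above.

```python
DEFAULT_SCHEME = "tcp"  # assumption: the default scheme described in the text


def split_address(addr, default_scheme=DEFAULT_SCHEME):
    """Split an address into (scheme, location), filling in the default scheme."""
    scheme, sep, loc = addr.rpartition("://")
    if not sep:
        # No "://" present: the whole string is the location.
        scheme = default_scheme
    return scheme, loc


def join_address(scheme, loc):
    """Rebuild the canonical URI form from its two parts."""
    return f"{scheme}://{loc}"


short = split_address("127.0.0.1:456")
full = split_address("tcp://127.0.0.1:456")
canonical = join_address(*short)
```

Both spellings yield the same ``(scheme, location)`` pair, which is why code
that compares addresses would typically canonicalize them first.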

The following schemes are currently implemented in the ``distributed``
source tree:

* ``tcp`` is the main transport; it uses TCP sockets and allows for IPv4
  and IPv6 addresses.

* ``tls`` is a secure transport using the well-known `TLS protocol`_ over
  TCP sockets.  Using it requires specifying keys and
  certificates as outlined in :ref:`tls`.

* ``inproc`` is an in-process transport using simple object queues; it
  eliminates serialization and I/O overhead, providing almost zero-cost
  communication between endpoints as long as they are situated in the
  same process.
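
To see why a queue-based transport can skip serialization, consider the
following toy sketch.  ``QueueComm`` and ``make_pair`` are hypothetical names
and this is not the actual ``inproc`` implementation; it only shows that an
object passed through an in-process queue arrives as the very same object.

```python
import asyncio


class QueueComm:
    """Toy endpoint: writes go to the peer's queue, reads come from our own."""

    def __init__(self, inbox, outbox):
        self._inbox = inbox
        self._outbox = outbox

    async def write(self, msg):
        # The object itself is enqueued; nothing is encoded or copied.
        await self._outbox.put(msg)

    async def read(self):
        return await self._inbox.get()


def make_pair():
    """Create two connected endpoints sharing a pair of queues."""
    a_to_b, b_to_a = asyncio.Queue(), asyncio.Queue()
    return QueueComm(b_to_a, a_to_b), QueueComm(a_to_b, b_to_a)


async def demo():
    left, right = make_pair()
    payload = {"data": [1, 2, 3]}
    await left.write(payload)
    received = await right.read()
    return payload, received


payload, received = asyncio.run(demo())
same_object = received is payload
```

Because the receiver gets a reference to the sender's object, mutating it on
one side is visible on the other, which is a trade-off of avoiding copies.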

Some URIs may be valid for listening but not for connecting.
For example, the URI ``tcp://`` will listen on all IPv4 and IPv6 addresses
and on an arbitrary port, but you cannot connect to that address.
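
The difference between a listening address and a connectable one can be
sketched with plain sockets.  This example uses the standard ``socket``
module rather than :mod:`distributed.comm`, and it binds IPv4 only for
brevity; the wildcard host and port ``0`` are placeholders that the operating
system resolves when the socket is bound.

```python
import socket

# Listen on the IPv4 wildcard address with port 0, letting the OS pick a port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.settimeout(5)
server.bind(("0.0.0.0", 0))
server.listen(1)
host, port = server.getsockname()

# A peer needs a concrete destination: here the loopback interface plus the
# port that the OS actually assigned to the listener.
client = socket.create_connection(("127.0.0.1", port), timeout=5)
conn, _peer = server.accept()
connected_to = client.getpeername()

for sock in (client, conn, server):
    sock.close()
```

The listener was created from an address with no usable port, while the
client had to be given the concrete host and port reported after binding.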

Higher-level APIs in ``distributed`` may accept other address formats for
convenience or compatibility, for example a ``(host, port)`` pair.  However,
the abstract communications layer always deals with URIs.
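
As an illustration of how a convenience format might be reduced to a URI
before reaching the communications layer, here is a minimal sketch.  The
helper ``coerce_to_uri`` is hypothetical and is not part of ``distributed``;
the bracketing of IPv6 hosts is an assumption based on common URI practice.

```python
def coerce_to_uri(addr, default_scheme="tcp"):
    """Return a URI string for a URI, a bare string, or a (host, port) pair."""
    if isinstance(addr, tuple):
        host, port = addr
        if ":" in host and not host.startswith("["):
            # Bracket IPv6 literals so the port separator stays unambiguous.
            host = f"[{host}]"
        addr = f"{host}:{port}"
    if "://" not in addr:
        addr = f"{default_scheme}://{addr}"
    return addr


from_pair = coerce_to_uri(("127.0.0.1", 1234))
from_ipv6_pair = coerce_to_uri(("::1", 8786))
from_uri = coerce_to_uri("tls://10.0.0.1:8786")
```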


.. _TLS protocol: https://en.wikipedia.org/wiki/Transport_Layer_Security


Functions
---------

There are a number of top-level functions in :mod:`distributed.comm`
to help deal with addresses:

.. autofunction:: distributed.comm.parse_address

.. autofunction:: distributed.comm.unparse_address

.. autofunction:: distributed.comm.normalize_address

.. autofunction:: distributed.comm.resolve_address

.. autofunction:: distributed.comm.get_address_host


Communications API
==================

The basic unit for dealing with established communications is the ``Comm``
object:

.. autoclass:: distributed.comm.Comm
   :members:

You don't create ``Comm`` objects directly: you either ``listen`` for
incoming communications, or ``connect`` to a peer listening for connections:

.. autofunction:: distributed.comm.connect

.. autofunction:: distributed.comm.listen

Listener objects expose the following interface:

.. autoclass:: distributed.comm.core.Listener
   :members:


Extending the Communication Layer
=================================

Each transport is represented by a URI scheme (such as ``tcp``) and
backed by a dedicated :class:`Backend` implementation, which provides
entry points into all transport-specific routines.

Out-of-tree backends can be registered under the group ``distributed.comm.backends``
in setuptools `entry_points`_. For example, a hypothetical ``dask_udp`` package
would register its UDP backend class by including the following in its ``setup.py`` file:

.. code-block:: python

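    from setuptools import setup

    setup(
        name="dask_udp",
        entry_points={
            # Maps the "udp" URI scheme to the backend class.
            "distributed.comm.backends": [
                "udp=dask_udp.backend:UDPBackend",
            ]
        },
        # ... other setup() arguments
    )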
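
To check which backends are registered in a given environment, the entry
point group can be inspected with the standard library.  This is a sketch of
one way to enumerate the group, not necessarily how ``distributed`` loads
backends internally; ``discover_backends`` is a hypothetical helper name.

```python
from importlib.metadata import entry_points

GROUP = "distributed.comm.backends"


def discover_backends(group=GROUP):
    """Map scheme names to their (not yet loaded) entry points."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ returns an object with a select() method.
        selected = eps.select(group=group)
    else:
        # Python 3.8 and 3.9 return a dict of lists keyed by group name.
        selected = eps.get(group, [])
    return {ep.name: ep for ep in selected}


backends = discover_backends()
for scheme, ep in sorted(backends.items()):
    print(f"{scheme} -> {ep.value}")
```

With the hypothetical ``dask_udp`` package installed, the mapping would be
expected to contain a ``udp`` key; calling ``ep.load()`` on an entry would
import and return the referenced class.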

.. autoclass:: distributed.comm.registry.Backend
   :members:
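
The idea of one backend per URI scheme can be sketched as a small registry
that dispatches on the scheme.  Everything below is a toy: ``ToyTCPBackend``,
``register``, ``get_backend`` and ``address_host`` are hypothetical names, and
the real interface is the ``Backend`` class documented above.

```python
class ToyTCPBackend:
    """Hypothetical backend exposing a single transport-specific routine."""

    def get_address_host(self, loc):
        # Assumes a "host:port" location; brackets around IPv6 hosts are dropped.
        host, _sep, _port = loc.rpartition(":")
        return host.strip("[]")


_registry = {}  # scheme name -> backend instance


def register(scheme, backend):
    _registry[scheme] = backend


def get_backend(scheme):
    try:
        return _registry[scheme]
    except KeyError:
        raise ValueError(f"unknown address scheme {scheme!r}") from None


def address_host(addr, default_scheme="tcp"):
    """Pick the backend from the URI scheme, then delegate to it."""
    scheme, sep, loc = addr.rpartition("://")
    if not sep:
        scheme = default_scheme
    return get_backend(scheme).get_address_host(loc)


register("tcp", ToyTCPBackend())
host = address_host("tcp://127.0.0.1:1234")
```

Adding a transport in this sketch only requires registering another object
under a new scheme name; callers keep passing URIs and never branch on the
transport themselves.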

.. _entry_points: https://packaging.python.org/guides/creating-and-discovering-plugins/#using-package-metadata