File: concurrent.rst

.. _label-parslpool:

ParslPoolExecutor Interface
===========================

The :class:`parsl.concurrent.ParslPoolExecutor` implements the "Executor" interface
from the :mod:`concurrent.futures` module in standard Python.
This reduced interface makes it easier to adopt Parsl in applications that already
achieve single-node parallelism with :class:`~concurrent.futures.ProcessPoolExecutor`.
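
For example, an application that already uses :class:`~concurrent.futures.ProcessPoolExecutor`
can switch to Parsl-managed workers by changing only how the executor is constructed.
The sketch below is illustrative: it assumes the example configuration shipped as
``parsl.configs.htex_local`` and a toy ``square`` function.

.. code-block:: python

    from concurrent.futures import ProcessPoolExecutor

    from parsl.concurrent import ParslPoolExecutor
    from parsl.configs.htex_local import config  # example local configuration

    def square(x):
        return x * x

    if __name__ == '__main__':
        # Existing single-node version
        with ProcessPoolExecutor(max_workers=4) as pool:
            print(list(pool.map(square, range(8))))

        # The same logic, with Parsl supplying the workers
        with ParslPoolExecutor(config=config) as pool:
            print(list(pool.map(square, range(8))))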

Creating a ParslPoolExecutor
----------------------------

.. note::

    See the :ref:`Configuration section <configuration-section>` for how to define computational resources.

Create a :class:`~parsl.concurrent.ParslPoolExecutor` using one of two methods:

1. Supplying a Parsl :class:`~parsl.Config` that defines how to create new workers.
   The executor will start a new Parsl Data Flow Kernel (DFK) when it is entered as a context manager.

   .. code-block:: python

       from parsl.concurrent import ParslPoolExecutor
       from parsl.configs.htex_local import config  # Mimics a local process pool

       with ParslPoolExecutor(config=config) as pool:
           ...

   All resources will be closed when the block exits.
   A sketch of building a ``Config`` by hand appears after this list.


2. Supplying an already-started Parsl :class:`~parsl.DataFlowKernel` (DFK).
   The Parsl DFK must be started and stopped separately from the executor.

   .. code-block:: python

       from parsl.concurrent import ParslPoolExecutor
       from parsl.configs.htex_local import config
       import parsl

       with parsl.load(config) as dfk:
           with ParslPoolExecutor(dfk=dfk) as pool:
               ...
           ...

   Parsl will only shut down when the outer block exits.
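
Instead of importing one of Parsl's example configurations, a ``Config`` can also be
built directly. The sketch below relies on the defaults of ``HighThroughputExecutor``
(a local provider with one block of workers); see the
:ref:`Configuration section <configuration-section>` for options matching your resources.

.. code-block:: python

    from parsl.concurrent import ParslPoolExecutor
    from parsl.config import Config
    from parsl.executors import HighThroughputExecutor

    # A single local high-throughput executor using its default settings
    local_config = Config(
        executors=[HighThroughputExecutor(label='local_htex')],
    )

    with ParslPoolExecutor(config=local_config) as pool:
        ...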

Use multiple types of resources within the same program
by creating multiple :class:`~parsl.concurrent.ParslPoolExecutor` instances,
each mapped to a different type of Parsl worker (also called an "executor").
Start by loading a Parsl :class:`~parsl.Config` that includes :ref:`multiple executors <config-multiple>`.
Then create each ``ParslPoolExecutor`` with the list of "executors" its tasks should use.

.. code-block:: python

    with parsl.load(hybrid_config) as dfk, \
            ParslPoolExecutor(dfk=dfk, executors=['gpu']) as pool_gpu, \
            ParslPoolExecutor(dfk=dfk, executors=['cpu']) as pool_cpu:
        ...
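
The ``hybrid_config`` above is assumed to define executors labelled ``'cpu'`` and
``'gpu'``. A minimal sketch of such a configuration (with placeholder options) might
look like the following.

.. code-block:: python

    from parsl.config import Config
    from parsl.executors import HighThroughputExecutor

    # Two executors distinguished by label; a realistic configuration would
    # also set providers, worker counts, accelerators, and so on
    hybrid_config = Config(
        executors=[
            HighThroughputExecutor(label='cpu'),
            HighThroughputExecutor(label='gpu', available_accelerators=2),
        ],
    )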

Using a ParslPoolExecutor
-------------------------

The :class:`~parsl.concurrent.ParslPoolExecutor` supports all functions from
the :class:`~concurrent.futures.Executor` interface *except task cancellation*.

The ``submit`` and ``map`` functions behave just as they do in :class:`~concurrent.futures.ProcessPoolExecutor`,
and additionally support the task chaining available in App-based Parsl workflows.

.. code-block:: python

    from parsl.concurrent import ParslPoolExecutor
    from parsl.configs.htex_local import config  # Mimics a local process pool

    def f(x):
        return x + 1

    with ParslPoolExecutor(config=config) as pool:
        # Submit a task then a task which depends on the result
        future_1 = pool.submit(f, 1)
        future_2 = pool.submit(f, future_1)

        assert future_1.result() == 2
        assert future_2.result() == 3
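
``map`` likewise mirrors :meth:`~concurrent.futures.Executor.map`. The sketch below
reuses ``f`` and ``config`` from the example above and simply spreads a range of
inputs across the Parsl workers.

.. code-block:: python

    with ParslPoolExecutor(config=config) as pool:
        # Each input becomes a separate Parsl task; results are yielded
        # in input order, as with ProcessPoolExecutor.map
        results = list(pool.map(f, range(4)))
        assert results == [1, 2, 3, 4]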

Tasks from the Executor interface and the App-based interface may also be used together.

.. code-block:: python

    from parsl import python_app
    from parsl.concurrent import ParslPoolExecutor

    def f(x):
        return x + 1

    @python_app
    def parity(x):
        return 'odd' if x % 2 == 1 else 'even'

    with ParslPoolExecutor(config=my_parsl_config) as executor:
        future_1 = executor.submit(f, 1)
        assert parity(future_1).result() == 'even'  # Function chaining, as expected


Differences
-----------

The differences between the Parsl-based :class:`~parsl.concurrent.ParslPoolExecutor`
and the Python :class:`~concurrent.futures.ProcessPoolExecutor` are:

1. *Task Cancellation*: Parsl does not support canceling tasks once submitted.

2. *Defining Functions*: Functions defined in modules work the same in both
   Parsl and the ``ProcessPoolExecutor``. However, those defined during execution
   (e.g., in the "main" file) behave differently.

   Parsl will serialize functions defined at runtime, but they will not be able
   to access global variables (as is also the case when using the
   `spawn start method <https://docs.python.org/3/library/multiprocessing.html#the-spawn-and-forkserver-start-methods>`_
   with the ``ProcessPoolExecutor``), which means modules must be imported
   inside the function; see the sketch after this list.
   Follow the rules :ref:`in the App guide <function-rules>`.

3. *No Multiprocessing Objects*: Tools such as the :class:`multiprocessing.Queue` and
   `synchronization primitives <https://docs.python.org/3/library/multiprocessing.html#synchronization-primitives>`_
   are not compatible with ``ParslPoolExecutor``.

4. *Worker Initialization*: The ``initializer`` and ``initargs`` arguments accepted by
   :class:`~concurrent.futures.ProcessPoolExecutor` are not supported.
   Configure workers using the ``Config`` object instead.
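
As noted in item 2 above, a function defined at runtime must import the modules it uses
inside its own body. A minimal sketch, again using Parsl's example local configuration
(the ``hypotenuse`` function is illustrative):

.. code-block:: python

    from parsl.concurrent import ParslPoolExecutor
    from parsl.configs.htex_local import config

    def hypotenuse(a, b):
        # Import inside the function body so the module is available when the
        # serialized function runs on a Parsl worker
        import math
        return math.sqrt(a ** 2 + b ** 2)

    with ParslPoolExecutor(config=config) as pool:
        assert pool.submit(hypotenuse, 3, 4).result() == 5.0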