File: start_method.rst

.. _start_methods:

Process start method
====================

.. contents:: Contents
    :depth: 2
    :local:

The ``multiprocessing`` package allows you to start processes using a few different methods: ``'fork'``, ``'spawn'``, or
``'forkserver'``. Threading is also available by using ``'threading'``. For detailed information on the multiprocessing
contexts, please refer to the multiprocessing documentation_ and caveats_ sections. In short:

fork
    Copies the parent process such that the child process is effectively identical, including everything currently in
    memory. This is sometimes useful, but copying a large memory footprint can also become a serious bottleneck.
    ``fork`` enables the use of copy-on-write shared objects (see :ref:`shared_objects`).
spawn
    Starts a fresh Python interpreter process in which only the necessary resources are inherited.
forkserver
    First starts a server process (using ``'spawn'``). Whenever a new process is needed, the parent process requests the
    server to fork a new process.
threading
    Starts child threads instead of processes. Threads are subject to the Global Interpreter Lock (GIL), but work fine
    for I/O-intensive tasks.

For an overview of start method availability and defaults, please refer to the following table:

.. list-table::
    :header-rows: 1

    * - Start method
      - Available on Unix
      - Available on Windows
    * - ``fork``
      - Yes (default)
      - No
    * - ``spawn``
      - Yes
      - Yes (default)
    * - ``forkserver``
      - Yes
      - No
    * - ``threading``
      - Yes
      - Yes
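
Availability and the platform default can also be inspected at runtime through the standard library, which
``multiprocessing`` (and therefore MPIRE) builds on. A minimal sketch:

.. code-block:: python

    import multiprocessing as mp

    # 'spawn' is available on every platform; 'fork' and 'forkserver' are
    # Unix-only, so check availability before requesting a specific method
    available = mp.get_all_start_methods()
    print(available)

    # The default start method for the current platform (see the table above)
    default = mp.get_start_method()
    print(default)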

Spawn and forkserver
--------------------

When using ``spawn`` or ``forkserver`` as start method, be aware that global variables (constants are fine) might have a
different value than you expect, because the child processes do not inherit the parent's memory. You also have to import
packages within the called function:

.. code-block:: python

    import os

    from mpire import WorkerPool

    def failing_job(folder, filename):
        return os.path.join(folder, filename)

    # This will fail because the module-level 'os' import is not available in
    # the child processes
    with WorkerPool(n_jobs=2, start_method='spawn') as pool:
        pool.map(failing_job, [('folder', '0.p3'), ('folder', '1.p3')])

.. code-block:: python

    from mpire import WorkerPool

    def working_job(folder, filename):
        import os
        return os.path.join(folder, filename)

    # This will work because 'os' is imported within the function itself
    with WorkerPool(n_jobs=2, start_method='spawn') as pool:
        pool.map(working_job, [('folder', '0.p3'), ('folder', '1.p3')])

A lot of effort has been put into making the progress bar, dashboard, and nested pools (with multiple progress bars)
work well with ``spawn`` and ``forkserver``, so these features should work out of the box.
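
Note that these caveats do not apply to the ``threading`` start method: threads share the parent's memory, so
module-level imports and globals behave as usual. The sketch below uses the standard library's thread pool as a
stand-in so it runs without MPIRE installed; with MPIRE you would pass ``start_method='threading'`` to ``WorkerPool``
instead.

.. code-block:: python

    import os
    from multiprocessing.pool import ThreadPool

    def job(args):
        folder, filename = args
        return os.path.join(folder, filename)

    # Threads share memory with the parent, so the module-level 'os' import
    # is visible in the workers; no re-import is needed as under 'spawn'
    with ThreadPool(2) as pool:
        results = pool.map(job, [('folder', '0.p3'), ('folder', '1.p3')])
    print(results)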

.. _documentation: https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods
.. _caveats: https://docs.python.org/3/library/multiprocessing.html#the-spawn-and-forkserver-start-methods