.. _max_active_tasks:

Maximum number of active tasks
==============================

When your tasks take up a lot of memory, you can do two things:

- Limit the number of jobs.
- Limit the number of active tasks (i.e., the number of tasks currently available to the workers: tasks that are in the
  queue, ready to be processed).

The first option is the most obvious way to save memory when the processes themselves use a lot of memory. The second is
convenient when the argument list takes up too much memory. For example, suppose you want to kick off an enormous number
of tasks (let's say a billion) whose arguments take up 1 KB each (e.g., large strings). The task queue would then
take up ~1 TB of memory!
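
The arithmetic behind that estimate can be checked with a few lines of plain Python. This is only a back-of-the-envelope
sketch: ``queue_memory_bytes`` is a hypothetical helper, not part of MPIRE, and the cap of 8,000 queued tasks is an
arbitrary illustrative value.

.. code-block:: python

    def queue_memory_bytes(n_queued_tasks, bytes_per_task):
        """Memory needed to hold the arguments of all queued tasks."""
        return n_queued_tasks * bytes_per_task


    KB = 1000
    TB = 1000 ** 4

    # One billion tasks with 1 KB of arguments each, all queued at once
    unbounded = queue_memory_bytes(10 ** 9, 1 * KB)
    unbounded_tb = unbounded / TB

    # Hypothetical cap of 8,000 queued tasks (illustrative value only)
    capped = queue_memory_bytes(8_000, 1 * KB)
    capped_mb = capped / 1000 ** 2

With these numbers the unbounded queue needs 1 TB, while the capped queue needs 8 MB.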

In such cases, a good rule of thumb is to have twice as many active chunks of tasks as there are jobs. This means that
when all workers complete their chunk at the same time, each can immediately continue with another one. As workers take
on their new chunks, the task generator is advanced until there are again twice as many active chunks as there are
jobs.
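
The rule of thumb above can be illustrated with a small single-threaded simulation. This is a minimal sketch under
simplifying assumptions (all workers finish one chunk in lockstep); it is not MPIRE's actual implementation, and the
names ``chunked`` and ``simulate`` are hypothetical.

.. code-block:: python

    from collections import deque
    from itertools import islice


    def chunked(iterable, chunk_size):
        """Yield lists of up to ``chunk_size`` items, pulling lazily from the iterable."""
        iterator = iter(iterable)
        while True:
            chunk = list(islice(iterator, chunk_size))
            if not chunk:
                return
            yield chunk


    def simulate(tasks, n_jobs, chunk_size):
        """Process all tasks while keeping at most ``2 * n_jobs`` chunks queued.

        Returns a tuple (number of tasks processed, peak number of queued tasks).
        """
        max_active_chunks = 2 * n_jobs
        chunks = chunked(tasks, chunk_size)
        active = deque(islice(chunks, max_active_chunks))  # initial fill
        processed = 0
        peak = sum(len(chunk) for chunk in active)
        while active:
            # Every worker finishes one chunk "at the same time"
            for _ in range(min(n_jobs, len(active))):
                processed += len(active.popleft())
            # Advance the task generator until the queue is back at the limit
            active.extend(islice(chunks, max_active_chunks - len(active)))
            peak = max(peak, sum(len(chunk) for chunk in active))
        return processed, peak


    processed, peak = simulate(range(1000), n_jobs=4, chunk_size=10)

In this run all 1,000 tasks are processed, yet no more than ``2 * 4 * 10 = 80`` of them are ever held in the queue at
once, regardless of how long the input is.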

In MPIRE, the maximum number of active tasks is set to ``n_jobs * chunk_size * 2`` by default, so you don't have to
tweak it for memory optimization. If, for whatever reason, you want to change this behavior, you can do so by setting
the ``max_tasks_active`` parameter:

.. code-block:: python

    from mpire import WorkerPool

    # ``task`` is any function that accepts a single argument
    with WorkerPool(n_jobs=4) as pool:
        results = pool.map(task, range(int(1e300)), iterable_len=int(1e300),
                           chunk_size=int(1e5), max_tasks_active=4 * int(1e5))

.. note::

    Setting the ``max_tasks_active`` parameter to a value lower than ``n_jobs * chunk_size`` can leave some workers
    without any work to do.
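
The reasoning behind this note can be made concrete with a rough model. The sketch below assumes that a worker only
receives work in whole chunks and that the limit is counted in tasks; ``workers_with_work`` is a hypothetical helper
and does not reflect MPIRE's actual scheduling logic.

.. code-block:: python

    def workers_with_work(n_jobs, chunk_size, max_tasks_active):
        """Simplified model: how many workers can hold a full chunk at the same time."""
        chunks_in_flight = max_tasks_active // chunk_size
        return min(n_jobs, chunks_in_flight)


    n_jobs, chunk_size = 4, 100_000

    default_limit = workers_with_work(n_jobs, chunk_size, n_jobs * chunk_size * 2)
    exact_limit = workers_with_work(n_jobs, chunk_size, n_jobs * chunk_size)
    low_limit = workers_with_work(n_jobs, chunk_size, 2 * chunk_size)

Under this model, the default and ``n_jobs * chunk_size`` limits keep all four workers busy, while a limit of
``2 * chunk_size`` only provides enough chunks for two of them.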