To run more than one piece of code at the same time on the same computer, one can use either multiple processes or multiple threads.
Although a program can be made up of multiple processes, these processes are in effect completely independent of one another: different processes are not able to cooperate with one another unless one sets up some means of communication between them (such as by using sockets). If a lot of data must be transferred between processes then this can be inefficient.
On the other hand, multiple threads within a single process are intimately connected: they share their data but often can interfere badly with one another. It is often argued that the only way to make multithreaded programming "easy" is to avoid relying on any shared state and for the threads to only communicate by passing messages to each other.
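The message-passing style mentioned above can be sketched with two threads that share nothing except a pair of queues. This is only an illustration, written for Python 3, where the module is named `queue` (an assumption; the listings in this document use Python 2, where it is `Queue`). The names `worker` and `run_demo` are hypothetical.

```python
import queue
import threading


def worker(inbox, outbox):
    """Square each number received until the None sentinel arrives."""
    while True:
        item = inbox.get()
        if item is None:          # sentinel: no more work
            break
        outbox.put(item * item)


def run_demo(numbers):
    """Send numbers to a worker thread and collect the replies in order."""
    inbox, outbox = queue.Queue(), queue.Queue()
    t = threading.Thread(target=worker, args=(inbox, outbox))
    t.start()
    for n in numbers:
        inbox.put(n)
    inbox.put(None)
    t.join()
    return [outbox.get() for _ in numbers]
```

Because the two threads only ever touch the queues, neither needs an explicit lock of its own.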
CPython has a Global Interpreter Lock (GIL) which in many ways makes threading easier than it is in most languages by making sure that only one thread can manipulate the interpreter's objects at a time. As a result, it is often safe to let multiple threads access data without using any additional locking as one would need to in a language such as C.
One downside of the GIL is that on multi-processor (or multi-core) systems a multithreaded Python program can only make use of one processor at a time. This is a problem that can be overcome by using multiple processes instead.
Python gives little direct support for writing programs using multiple processes. This package allows one to write multi-process programs using much the same API that one uses for writing threaded programs.
There are two ways of creating a new process in Python:
The current process can fork a new child process by using the os.fork() function. This effectively creates an identical copy of the current process which is now able to go off and perform some task set by the parent process. This means that the child process inherits copies of all variables that the parent process had.
However, os.fork() is not available on every platform: in particular Windows does not support it.
Alternatively, the current process can spawn a completely new Python interpreter by using the subprocess module or one of the os.spawn*() functions.
Getting this new interpreter into a fit state to perform the task set for it by its parent process is, however, a bit of a challenge.
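The difficulty with a freshly spawned interpreter can be seen in a small sketch that uses only the standard `subprocess` module. It is written for Python 3 (an assumption), and `CHILD_CODE` and `spawn_demo` are hypothetical names. The new interpreter starts with none of the parent's variables, so everything it needs has to be handed over explicitly, here through standard input.

```python
import subprocess
import sys

# Code the fresh interpreter will run. It starts with none of the
# parent's variables, so the input has to arrive via standard input.
CHILD_CODE = "import sys; print(sys.stdin.read().upper())"


def spawn_demo(text):
    """Run CHILD_CODE in a new interpreter and return what it printed."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        input=text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Anything more complex than a string would have to be serialized by the parent and reconstructed by the child.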
The processing package uses os.fork() if it is available since it makes life a lot simpler. Forking the process is also more efficient in terms of memory usage and the time needed to create the new process.
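The copy semantics of os.fork() can be illustrated directly. The following is a minimal sketch for Python 3 on a platform that provides os.fork() (so not Windows). The function name `fork_demo` is hypothetical, and this is not how the processing package itself is implemented.

```python
import os


def fork_demo():
    """Fork a child, let it change its copy of a variable, and report back.

    Returns (value seen by the child, value still held by the parent).
    """
    counter = 10
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: works on its own copy of `counter`.
        os.close(read_fd)
        counter += 1
        os.write(write_fd, str(counter).encode())
        os.close(write_fd)
        os._exit(0)               # leave without running parent cleanup code
    # Parent: wait for the child's message, then reap the child.
    os.close(write_fd)
    child_value = int(os.read(read_fd, 32).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)
    return child_value, counter
```

The child starts with the parent's value of `counter` already in place, but its increment never reaches the parent; the result has to be sent back over the pipe.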
In the processing package processes are spawned by creating a Process object and then calling its start() method. processing.Process follows the API of threading.Thread. A trivial example of a multiprocess program is

    from processing import Process

    def f(name):
        print 'hello', name

    if __name__ == '__main__':
        p = Process(target=f, args=('bob',))
        p.start()
        p.join()

Here the function f is run in a child process.
For an explanation of why (on Windows) the if __name__ == '__main__' part is necessary see Programming guidelines.
processing supports two types of communication channel between processes:
The function Queue() returns a near clone of Queue.Queue -- see the Python standard documentation. For example

    from processing import Process, Queue

    def f(q):
        q.put([42, None, 'hello'])

    if __name__ == '__main__':
        q = Queue()
        p = Process(target=f, args=(q,))
        p.start()
        print q.get()    # prints "[42, None, 'hello']"
        p.join()

Queues are thread and process safe. See Queues.
The Pipe() function returns a pair of connection objects connected by a pipe which by default is duplex (two-way). For example

    from processing import Process, Pipe

    def f(conn):
        conn.send([42, None, 'hello'])
        conn.close()

    if __name__ == '__main__':
        parent_conn, child_conn = Pipe()
        p = Process(target=f, args=(child_conn,))
        p.start()
        print parent_conn.recv()   # prints "[42, None, 'hello']"
        p.join()

The two connection objects returned by Pipe() represent the two ends of the pipe. Each connection object has send() and recv() methods (among others). Note that data in a pipe may become corrupted if two processes (or threads) try to read from or write to the same end of the pipe at the same time. Of course there is no risk of corruption from processes using different ends of the pipe at the same time. See Pipes.
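One way to respect the warning about sharing one end of a pipe is to guard that end with a lock. The sketch below uses threads for brevity. It relies on the Python 3 standard library's `multiprocessing` module in place of processing, on the assumption that its Pipe() behaves alike. The names `sender` and `shared_end_demo` are hypothetical.

```python
import threading
from multiprocessing import Pipe


def sender(conn, lock, label, count):
    """Send `count` messages through a connection end shared with other threads."""
    for i in range(count):
        with lock:                # only one thread writes to this end at a time
            conn.send((label, i))


def shared_end_demo(n_threads=4, count=25):
    """Let several threads write to one end while the caller reads the other."""
    recv_end, send_end = Pipe(duplex=False)   # one-way pipe: (receive, send)
    lock = threading.Lock()
    threads = [
        threading.Thread(target=sender, args=(send_end, lock, t, count))
        for t in range(n_threads)
    ]
    for t in threads:
        t.start()
    received = [recv_end.recv() for _ in range(n_threads * count)]
    for t in threads:
        t.join()
    return sorted(received)
```

The reading end needs no lock here because only one thread ever calls recv() on it.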
processing contains equivalents of all the synchronization primitives from threading. For instance one can use a lock to ensure that only one process prints to standard output at a time:

    from processing import Process, Lock

    def f(l, i):
        l.acquire()
        print 'hello world', i
        l.release()

    if __name__ == '__main__':
        lock = Lock()

        for num in range(10):
            Process(target=f, args=(lock, num)).start()

Without the lock, output from the different processes is liable to get mixed up.
As mentioned above, when doing concurrent programming it is usually best to avoid using shared state as far as possible. This is particularly true when using multiple processes.
However, if you really do need to use some shared data then processing provides a couple of ways of doing so.
Data can be stored in a shared memory map using Value or Array. For example the following code

    from processing import Process, Value, Array

    def f(n, a):
        n.value = 3.1415927
        for i in range(len(a)):
            a[i] = -a[i]

    if __name__ == '__main__':
        num = Value('d', 0.0)
        arr = Array('i', range(10))

        p = Process(target=f, args=(num, arr))
        p.start()
        p.join()

        print num.value
        print arr[:]

will print

    3.1415927
    [0, -1, -2, -3, -4, -5, -6, -7, -8, -9]

The 'd' and 'i' arguments used when creating num and arr are typecodes of the kind used by the array module: 'd' indicates a double precision float and 'i' indicates a signed integer. These shared objects will be process and thread safe.
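The meaning of the two typecodes can be checked with the standard `array` module on its own. The sketch below (Python 3, with the hypothetical helper `accepts`) uses ordinary process-local arrays, not shared memory; it only shows which values each typecode will store.

```python
from array import array

# 'd' stores C doubles, 'i' stores signed C ints.
num = array('d', [0.0])
arr = array('i', range(10))

num[0] = 3.1415927
for i in range(len(arr)):
    arr[i] = -arr[i]


def accepts(typecode, value):
    """Return True if an array with this typecode can hold `value`."""
    try:
        array(typecode, [value])
        return True
    except TypeError:
        return False
```

For instance, `accepts('i', 1.5)` is false because an 'i' array rejects floats, while `accepts('d', 1)` is true because an integer converts cleanly to a double.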
For more flexibility in using shared memory one can use the processing.sharedctypes module which supports the creation of arbitrary ctypes objects allocated from shared memory.
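As a rough illustration of sharing an arbitrary ctypes type, the sketch below places an array of structures in shared memory and lets a child process modify it. It uses `multiprocessing.sharedctypes` from the Python 3 standard library as a stand-in for processing.sharedctypes (an assumption), and it requests the "fork" start method, so it assumes a POSIX system. `Point`, `scale` and `shared_points_demo` are hypothetical names.

```python
import ctypes
import multiprocessing
from multiprocessing.sharedctypes import RawArray


class Point(ctypes.Structure):
    _fields_ = [("x", ctypes.c_double), ("y", ctypes.c_double)]


def scale(points, factor):
    """Runs in the child: modify the shared structures in place."""
    for p in points:
        p.x *= factor
        p.y *= factor


def shared_points_demo():
    """Return the points as seen by the parent after the child has run."""
    ctx = multiprocessing.get_context("fork")   # assumption: POSIX only
    # RawArray has no lock; that is fine with a single writer.
    points = RawArray(Point, [(1.0, 2.0), (3.0, 4.0)])
    child = ctx.Process(target=scale, args=(points, 10.0))
    child.start()
    child.join()
    return [(p.x, p.y) for p in points]
```

Unlike an ordinary variable inherited across a fork, the array lives in memory that both processes map, so the parent sees the child's changes.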
A manager object returned by Manager() controls a server process which holds python objects and allows other processes to manipulate them using proxies.
A manager returned by Manager() will support types list, dict, Namespace, Lock, RLock, Semaphore, BoundedSemaphore, Condition, Event, Queue, Value and Array. For example:

    from processing import Process, Manager

    def f(d, l):
        d[1] = '1'
        d['2'] = 2
        d[0.25] = None
        l.reverse()

    if __name__ == '__main__':
        manager = Manager()

        d = manager.dict()
        l = manager.list(range(10))

        p = Process(target=f, args=(d, l))
        p.start()
        p.join()

        print d
        print l

will print

    {0.25: None, 1: '1', '2': 2}
    [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

Creating managers which support other types is not hard; see Customized managers.
Server process managers are more flexible than using shared memory objects because they can be made to support arbitrary object types. Also, a single manager can be shared by processes on different computers over a network. They are, however, slower than using shared memory. See Server process managers.
The Pool() function returns an object representing a pool of worker processes. It has methods which allow tasks to be offloaded to the worker processes in a few different ways.
For example:

    from processing import Pool

    def f(x):
        return x*x

    if __name__ == '__main__':
        pool = Pool(processes=4)              # start 4 worker processes
        result = pool.applyAsync(f, [10])     # evaluate "f(10)" asynchronously
        print result.get(timeout=1)           # prints "100" unless your computer is *very* slow
        print pool.map(f, range(10))          # prints "[0, 1, 4,..., 81]"

See Process pools.
The following benchmarks were performed on a single-core Pentium 4, 2.5 GHz laptop running Windows XP and Ubuntu Linux 6.10; see benchmarks.py.
Number of 256 byte string objects passed between processes/threads per sec:
Connection type | Windows | Linux |
---|---|---|
Queue.Queue | 49,000 | 17,000-50,000 [1] |
processing.Queue | 22,000 | 21,000 |
Queue managed by server | 6,900 | 6,500 |
processing.Pipe | 52,000 | 57,000 |

[1] For some reason the performance of Queue.Queue is very variable on Linux.

Number of acquires/releases of a lock per sec:
Lock type | Windows | Linux |
---|---|---|
threading.Lock | 850,000 | 560,000 |
processing.Lock | 420,000 | 510,000 |
Lock managed by server | 10,000 | 8,400 |
threading.RLock | 93,000 | 76,000 |
processing.RLock | 420,000 | 500,000 |
RLock managed by server | 8,800 | 7,400 |
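
A figure of this kind can be estimated with a simple timing loop. The sketch below is not the benchmarks.py script and may not match its method: it times uncontended acquire/release pairs on the threading locks only, in Python 3. The helper `lock_rate` and its `iterations` parameter are hypothetical.

```python
import threading
import time


def lock_rate(lock, iterations=100000):
    """Return acquire/release pairs per second for an uncontended lock."""
    start = time.perf_counter()
    for _ in range(iterations):
        lock.acquire()
        lock.release()
    elapsed = time.perf_counter() - start
    return iterations / elapsed


rates = {
    "threading.Lock": lock_rate(threading.Lock()),
    "threading.RLock": lock_rate(threading.RLock()),
}
```

Absolute numbers depend heavily on the interpreter version and hardware, so results from a loop like this are best compared only against each other on the same machine.
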
Number of interleaved waits/notifies per sec on a condition variable by two processes:
Condition type | Windows | Linux |
---|---|---|
threading.Condition | 27,000 | 31,000 |
processing.Condition | 26,000 | 25,000 |
Condition managed by server | 6,600 | 6,000 |
Number of integers retrieved from a sequence per sec:
Sequence type | Windows | Linux |
---|---|---|
list | 6,400,000 | 5,100,000 |
unsynchronized shared array | 3,900,000 | 3,100,000 |
synchronized shared array | 200,000 | 220,000 |
list managed by server | 20,000 | 17,000 |