Package Scientific :: Package DistributedComputing :: Module MasterSlave

Module MasterSlave

Distributed computing using a master-slave model

The classes in this module provide a simple way to parallelize independent computations in a program. Communication is handled by the Pyro package, which must be installed before this module can be used. Pyro can be obtained from http://pyro.sourceforge.net/. By default, the Pyro name server is used to initialize communication; see the Pyro documentation to learn how to use the name server.

In the master-slave model, a single master process defines computational tasks and any number of slave processes execute them. The master submits task requests and then waits for the results to come in. Each slave waits for a task request, executes it, returns the result, and waits for the next task. Slave processes can be started and terminated independently; the only condition is that no slave process can be started before its master process. This setup makes it possible to perform a lengthy computation using a variable number of processors.
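
The request/execute/return cycle described above can be modelled in a few lines of standard-library Python. This is only a sketch of the pattern: it uses threads and in-process queues instead of Pyro and separate processes, and the names slave, master, and n_slaves are hypothetical, not part of this module.

```python
import queue
import threading


def slave(tasks, results):
    """Wait for a task request, execute it, return the result, repeat."""
    while True:
        request = tasks.get()
        if request is None:  # sentinel: no more work for this slave
            break
        task_id, x = request
        results.put((task_id, x * x))


def master(values, n_slaves=3):
    """Define task requests, then wait for the results to come in."""
    tasks, results = queue.Queue(), queue.Queue()
    # The queues exist before any slave starts, mirroring the rule that
    # a slave cannot be started before its master.
    workers = [threading.Thread(target=slave, args=(tasks, results))
               for _ in range(n_slaves)]
    for w in workers:
        w.start()
    for task_id, x in enumerate(values):
        tasks.put((task_id, x))
    # Results may arrive in any order, so they are keyed by task id.
    collected = dict(results.get() for _ in values)
    for _ in workers:
        tasks.put(None)
    for w in workers:
        w.join()
    return [collected[i] for i in range(len(values))]
```

Because every task is independent, the number of workers changes only how long the job takes, not its result.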

Communication between the master and the slave processes passes through a TaskManager object that is created automatically as part of the master process. The task manager stores and hands out task requests and results. It also keeps track of the slave processes. When a slave process disappears (because it was killed or because of a hardware failure), the task manager re-schedules its active task(s) to another slave process. This makes the master-slave system tolerant of slave failures.
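
The re-scheduling behaviour can be illustrated with a small in-memory model. This is a sketch under assumptions, not the TaskManager implementation: the class ToyTaskManager and all of its method names are hypothetical, and the real object works across processes through Pyro.

```python
from collections import deque


class ToyTaskManager:
    """In-memory model of the bookkeeping described above (no Pyro)."""

    def __init__(self):
        self.waiting = deque()  # task ids not yet handed out
        self.running = {}       # task id -> slave currently executing it
        self.results = {}       # task id -> result awaiting pickup

    def add_task(self, task_id):
        self.waiting.append(task_id)

    def hand_out(self, slave_id):
        """Give the next waiting task to a slave, or None if there is none."""
        if not self.waiting:
            return None
        task_id = self.waiting.popleft()
        self.running[task_id] = slave_id
        return task_id

    def store_result(self, task_id, result):
        del self.running[task_id]
        self.results[task_id] = result

    def slave_disappeared(self, slave_id):
        """Put every task the lost slave was running back in the queue."""
        lost = [t for t, s in self.running.items() if s == slave_id]
        for task_id in lost:
            del self.running[task_id]
            self.waiting.appendleft(task_id)
        return lost
```

In this model a task lost with its slave goes back to the front of the queue, so the next slave that asks for work receives it first. Whether the real task manager uses that ordering is not stated here.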

Each task manager has a label that makes it possible to distinguish between several master-slave groups running at the same time. It is by the label that slave processes identify the master process for which they work.

The script "task_manager" prints statistics about a currently active task manager; it takes the label as an argument. It shows the number of currently active processes (master plus slaves), the number of waiting and running tasks, and the number of results waiting to be picked up.

The script Examples/master_slave_demo.py illustrates the use of the master-slave setup in a simple script. Both master and slave processes are defined in the same script. The scripts Examples/master.py and Examples/slave.py show a master-slave setup using two distinct scripts. This is more flexible because task requests and result retrievals can be made from anywhere in the master code.

Classes
  GlobalStateValue
  MasterProcess
Master process in a master-slave setup
  SlaveProcess
Slave process in a master-slave setup
Functions
  getMachineInfo()
  initializeMasterProcess(label, slave_script=None, slave_module=None, use_name_server=True)
Initializes a master process. Returns a MasterProcess.
  runJob(label, master_class, slave_class, watchdog_period=120.0, launch_slaves=0)
Creates an instance of the master_class and runs it.
  startSlaveProcess(label=None, master_host=None)
Starts a slave process.
Variables
  debug = False
Function Details

initializeMasterProcess(label, slave_script=None, slave_module=None, use_name_server=True)

Initializes a master process.

Parameters:
  • label (str) - the label that identifies the task manager
  • slave_script (str) - the file name of the script that defines the corresponding slave process
  • slave_module (str) - the name of the module that defines the corresponding slave process
  • use_name_server (bool) - If True (default), the task manager is registered with the Pyro name server. If False, the name server is not used and slave processes need to know the host on which the master process is running.
Returns: MasterProcess
a process object on which the methods requestTask() and retrieveResult() can be called.
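
The sketch below shows how master code might drive such a process object. This page names requestTask() and retrieveResult() but does not give their signatures, so the sketch assumes requestTask(tag, *parameters) and a retrieveResult(tag) that returns a (task_id, tag, result) tuple; both should be checked against the MasterProcess class documentation. FakeMaster is a hypothetical in-process stand-in that lets the sketch run without Pyro, and the "square" tag is invented for illustration.

```python
def sum_of_squares(tasks, values):
    """Request one "square" task per value and add up the results.

    `tasks` is expected to behave like the object returned by
    initializeMasterProcess(); the call signatures are assumptions.
    """
    for x in values:
        tasks.requestTask("square", x)
    total = 0
    for _ in values:
        task_id, tag, result = tasks.retrieveResult("square")
        total += result
    return total


class FakeMaster:
    """Hypothetical stand-in that computes results locally."""

    def __init__(self):
        self._next_id = 0
        self._pending = []

    def requestTask(self, tag, *parameters):
        self._next_id += 1
        self._pending.append((self._next_id, tag, parameters))
        return self._next_id

    def retrieveResult(self, tag=None):
        # A real slave would do this work; here it is done in place.
        task_id, task_tag, (x,) = self._pending.pop(0)
        return task_id, task_tag, x * x
```

With the real module, tasks would instead come from a call such as initializeMasterProcess("demo", slave_script="slave.py"), where the label and the file name are placeholders.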

runJob(label, master_class, slave_class, watchdog_period=120.0, launch_slaves=0)

Creates an instance of the master_class and runs it. A copy of the script and the current working directory are stored in the TaskManager object to enable the task_manager script to launch slave processes.

Parameters:
  • label (str) - the label that identifies the task manager
  • master_class - the class implementing the master process (a subclass of MasterProcess)
  • slave_class - the class implementing the slave process (a subclass of SlaveProcess)
  • watchdog_period (int or NoneType) - the interval (in seconds) at which each slave process sends a message to the manager to signal that it is still alive. If None, no messages are sent, and the manager cannot detect that a slave job has crashed or been killed.
  • launch_slaves (int) - the number of slave jobs to launch immediately on the same machine that runs the master process
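
The role of watchdog_period can be pictured with a small heartbeat model. This is a sketch under assumptions: ToyWatchdog and grace_factor are hypothetical, and the rule the real task manager uses to decide that a slave is dead is not stated on this page.

```python
class ToyWatchdog:
    """Track the last sign of life from each slave (illustrative only)."""

    def __init__(self, watchdog_period, grace_factor=2.0):
        # grace_factor is an assumption: how many periods may pass
        # without a message before a slave is presumed dead.
        self.watchdog_period = watchdog_period
        self.grace_factor = grace_factor
        self.last_seen = {}

    def ping(self, slave_id, now):
        """Record an "I am alive" message received at time `now` (seconds)."""
        self.last_seen[slave_id] = now

    def presumed_dead(self, now):
        """Return the slaves whose last message is older than the limit."""
        if self.watchdog_period is None:
            return []  # no messages are sent, so failures go undetected
        limit = self.watchdog_period * self.grace_factor
        return sorted(s for s, t in self.last_seen.items() if now - t > limit)
```

A shorter period detects failures sooner at the cost of more messages; None disables detection entirely, as the parameter description states.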

startSlaveProcess(label=None, master_host=None)

Starts a slave process. Must be called at the end of a script that defines or imports all task handlers.

Parameters:
  • label (str or NoneType) - the label that identifies the task manager. May be omitted if the slave process is started through the task_manager script.
  • master_host (str or NoneType) - If None (default), the task manager of the master process is located using the Pyro name server. If no name server is used, this parameter must be the hostname of the machine on which the master process runs, plus the port number if it is different from the default (7766).
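
As an illustration of the master_host convention, the helper below splits a host string into host and port and falls back to the default port. The "host:port" spelling is an assumption made for this sketch, since the page does not say how the port is attached, and split_master_host is a hypothetical name.

```python
DEFAULT_PORT = 7766  # default port given in the parameter description


def split_master_host(master_host):
    """Split "host" or "host:port" into a (host, port) tuple.

    The "host:port" spelling is an assumption made for this sketch.
    """
    host, sep, port = master_host.partition(":")
    return host, (int(port) if sep else DEFAULT_PORT)
```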