THE MIDDLEMAN PROJECT
=====================

Homepage:         http://mdm.berlios.de/
Project Summary:  http://developer.berlios.de/projects/mdm/
Code Repository:  http://hg.berlios.de/repos/mdm
Forums:           http://developer.berlios.de/forum/?group_id=10680
Bug Tracker:      http://developer.berlios.de/bugs/?group_id=10680


ABOUT THE PROJECT
=================

The Middleman Project (mdm) aims to create utility programs that unleash
the power of multi-processor and multi-core computer systems.  It does
so by helping you parallelize your shell scripts, Makefiles, or any
other program that invokes external programs.


SOFTWARE REQUIREMENTS
=====================

To run mdm, you need a modern (2.6+) Linux system, GNU screen and the
ncurses library.  It should be easy to port to other Unix systems by
writing new /proc parsers and fixing some library incompatibilities.


BUILDING AND INSTALLING MDM
===========================

To build mdm, simply run "make" at the top level.  This project is
simple enough that it needs neither autoconf nor automake.  To install,
use "make install" as follows:

    $ make install PREFIX=/install/directory/prefix

Without the PREFIX override, "make install" installs mdm to /usr/local.


HOW DOES IT WORK?
=================

The philosophy behind mdm is that users should benefit from their
multi-core systems without making drastic changes to their shell
scripts.  With mdm, you annotate your scripts to specify which commands
might benefit from parallelization, and then you run them under the
supervision of the mdm system.  At runtime, the mdm system dynamically
discovers parallelization opportunities and runs the annotated commands
in parallel as appropriate.


MDM IN 3 EASY STEPS
===================

Suppose you use the following shell script (encode.sh) for encoding your
music library.  It works, but it leaves your quad-core computer mostly
idle because it processes only one file at a time.

    #!/bin/bash
    for i in */*.wav
    do echo "$i"
       ffmpeg -i "$i" "${i%%.wav}.mp3"
    done

You can parallelize this shell script in three easy steps.

 1. Find commands that you think are suitable for parallel execution,
    and annotate them with mdm-run.  Here is the modified encode.sh:

    #!/bin/bash
    for i in */*.wav
    do echo "$i"
       mdm-run ffmpeg -i "$i" "${i%%.wav}.mp3"
    done

 2. Specify the I/O behavior of your parallel commands in an iospec
    file.  ffmpeg reads from the argument of its -i option and writes
    to its non-option arguments, so this is what you write in your
    iospec file:

    ffmpeg R-i W

    You can skip this step if you are certain the parallel command
    cannot interfere with any other command in the script.

 3. Run the script under mdm.screen as follows:

    $ mdm.screen -c iospec encode.sh

    You should see a monitoring program (mdm-top) displaying the
    execution status of your parallel commands, and the encoding process
    should (hopefully) complete a lot sooner because you are giving all
    processing cores a good workout!


WHEN NOT TO ANNOTATE WITH MDM-RUN
=================================

There are a few cases where you should not annotate a command with
mdm-run.  They are:

 1. The command is a shell built-in,

 2. You need to know the exit status of the command, or

 3. You perform I/O redirection on the command.
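
The three cases can be sketched in a single script.  This is only an
illustration: the function name encode_album and the mp3/ directory
layout are hypothetical, and mdm-run is the only part that comes from
mdm.  The commands left unannotated are exactly the ones listed above.

```shell
#!/bin/sh
# Sketch only: encode_album and the mp3/ layout are hypothetical.
# The body runs in a subshell "( ... )" so that cd does not leak out.
encode_album() (
    # Case 1: cd is a shell built-in, so it stays unannotated.
    cd "$1" || exit 1

    # Case 2: the script needs this exit status, so no mdm-run.
    if ! mkdir -p mp3
    then echo "cannot create mp3 directory in $1" >&2
         exit 1
    fi

    # Case 3: the output is redirected, so no mdm-run.
    ls ./*.wav > mp3/filelist.txt

    # Independent external commands are the candidates for mdm-run.
    for i in ./*.wav
    do [ -e "$i" ] || continue
       mdm-run ffmpeg -i "$i" "mp3/$(basename "$i" .wav).mp3"
    done
)
```

A script would call it as "encode_album some-directory" and would itself
be started under mdm.screen.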


THE I/O SPECIFICATION FILE
==========================

The I/O specification file (iospec) specifies the I/O behavior of
programs.  The mdm system uses these specifications to decide whether
it is okay to run two annotated commands at the same time.  Each line
of the file describes a program.  Here are a few examples:

    ffmpeg R-i W
    rm     W
    cc     W-o 0-c Rbusy R
    date   Wbusy

In plain English:

  * ffmpeg reads from the option argument of -i and writes to all its
    non-option arguments,

  * rm writes to all its non-option arguments,

  * cc writes to the argument of -o, treats -c as an option that takes
    no argument, and reads from the (abstract) file "busy" and from its
    non-option arguments, and

  * date writes to the (abstract) file "busy".

Adding the abstract file "busy" to the iospec ensures that mdm will
never schedule the date command to run while any cc command is still
running (and vice versa).

Beware that the iospec format is subject to change in the future.
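
The exact scheduling rule is not spelled out here, but the "busy"
example is consistent with an ordinary reader/writer rule: two commands
may overlap unless one of them writes a file that the other reads or
writes.  The sketch below illustrates that assumed rule.  It is not
mdm's actual code, and the function names overlaps and conflicts are
hypothetical.

```shell
#!/bin/sh
# Illustration of an ASSUMED reader/writer rule, not mdm's implementation.
# Each argument is a space-separated list of file names, so the names
# themselves must not contain spaces.

# overlaps LIST1 LIST2: succeed if the two lists share a name.
overlaps() {
    for a in $1
    do for b in $2
       do [ "$a" = "$b" ] && return 0
       done
    done
    return 1
}

# conflicts READS1 WRITES1 READS2 WRITES2: succeed if the two commands
# must not run at the same time.
conflicts() {
    overlaps "$2" "$4" ||   # both write the same file
    overlaps "$2" "$3" ||   # the first writes what the second reads
    overlaps "$1" "$4"      # the second writes what the first reads
}

# "cc -c a.c -o a.o" reads busy and a.c, and writes a.o.
# "date" reads nothing and writes busy.
conflicts "busy a.c" "a.o" "" "busy" &&
    echo "cc and date must not overlap"

# Two cc commands both read busy, but neither writes it.
conflicts "busy a.c" "a.o" "busy b.c" "b.o" ||
    echo "two cc commands may overlap"
```

Under this reading, "busy" acts as a lock that many readers can share
but a single writer holds exclusively.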


WHEN TO USE MDM-SYNC
====================

The mdm-sync command is just like mdm-run, except that it does not
submit the command for parallel execution.  Use mdm-sync to annotate a
command when you don't want it to run in parallel, but you think it
might interfere with a command annotated by mdm-run.
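
As a hedged example, suppose the encoding script should delete the .wav
files once they are encoded.  The sketch below assumes that mdm-sync
holds rm back until the conflicting ffmpeg commands have finished, which
is one reading of the description above and is not confirmed here.  The
function name encode_and_clean is hypothetical.

```shell
#!/bin/sh
# Sketch only: encode_and_clean is a hypothetical name.  It assumes the
# iospec holds the lines "ffmpeg R-i W" and "rm W" shown earlier.
encode_and_clean() {
    for i in ./*.wav
    do [ -e "$i" ] || continue
       mdm-run ffmpeg -i "$i" "${i%.wav}.mp3"
    done

    # rm is quick, so parallel execution gains nothing, but it writes to
    # the files that ffmpeg reads.  Annotating it with mdm-sync tells mdm
    # about that overlap.
    mdm-sync rm ./*.wav
}
```

Placing the rm after the loop, rather than inside it, keeps the ffmpeg
commands free to run in parallel with one another.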


QUESTIONS?  COMMENTS?
=====================

Please feel free to leave questions or comments on the mdm support forum
on BerliOS <http://developer.berlios.de/forum/?group_id=10680>.  If you
prefer, you can also write to me at <cklin@users.berlios.de>.


-- 
Chuan-kai Lin <cklin@users.berlios.de>
Thu Mar 12 16:30:13 PDT 2009