THE MIDDLEMAN PROJECT
=====================
Homepage: http://mdm.berlios.de/
Project Summary: http://developer.berlios.de/projects/mdm/
Code Repository: http://hg.berlios.de/repos/mdm
Forums: http://developer.berlios.de/forum/?group_id=10680
Bug Tracker: http://developer.berlios.de/bugs/?group_id=10680
ABOUT THE PROJECT
=================
The Middleman Project (mdm) aims to create utility programs that unleash
the power of multi-processor and multi-core computer systems. It does
so by helping you parallelize your shell scripts, Makefiles, or any
other program that invokes external programs.
SOFTWARE REQUIREMENTS
=====================
To run mdm, you need a modern (2.6+) Linux system, GNU screen and the
ncurses library. It should be easy to port to other Unix systems by
writing new /proc parsers and fixing some library incompatibilities.
BUILDING AND INSTALLING MDM
===========================
To build mdm, simply run "make" at the top level. This project is simple
enough that it needs neither autoconf nor automake. To install, use
"make install" as follows:
$ make install PREFIX=/install/directory/prefix
Without the PREFIX override, "make install" installs mdm to /usr/local.
HOW DOES IT WORK?
=================
The philosophy behind mdm is that users should benefit from their
multi-core systems without making drastic changes to their shell
scripts. With mdm, you annotate your scripts to specify which commands
might benefit from parallelization, and then you run them under the
supervision of the mdm system. At runtime, the mdm system dynamically
discovers parallelization opportunities and runs the annotated commands
in parallel as appropriate.
MDM IN 3 EASY STEPS
===================
Suppose you use the following shell script (encode.sh) for encoding your
music library. It works, but it leaves your quad-core computer mostly
idle because it processes only one file at a time.
    #!/bin/bash
    for i in */*.wav
    do  echo "$i"
        ffmpeg -i "$i" "${i%%.wav}.mp3"
    done
You can parallelize this shell script in three easy steps.
1. Find commands that you think are suitable for parallel execution,
and annotate them with mdm-run. Here is the modified encode.sh:
    #!/bin/bash
    for i in */*.wav
    do  echo "$i"
        mdm-run ffmpeg -i "$i" "${i%%.wav}.mp3"
    done
2. Specify the I/O behavior of your parallel commands in an iospec
file. ffmpeg reads from the argument of its -i option and writes
to its non-option arguments, so this is what you write in your
iospec file:
    ffmpeg R-i W
You can skip this step if you are certain the parallel command
cannot interfere with any other command in the script.
3. Run the script under mdm.screen as follows:
$ mdm.screen -c iospec encode.sh
You should see a monitoring program (mdm-top) displaying the
execution status of your parallel commands, and the encoding process
should (hopefully) complete a lot sooner because you are giving all
processing cores a good workout!
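The output file name in encode.sh comes from bash parameter expansion:
"${i%%.wav}" is the value of $i with a trailing ".wav" removed. The
sketch below shows the mapping in isolation. The function name mp3_name
and the sample path are hypothetical and are not part of mdm.

```shell
#!/bin/bash
# mp3_name is a hypothetical helper, not part of mdm.
# It maps an input path to the output path that encode.sh passes to ffmpeg.
mp3_name() {
    local input=$1
    # ${input%%.wav} removes a trailing ".wav" if one is present.
    printf '%s\n' "${input%%.wav}.mp3"
}

mp3_name "album/track01.wav"    # prints album/track01.mp3
```

Because each input file maps to its own output file, the ffmpeg
invocations in the loop do not write to each other's files.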
WHEN NOT TO ANNOTATE WITH MDM-RUN
=================================
There are a few cases where you should not annotate a command with
mdm-run. They are:
1. The command is a shell built-in,
2. You need to know the exit status of the command, or
3. You perform I/O redirection on the command.
THE I/O SPECIFICATION FILE
==========================
The I/O specification file (iospec) specifies the I/O behavior of
programs. The mdm system uses these specifications to decide whether it
is okay to run two annotated commands at the same time. Each line of
the file describes a program. Here are a few examples:
ffmpeg R-i W
rm W
cc W-o 0-c Rbusy R
date Wbusy
In plain English:
* ffmpeg reads from the option argument of -i and writes to all its
non-option arguments,
* rm writes to all its non-option arguments,
* cc writes to its -o argument, takes no argument for -c, and reads
from the (abstract) file "busy" and from its non-option arguments, and
* date writes to the (abstract) file "busy".
Adding the abstract file "busy" to the iospec ensures that mdm will
never schedule a "date" command to run while any "cc" command is still
running (and vice versa).
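One way to picture how read and write sets lead to that outcome is the
readers-writer rule: two commands may overlap only if neither writes a
file that the other reads or writes. The sketch below is an illustrative
model under that assumption, not mdm's actual scheduler code. The
function names overlaps and conflicts and the file names a.c, a.o, b.c,
and b.o are hypothetical.

```shell
#!/bin/bash
# Illustrative model only; this is not mdm's actual scheduler code.

# overlaps LIST_A LIST_B: succeed if the two space-separated lists share
# at least one name. Names must not contain whitespace or glob characters.
overlaps() {
    local a b
    for a in $1; do
        for b in $2; do
            [ "$a" = "$b" ] && return 0
        done
    done
    return 1
}

# conflicts READS1 WRITES1 READS2 WRITES2: succeed if the two commands
# must not run at the same time under the readers-writer rule.
conflicts() {
    overlaps "$2" "$3" || overlaps "$2" "$4" || overlaps "$1" "$4"
}

# cc -c a.c -o a.o  reads "busy a.c" and writes "a.o"
# cc -c b.c -o b.o  reads "busy b.c" and writes "b.o"
# date              reads nothing    and writes "busy"
if conflicts "busy a.c" "a.o" "busy b.c" "b.o"; then
    echo "cc/cc: conflict"
else
    echo "cc/cc: may run in parallel"
fi
if conflicts "busy a.c" "a.o" "" "busy"; then
    echo "cc/date: conflict"
else
    echo "cc/date: may run in parallel"
fi
```

In this model the two cc commands only read "busy", so they may overlap,
while date writes "busy" and therefore conflicts with either of them.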
Beware that the iospec format is subject to change in the future.
WHEN TO USE MDM-SYNC
====================
The mdm-sync command is just like mdm-run, except that it does not
submit the command for parallel execution. Use mdm-sync to annotate a
command when you don't want it to run in parallel, but you think it
might interfere with a command annotated by mdm-run.
QUESTIONS? COMMENTS?
====================
Please feel free to leave questions or comments on the mdm support forum
on BerliOS <http://developer.berlios.de/forum/?group_id=10680>. If you
prefer, you can also write to me at <cklin@users.berlios.de>.
--
Chuan-kai Lin <cklin@users.berlios.de>
Thu Mar 12 16:30:13 PDT 2009