.\"
.\" cook - file construction tool
.\" Copyright (C) 1997 Peter Miller;
.\" All rights reserved.
.\"
.\" This program is free software; you can redistribute it and/or modify
.\" it under the terms of the GNU General Public License as published by
.\" the Free Software Foundation; either version 2 of the License, or
.\" (at your option) any later version.
.\"
.\" This program is distributed in the hope that it will be useful,
.\" but WITHOUT ANY WARRANTY; without even the implied warranty of
.\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
.\" GNU General Public License for more details.
.\"
.\" You should have received a copy of the GNU General Public License
.\" along with this program; if not, write to the Free Software
.\" Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111, USA.
.\"
.\" MANIFEST: User Guide, Cooking in Parallel
.\"
.H 1 "Cooking in Parallel"
Cook is able to use the dependency information in the cookbook to
schedule more than one recipe body at once, where they are independent.
In large projects this is almost always possible.
.P
Parallel processing is of most use on multi-processor systems.
Even on a workstation, however, running two jobs at once can sometimes
pay off: one job can use the processor while the other waits on disk or
network latency.
.P
Parallel processing requires more resources than the simple case.
Because more commands are running, more CPU is required,
but also more virtual memory and more temporary file space.
You need to be sure that cooking in parallel is a sensible thing to be doing.
.H 2 "Command Line Option"
The \f[CW]-PARallel\fP option tells Cook to run recipe bodies in parallel.
If no number is given with the option, 4 jobs are run in parallel.
You may specify the number of jobs after the option
(\fIe.g.\fP \f[CW]--par=2\fP)
if you wish.
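.P
For illustration, the two forms described above might be typed as follows
(a sketch only; it assumes the default targets of your cookbook are what
you want built):
.eB
cook -parallel
cook --par=2
.eE
The first form uses the default number of jobs; the second asks for two.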
.H 2 "Cookbook Variable"
It is also possible to set the number of jobs from within the cookbook
by using the \f[CW]parallel_jobs\fP variable. This can be used to
automate the selection of the number of jobs, based on the current host
name:
.eB
if [not [defined parallel_jobs]] then
{
    host = [os node];
    if [in [host] cerberus] then
        parallel_jobs = 3;
    else if [in [host] zaphod] then
        parallel_jobs = 2;
    else if [in [host] hydra] then
        parallel_jobs = 8;
}
.eE
In this way, the number of jobs will be set appropriately for each machine,
provided the number of jobs was not already set by the command line option.
.H 2 "Recipe Writing"
Most recipes run in parallel without difficulty; however, some
require special treatment. The problems arise from contention for
resources \- usually temporary files.
.br
.ne 1i
.P
The simplest example of this is \fIyacc\fP(1).
The output filenames are hard-coded, even when you write a more general recipe:
.eB
%.c: %.y
    single-thread y.tab.c
{
    [yacc] [yacc_flags] %.y;
    sed "'s/[yY][yY]/%_/g'" y.tab.c > [target];
    rm y.tab.c;
}
.eE
Replacing the \f[CW]yy\fP and \f[CW]YY\fP prefixes is a common method
for getting more than one yacc grammar into a program. We run into
trouble with the \f[CW]y.tab.c\fP file because every one of the yacc
grammars will need to use the same temporary file name.
.P
The \f[CW]single-thread\fP clause tells Cook to find something else to
do if it discovers that it wants to do two of these at the same time.
.br
.ne 2i
.P
The temporary file name may not be so evident as in the yacc case. The
GNU Autoconf utilities use a number of temporary files in the current
directory, but none of them appear in the text of the recipes.
.eB
%: %.in: config.status
    single-thread conftest.subs
{
    CONFIG_FILES\e=[target] CONFIG_HEADERS\e= config.status;
}
.eE
It is common, if your project uses GNU Autoconf, to generate several
files in this way. Once the \f[CW]config.status\fP script is produced,
all of these files become candidates for Cook to generate \- but
they can only be generated one at a time.
.P
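Other resources, such as tape drives, can also be described in the
\f[CW]single-thread\fP clause. You can do this by device name
(\fIe.g.\fP \f[CW]/dev/rmt/0\fP) or by some descriptive string.
Single threading works by comparing the strings themselves: recipes whose
\f[CW]single-thread\fP clauses have a string in common are mutually exclusive.
Cook does not compare inodes, so two different names for the same
file are not recognised as the same resource.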
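.P
As a sketch of the tape drive case, the following recipe names the
device in its \f[CW]single-thread\fP clause.
The target name, the \f[CW]backup_files\fP variable and the
\fItar\fP(1) command line are assumptions made for the example.
.eB
backup.done: [backup_files]
    single-thread /dev/rmt/0
{
    tar cf /dev/rmt/0 [backup_files];
    date > [target];
}
.eE
While this recipe body is running, no other recipe which lists
\f[CW]/dev/rmt/0\fP in its \f[CW]single-thread\fP clause will be started.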
.H 3 "Concurrent Execution Threads"
Each recipe, when its actions are executed,
is executed within an execution thread.
Execution threads share almost everything;
this includes all of the variables, the state of the ``set'' statement,
the stat cache, \fIetc\fP.
.P
If you need to create variable names, or temporary file names, which are
unique to a thread, use the \f[CW][thread-id]\fP variable. This
variable has a unique value for the life of a thread. No other
concurrent thread will have the same value.
.P
Note, however, that the \f[CW][thread-id]\fP values of completed threads
will be re-used; this ensures that when it is used to construct variable
names, the variables will be re-used. This prevents memory bloat
when cooking large projects.
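.P
A minimal sketch of a per-thread temporary file follows.
The file suffixes, the commands and the temporary file name are
assumptions made for the example;
only \f[CW][thread-id]\fP comes from Cook.
.eB
%.lst: %.dat
{
    sort %.dat > tmp.[thread-id];
    uniq tmp.[thread-id] > [target];
    rm tmp.[thread-id];
}
.eE
Because no two concurrent threads have the same \f[CW][thread-id]\fP
value, the temporary file names do not collide, and this recipe needs no
\f[CW]single-thread\fP clause.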
.H 2 "File Locking"
The above discussion applies to utilities which perform no file locking,
and thus cannot detect or sequence multiple accesses to a resource.
Other programs, such as those which access databases, may have quite
capable file locking mechanisms and are able to manage multiple parallel
updates on their own, obviating the need for the \f[CW]single-thread\fP
clause.
.H 2 "Virtual Machine"
It is possible to simulate a parallel machine if you are on a network.
Cook is able to distribute tasks to computers on a network, if it is
given sufficient information.
.P
The first piece of information Cook requires is the list of machines,
which is given using the \f[CW]parallel_hosts\fP variable.
\fBNote:\fP The tasks will be distributed amongst these machines
regardless of the setting of the \f[CW]parallel_jobs\fP variable,
\fIi.e.\fP even if you are not doing parallel processing.
.eB
parallel_hosts = larry curly moe;
.eE
If you want to give one machine more weighting than the others (say,
because it is twice as fast) you simply name it more than once. Cook
will use these names in round-robin fashion.
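.P
For example, assuming \f[CW]larry\fP is about twice as fast as the
other two machines, it could be listed twice:
.eB
parallel_hosts = larry larry curly moe;
.eE
With this setting, two of every four names in the rotation refer to
\f[CW]larry\fP.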
.H 3 "Remote Shell Command"
Cook uses the Berkeley \fIrsh\fP(1) command to invoke the remote
command. You can set the command, or the command and some options,
using the \f[CW]parallel_rsh\fP variable. The default value is
.eB
parallel_rsh = rsh;
.eE
In order to work in a useful way, Cook makes some assumptions
about your environment and your account:
.BL
.LI
That your system administrators allow \fIrsh\fP(1) to be used on your network.
.LI
That your account name is the same on \fIall\fP machines
(otherwise not even the \f[CW]rsh -l\fP \fIlogin-name\fP option will help).
.LI
That the \f[CW]/etc/hosts.equiv\fP file, or your \f[CW]~/.rhosts\fP file,
is set up on \fIall\fP machines so that you don't need to give a password.
.LI
That all of the necessary files and directories are mounted in exactly
the same place on all of the machines; and that they are \fIthe same
files\fP on all machines, via NFS or similar.
Automounters can make this especially messy.
.LI
That your account start-up scripts set the necessary environment
settings, \fIe.g.\fP command search \f[CW]PATH\fP, without any
intervention required.
.LI
That all of the machines are of the same architecture, or that the
architecture doesn't matter.
.LI
That the system time is synchronised on all machines, using
\fIrdate\fP(1) from \fIcron\fP(8), or using NTP, or similar.
.LE
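.P
As a sketch of changing the command: on systems where the remote shell
has a different name (HP-UX, for instance, calls it \fIremsh\fP(1)),
the \f[CW]parallel_rsh\fP variable could be set like this.
Whether this suits your site is an assumption to check locally.
.eB
parallel_rsh = remsh;
.eE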
.H 3 "Limitations"
There are some inherent limitations in the \fIrsh\fP(1) protocol.
.BL
.LI
Your current environment variable settings are not transferred across.
Neither are \fIulimit\fP settings, \fIetc\fP. If any are important,
you need to write the cookbook to explicitly replicate them.
.LI
The exit status of the remote command is not reported in the exit status
of the \fIrsh\fP(1) command\*F. There are internal contortions used by
Cook to obtain the exit status; errors about mysteriously named files
usually indicate that one or more of the above assumptions has been broken.
.FS
The Berkeley sources certainly don't contain code to do this.
Do any other vendors have a more useful implementation?
.FE
.LE
.H 3 "Host Binding"
In some cases, for example because of licensing conditions,
some commands will only run on a limited set of hosts.
Rather than perform all commands on those hosts,
it is possible to bind recipes to specific hosts.
This binding overrides the \f[CW]parallel_hosts\fP variable.
.eB
%.c: %.esql
    host-binding shylock
{
    esql %.esql > [target];
}
.eE
This example says that the embedded SQL preprocessor is only to be run
on the database server called ``shylock'', probably due to usurious
licensing fees. However, you may want to perform your other development
activities on more lightly loaded machines. This clause applies only to
this one recipe; other recipes behave as normal.
.P
The \f[CW]host-binding\fP clause may have more than one host named, and
they will be used in round-robin fashion. This is a
recipe-level variant of the \f[CW]parallel_hosts\fP variable.
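.P
A sketch with two hosts follows; the second host name, ``portia'',
is hypothetical.
.eB
%.c: %.esql
    host-binding shylock portia
{
    esql %.esql > [target];
}
.eE
Successive executions of this recipe alternate between the two hosts.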
.P
The \f[CW]host-binding\fP clause applies regardless of the settings of
the \f[CW]parallel_jobs\fP and \f[CW]parallel_hosts\fP
variables.