.\" File: mpirun.4, from package mpich 1.1.0-3
.TH mpirun 4 "8/29/1995" " " "MPE"
.SH NAME
mpirun \- run MPI programs

.SH DESCRIPTION
"mpirun" is a shell script that attempts to hide from the user the
differences in starting jobs on various devices. mpirun attempts to
determine what kind of machine it is running on and to start the required
number of jobs on that machine. On workstation clusters, if you are
not using Chameleon, you must either supply a file that lists the
machines that mpirun can use to run remote jobs, or specify such a file
each time you run mpirun with the -machinefile option. The default
file is util/machines/machines.<arch>.

mpirun is typically invoked like this:
.nf
 mpirun -np <number of processes> <program name and arguments>
.fi
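
As a rough illustration of the machine file described above, the following
shell sketch writes such a file and prints an mpirun command line that uses
it. The one-host-per-line layout is an assumption that this page does not
state, and the helper names write_machines_file and mpirun_cmd, the host
names, and the file name my.machines are all hypothetical.

.nf
```shell
# Sketch only: write a machines file and build (not run) an mpirun command.
# Assumption: the machines file holds one host name per line.

write_machines_file() {
    # $1 is the output file; the remaining arguments are host names.
    out=$1
    shift
    : > "$out"
    for host in "$@"; do
        echo "$host" >> "$out"
    done
}

mpirun_cmd() {
    # $1 machines file, $2 number of processes, rest: program and arguments.
    # mpirun options are placed before the program name.
    mf=$1
    np=$2
    shift 2
    echo "mpirun -machinefile $mf -np $np $*"
}
```
.fi

For example, after "write_machines_file my.machines hostA hostB", the call
"mpirun_cmd my.machines 4 program" prints
"mpirun -machinefile my.machines -np 4 program".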


If mpirun cannot determine what kind of machine you are on, and that
machine is supported by the MPI implementation, you can use the -machine
and -arch options to tell it what kind of machine you are running
on. The current valid values for -machine are:

.nf
          chameleon (including chameleon/pvm, chameleon/p4, etc.)
          meiko     (the meiko device on the meiko)
          paragon   (the ch_nx device on a paragon not running NQS)
          p4        (the ch_p4 device on a workstation cluster)
          ibmspx    (ch_eui for IBM SP2)
          anlspx    (ch_eui for ANL's SPx)
          ksr       (ch_p4 for KSR 1 and 2)
          sgi_mp    (ch_shmem for SGI multiprocessors)
          cray_t3d  (t3d for Cray T3D)
          smp       (ch_shmem for SMPs)
          execer    (a custom script for starting ch_p4 programs
                     without using a procgroup file. This script
                     currently does not work well with interactive
                     jobs)
.fi

You should only have to specify mr_arch if mpirun does not recognize
your machine, the default value is wrong, or you are using the p4 or
execer devices.  The full list of options is given below.

.SH PARAMETERS
The options for mpirun must come before the program you want to run:
.nf
    mpirun [mpirun_options...] <progname> [options...]
.fi

.PD 0
.TP
.B -arch 
<architecture>
specify the architecture (must have matching machines.<arch>
file in ${MPIR_HOME}/util/machines) if using the execer
.PD 1
.PD 0
.TP
.B -h 
Print this help message
.PD 1
.PD 0
.TP
.B -machine 
<machine name>
use startup procedure for <machine name>
.PD 1
.PD 0
.TP
.B -machinefile 
<machine-file name>
Take the list of possible machines to run on from the
file <machine-file name>
.PD 1
.PD 0
.TP
.B -np 
<np>
specify the number of processors to run on
.PD 1
.PD 0
.TP
.B -nolocal
Do not run on the local machine (works only for 
p4 and ch_p4 jobs)
.PD 1
.PD 0
.TP
.B -e 
Use execer to start the program on workstation
clusters
.PD 1
.PD 0
.TP
.B -pg 
Use a procgroup file to start the p4 programs, not execer
(default)
.PD 1
.PD 0
.TP
.B -leave_pg
Do not delete the p4 procgroup file after running
.PD 1
.PD 0
.TP
.B -p4pg 
filename
Use the given p4 procgroup file instead of creating one.
Overrides -np and -nolocal, selects -leave_pg.
.PD 1
.PD 0
.TP
.B -p4ssport 
num
Use the p4 secure server with port number num to start the
programs.  If num is 0, use the value of the 
environment variable MPI_P4SSPORT.  Using the server can
speed up process startup.  If both MPI_USEP4SSPORT and
MPI_P4SSPORT are set, the effect is the same as giving
mpirun the option -p4ssport 0.
.PD 1
.PD 0
.TP
.B -t 
Testing - do not actually run, just print what would be
executed
.PD 1
.PD 0
.TP
.B -v 
Verbose - throw in some comments
.PD 1
.PD 0
.TP
.B -dbx 
Start the first process under dbx where possible
.PD 1
.PD 0
.TP
.B -gdb 
Start the first process under gdb where possible
(on the Meiko, selecting either -dbx or -gdb starts prun
under totalview instead)
.PD 1
.PD 0
.TP
.B -nopoll 
Do not use polling-mode communication.
Available only on IBM SPx.
.PD 1
.PD 0
.TP
.B -mvhome 
Move the executable to the home directory.  This 
is needed when not all file systems are cross-mounted.
Currently used only by anlspx.
.PD 1
.PD 0
.TP
.B -mvback 
files
Move the indicated files back to the current directory.
Needed only when using -mvhome; has no effect otherwise.
.PD 1
.PD 0
.TP
.B -maxtime 
min
Maximum job run time in minutes.  Currently used only
by anlspx.  Default value is $max_time
.PD 1
.PD 0
.TP
.B -cac 
name
CAC for the ANL scheduler.  Currently used only by anlspx.
If not provided, mpirun will choose some valid CAC.
.PD 1
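
The port selection described under -p4ssport can be sketched as a small
shell function. This is only an illustration of the rule as stated above,
not the code used inside mpirun. The function name resolve_p4ssport is
hypothetical, and treating "set" as "set to a non-empty value" is an
assumption.

.nf
```shell
# Sketch only: which port would be used for a given -p4ssport value.
# Assumption: a variable counts as "set" when it is non-empty.

resolve_p4ssport() {
    # $1 is the value given to -p4ssport, or empty if the option was omitted.
    num=$1
    # With both variables set, behave as if -p4ssport 0 had been given.
    if [ -z "$num" ] && [ -n "${MPI_USEP4SSPORT:-}" ] && [ -n "${MPI_P4SSPORT:-}" ]; then
        num=0
    fi
    # A value of 0 means: take the port from MPI_P4SSPORT.
    if [ "$num" = "0" ]; then
        echo "${MPI_P4SSPORT:-}"
    else
        echo "$num"
    fi
}
```
.fi

An empty result means that no secure-server port was selected.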

On exit, mpirun returns a status of zero unless mpirun detected a problem, in
which case it returns a non-zero status (currently, all are one, but this
may change in the future).
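
Because the non-zero value may change, a calling script is safer testing
for "non-zero" than for the value one. The following shell sketch shows one
way to do that. The function name run_job and the LAUNCHER variable are
hypothetical; LAUNCHER exists only so that the sketch can be tried with a
stand-in command.

.nf
```shell
# Sketch only: run the launcher and report success or failure.
# LAUNCHER is a hypothetical override; by default it is mpirun.
LAUNCHER=${LAUNCHER:-mpirun}

run_job() {
    # Pass all arguments through to the launcher.
    if $LAUNCHER "$@"; then
        echo "job finished: status 0"
        return 0
    else
        # Here $? still holds the launcher's exit status.
        status=$?
        echo "mpirun reported a problem: status $status"
        return "$status"
    fi
}
```
.fi

A call such as "run_job -np 4 program" then returns the status that the
launcher reported, whatever non-zero value that is.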

.SH SPECIFYING HETEROGENEOUS SYSTEMS

Multiple architectures may be handled by giving multiple '-arch' and '-np'
arguments.  For example, to run a program on 2 sun4s and 3 rs6000s, with
the local machine being a sun4, use
.nf
    mpirun -arch sun4 -np 2 -arch rs6000 -np 3 program
.fi

This assumes that program will run on both architectures.  If different
executables are needed (as in this case), the string '%a' will be replaced
with the arch name. For example, if the programs are 'program.sun4'
and 'program.rs6000', then the command is
.nf
    mpirun -arch sun4 -np 2 -arch rs6000 -np 3 program.%a
.fi

If instead the executables are in different directories, for example
in '/tmp/me/sun4' and '/tmp/me/rs6000', then the command is
.nf
    mpirun -arch sun4 -np 2 -arch rs6000 -np 3 /tmp/me/%a/program
.fi
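
The effect of the '%a' substitution in the examples above can be
imitated with a short shell function. This is a sketch of the visible
behavior only, not the code inside mpirun. The name expand_arch is
hypothetical, and replacing every occurrence of '%a' (rather than only
the first) is an assumption.

.nf
```shell
# Sketch only: show the program path that results for one architecture.
# Assumption: every occurrence of %a is replaced.

expand_arch() {
    # $1 is a program path that may contain %a; $2 is the arch name.
    echo "$1" | sed "s|%a|$2|g"
}
```
.fi

With this sketch, "expand_arch program.%a sun4" prints "program.sun4",
and "expand_arch /tmp/me/%a/program rs6000" prints
"/tmp/me/rs6000/program".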

It is important to specify the architecture with '-arch' \fIbefore\fP specifying
the number of processors.  Also, the \fIfirst\fP '-arch' option must refer to the
processor on which the job will be started.  Specifically, if '-nolocal' is
\fInot\fP specified, then the first '-arch' must refer to the processor from which
mpirun is running.
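
The ordering rule can be made hard to get wrong by generating the command
line from a list of architectures. The shell sketch below prints such a
command line without running it. The function name hetero_cmd and the
arch:count argument form are hypothetical conveniences; the caller is
still responsible for listing the starting architecture first.

.nf
```shell
# Sketch only: build an mpirun command line for several architectures.
# Each -arch is emitted before the -np that belongs to it.

hetero_cmd() {
    # $1 is the program; the remaining arguments are arch:count pairs,
    # with the architecture that starts the job listed first.
    prog=$1
    shift
    cmd="mpirun"
    for pair in "$@"; do
        arch=${pair%%:*}
        count=${pair##*:}
        cmd="$cmd -arch $arch -np $count"
    done
    echo "$cmd $prog"
}
```
.fi

For example, "hetero_cmd program.%a sun4:2 rs6000:3" prints
"mpirun -arch sun4 -np 2 -arch rs6000 -np 3 program.%a".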

Support for heterogeneous systems is currently incomplete.

(You must have a 'machines.<arch>' file in the 'util/machines' directory
for each arch that you use.)

Another approach that may be used with the 'ch_p4' device is to create
a 'procgroup' file directly.  See the MPICH Users Guide for more information.


.SH LOCATION
 util/README