% -*- latex -*-
%
% Copyright (c) 2001-2004 The Trustees of Indiana University.  
%                         All rights reserved.
% Copyright (c) 1998-2001 University of Notre Dame. 
%                         All rights reserved.
% Copyright (c) 1994-1998 The Ohio State University.  
%                         All rights reserved.
% 
% This file is part of the LAM/MPI software package.  For license
% information, see the LICENSE file in the top level directory of the
% LAM/MPI source distribution.
%
% $Id: misc.tex,v 1.20 2003/08/12 01:10:28 jsquyres Exp $
%

\chapter{Miscellaneous}
\label{sec:misc}

This chapter covers a variety of topics that don't conveniently fit
into other chapters.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Singleton MPI Processes}

It is possible to run an MPI process without the \cmd{mpirun} or
\cmd{mpiexec} commands -- simply run the program as one would normally
launch a serial program:

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ my_mpi_program
\end{lstlisting}
% Stupid emacs mode: $

Doing so will create an \mpiconst{MPI\_\-COMM\_\-WORLD} with a single
process.  This process can either run by itself, or spawn or connect
to other MPI processes and become part of a larger MPI job using the
MPI-2 dynamic process functions.  A LAM RTE must be running on the
local node, just as it must be for jobs started with \cmd{mpirun}.
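
For example (assuming a boot schema file named \file{hostfile}, which
is a placeholder name), the LAM RTE must be booted before the program
can be run as a singleton:

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ lamboot hostfile
shell$ my_mpi_program
\end{lstlisting}
% Stupid emacs mode: $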

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{MPI-2 I/O Support}
\index{ROMIO}
\index{MPI-2 I/O support|see {ROMIO}}
\index{I/O support|see {ROMIO}}

MPI-2 I/O support is provided through the ROMIO
package~\cite{thak99a,thak99b}.  Since support is provided through a
third party package, its integration with LAM/MPI is not ``complete.''
Specifically, everywhere the MPI-2 standard specifies an argument of
type \mpitype{MPI\_\-Request}, ROMIO's provided functions expect an
argument of type \mpitype{MPIO\_\-Request}.

Note, too, that the \mpitype{MPIO\_\-Request} types cannot be used
with LAM's standard \mpifunc{MPI\_\-TEST} and \mpifunc{MPI\_\-WAIT}
functions -- ROMIO's \mpifunc{MPIO\_\-TEST} and \mpifunc{MPIO\_\-WAIT}
functions must be used instead.  There are no array versions of these
functions (e.g., \mpifunc{MPIO\_\-TESTANY}, \mpifunc{MPIO\_\-WAITANY},
etc., do not exist).

C MPI applications wanting to use MPI-2 I/O functionality can simply
include \file{mpi.h}.  Fortran MPI applications, however, must include
both \file{mpif.h} and \file{mpiof.h}.

Finally, ROMIO includes its own documentation and listings of known
issues and limitations.  See the \file{README} file in the ROMIO
directory in the LAM distribution.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Fortran Process Names}
\index{fortran process names}
\cmdindex{mpitask}{fortran process names}

Since Fortran does not portably provide the executable name of the
process (in the way that C programs receive the {\tt argv} array), the
\icmd{mpitask} command lists the name ``LAM MPI Fortran program'' by
default for MPI programs that use the Fortran binding for
\mpifunc{MPI\_\-INIT} or \mpifunc{MPI\_\-INIT\_\-THREAD}.

The environment variable \ienvvar{LAM\_\-MPI\_\-PROCESS\_\-NAME} can
be used to override this behavior.
%
Setting this environment variable before invoking \icmd{mpirun} will
cause \cmd{mpitask} to list that name instead of the default title.
%
This environment variable only works for processes that invoke the
Fortran binding for \mpifunc{MPI\_\-INIT} or
\mpifunc{MPI\_\-INIT\_\-THREAD}.
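
For example, with a Bourne-style shell (the program name
\cmd{my\_\-fortran\_\-program} is a placeholder):

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ export LAM_MPI_PROCESS_NAME=my_fortran_program
shell$ mpirun C my_fortran_program
\end{lstlisting}
% Stupid emacs mode: $

\cmd{mpitask} will then list ``my\_\-fortran\_\-program'' instead of
the default title for the processes in that job.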

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{MPI Thread Support}
\label{sec:misc-threads}
\index{threads and MPI}
\index{MPI and threads|see {threads and MPI}}

\def\mtsingle{\mpiconst{MPI\_\-THREAD\_\-SINGLE}}
\def\mtfunneled{\mpiconst{MPI\_\-THREAD\_\-FUNNELED}}
\def\mtserial{\mpiconst{MPI\_\-THREAD\_\-SERIALIZED}}
\def\mtmultiple{\mpiconst{MPI\_\-THREAD\_\-MULTIPLE}}
\def\mpiinit{\mpifunc{MPI\_\-INIT}}
\def\mpiinitthread{\mpifunc{MPI\_\-INIT\_\-THREAD}}

LAM currently implements support for \mtsingle, \mtfunneled, and
\mtserial.  The constant \mtmultiple\ is provided, although LAM will
never return \mtmultiple\ in the \funcarg{provided} argument to
\mpiinitthread.

LAM makes no distinction between \mtsingle\ and \mtfunneled.  When
\mtserial\ is used, a global lock is used to ensure that only one
thread is inside any MPI function at any time.

\subsection{Thread Level}

Selecting the thread level for an MPI job is best described in terms
of the two parameters passed to \mpiinitthread: \funcarg{requested}
and \funcarg{provided}.  \funcarg{requested} is the thread level that
the user application requests, while \funcarg{provided} is the thread
level that LAM will run the application with.

\begin{itemize}
\item If \mpiinit\ is used to initialize the job, \funcarg{requested}
  will implicitly be \mtsingle.  However, if the
  \ienvvar{LAM\_\-MPI\_\-THREAD\_\-LEVEL} environment variable is set
  to one of the values in Table~\ref{tbl:mpi-env-thread-level}, the
  corresponding thread level will be used for \funcarg{requested}.
  
\item If \mpiinitthread\ is used to initialize the job, the
  \funcarg{requested} thread level is the first thread level that the
  job will attempt to use.  There is currently no way to specify lower
  or upper bounds on the thread level that LAM will use.
  
  The resulting thread level is largely determined by the SSI modules
  that will be used in an MPI job; each module must be able to support
  the target thread level.  A complex algorithm is used to attempt to
  find a thread level that is acceptable to all SSI modules.
  Generally, the algorithm starts at \funcarg{requested} and works
  backwards towards \mpiconst{MPI\_\-THREAD\_\-SINGLE} looking for an
  acceptable level.  However, any module may {\em increase} the thread
  level under test if it requires it.  At the end of this process, if
  an acceptable thread level is not found, the MPI job will abort.
\end{itemize}
  
\begin{table}[htbp]
  \centering
  \begin{tabular}{|c|l|}
    \hline
    Value & \multicolumn{1}{|c|}{Meaning} \\
    \hline
    \hline
    undefined & \mtsingle \\
    0 & \mtsingle \\
    1 & \mtfunneled \\
    2 & \mtserial \\
    3 & \mtmultiple \\
    \hline
  \end{tabular}
  \caption{Valid values for the \envvar{LAM\_\-MPI\_\-THREAD\_\-LEVEL}
    environment variable.}
  \label{tbl:mpi-env-thread-level}
\end{table}

Also note that certain SSI modules require higher thread support
levels than others.  For example, any checkpoint/restart SSI module
will require a minimum of \mtserial, and will attempt to adjust the
thread level upwards as necessary (if that CR module will be used
during the job).

Hence, using \mpiinit\ to initialize an MPI job does not imply that
the provided thread level will be \mtsingle.
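
For example, the following requests \mtserial\ (the value 2; see
Table~\ref{tbl:mpi-env-thread-level}) for a program that calls only
\mpiinit.  The program name is a placeholder:

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ export LAM_MPI_THREAD_LEVEL=2
shell$ mpirun C my_mpi_program
\end{lstlisting}
% Stupid emacs mode: $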

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{MPI-2 Name Publishing}
\index{published names}
\index{dynamic name publishing|see {published names}}
\index{name publishing|see {published names}}

LAM supports the MPI-2 functions \mpifunc{MPI\_\-PUBLISH\_\-NAME} and
\mpifunc{MPI\_\-UNPUBLISH\_\-NAME} for publishing and unpublishing
names, respectively.  Published names are stored within the LAM
daemons, and are therefore persistent, even when the MPI process that
published them dies.  

As such, it is important for correct MPI programs to unpublish their
names before they terminate.  However, if stale names are left in the
LAM universe when an MPI process terminates, the \icmd{lamclean}
command can be used to clean {\em all} names from the LAM RTE.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Interoperable MPI (IMPI) Support}
\index{IMPI}
\index{Interoperable MPI|see {IMPI}}

The IMPI extensions are still considered experimental, and are
disabled by default in LAM.  They must be enabled when LAM is
configured and built (see the Installation Guide for details).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\subsection{Purpose of IMPI}

The Interoperable Message Passing Interface (IMPI) is a standardized
protocol that enables different MPI implementations to communicate
with each other. This allows users to run jobs that utilize different
hardware, but still use the vendor-tuned MPI implementation on each
machine. This would be helpful in situations where the job is too
large to fit in one system, or when different portions of code are
better suited for different MPI implementations.

IMPI defines only the protocols necessary between MPI implementations;
vendors may still use their own high-performance protocols within
their own implementations.

Terms that are used throughout the LAM/IMPI documentation include:
IMPI clients, IMPI hosts, IMPI processes, and the IMPI server.  See
the IMPI section of the LAM FAQ on the LAM web site for definitions of
these terms.\footnote{\url{http://www.lam-mpi.org/faq/}}

For more information about IMPI and the IMPI Standard, see the main
IMPI web site.\footnote{\url{http://impi.nist.gov/}}

Note that the IMPI standard only applies to MPI-1 functionality.
Using non-local MPI-2 functions on communicators with ranks that live
on another MPI implementation will result in undefined behavior (read:
kaboom).  For example, \mpifunc{MPI\_\-COMM\_\-SPAWN} will certainly
fail, but \mpifunc{MPI\_\-COMM\_\-SET\_\-NAME} works fine (because it
is a local action).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\subsection{Current IMPI functionality}
\index{IMPI!supported functionality}

LAM currently implements a subset of the IMPI functionality:

\begin{itemize}
\item Startup and shutdown

\item All MPI-1 point-to-point functionality
  
\item Some of the data-passing collectives:
  \mpifunc{MPI\_\-ALLREDUCE}, \mpifunc{MPI\_\-BARRIER},
  \mpifunc{MPI\_\-BCAST}, \mpifunc{MPI\_\-REDUCE}
\end{itemize}

LAM does not implement the following on communicators with ranks that
reside on another MPI implementation:

\begin{itemize}
\item \mpifunc{MPI\_\-PROBE} and \mpifunc{MPI\_\-IPROBE}

\item \mpifunc{MPI\_\-CANCEL}

\item All data-passing collectives that are not listed above

\item All communicator constructor/destructor collectives (e.g.,
  \mpifunc{MPI\_\-COMM\_\-SPLIT}, etc.)
\end{itemize}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\subsection{Running an IMPI Job}
\index{IMPI!running jobs}

Running an IMPI job requires the use of an IMPI
server.\index{IMPI!server} An open source, freely-available server is
available.\footnote{\url{http://www.osl.iu.edu/research/impi/}}

As described in the IMPI standard, the first step is to launch the
IMPI server with the number of expected clients.  The open source
server mentioned above requires at least one authentication mechanism
to be specified (``none'' or ``key'').  For simplicity, these instructions
assume that the ``none'' mechanism will be used.  Only one IMPI server
needs to be launched per IMPI job, regardless of how many clients will
connect.
%
For this example, assume that there will be 2 IMPI clients; client 0
will be run in LAM/MPI, and client 1 will be run elsewhere.  

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ export IMPI_AUTH_NONE=
shell$ impi_server -server 2 -auth 0
10.0.0.32:9283
\end{lstlisting}
% Stupid emacs mode: $

The IMPI server must be left running for the duration of the IMPI job.
%
The string that the IMPI server gives as output (``10.0.0.32:9283'',
in this case) must be given to \cmd{mpirun} when starting the LAM
process that will run in IMPI:

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ mpirun -client 0 10.0.0.32:9283 C my_mpi_program
\end{lstlisting}
% Stupid emacs mode: $

This will run the MPI program in the local LAM universe and connect it
to the IMPI server.  From there, the IMPI protocols will take over and
join this program to all other IMPI clients.

Note that LAM will launch an auxiliary ``helper'' MPI program named
\cmd{impid} that will last for the duration of the IMPI job.  It acts
as a proxy to the other IMPI processes, and should not be manually
killed.  It will die of its own accord when the IMPI job is complete.
If something goes wrong, it can be killed with the \cmd{lamclean}
command, just like any other MPI process.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\subsection{Complex Network Setups}

In some complex network configurations -- particularly those that span
multiple private networking domains -- it may be necessary to override
the hostname that IMPI uses for connectivity (i.e., to use something
other than what is returned by the \cmd{hostname} command).  In this
case, the \ienvvar{IMPI\_\-HOST\_\-NAME} environment variable can be
used.  If set, this variable is expected to contain a resolvable name
(or IP address) that should be used.
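
For example, before starting the LAM client of the IMPI job described
above (the IP address shown here is a placeholder):

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ export IMPI_HOST_NAME=192.0.2.17
shell$ mpirun -client 0 10.0.0.32:9283 C my_mpi_program
\end{lstlisting}
% Stupid emacs mode: $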

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Batch Queuing System Support}
\label{sec:misc-batch}
\index{batch queue systems}
\index{Portable Batch System|see {batch queue systems}}
\index{PBS|see {batch queue systems}}
\index{PBS Pro|see {batch queue systems}}
\index{OpenPBS|see {batch queue systems}}
\index{Load Sharing Facility|see {batch queue systems}}
\index{LSF|see {batch queue systems}}
\index{Clubmask|see {batch queue systems}}

LAM is now aware of some batch queuing systems.  Support is currently
included for PBS, LSF, and Clubmask-based systems.  There is also a
generic mechanism that allows users of other batch queue systems to
take advantage of this functionality.

\begin{itemize}
\item When running under a supported batch queue system, LAM will take
  precautions to isolate itself from other instances of LAM in
  concurrent batch jobs.  That is, multiple LAM instances from the
  same user can exist on the same machine when executing in batch.
  This allows a user to submit as many LAM jobs as necessary, and even
  if they end up running on the same nodes, a \cmd{lamclean} in one
  job will not kill MPI applications in another job.
  
\item This behavior is {\em only} exhibited under a batch environment.
  Manually setting the environment variable
  \ienvvar{LAM\_\-MPI\_\-SESSION\_\-SUFFIX} on the node where
  \icmd{lamboot} is run achieves the same end (see the example below).
  Other batch systems can easily be supported -- let the LAM Team know
  if you'd like to see support for others included.
\end{itemize}
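
For example, to give a manually booted LAM universe its own session
suffix (the suffix and boot schema file names are placeholders):

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ export LAM_MPI_SESSION_SUFFIX=my_batch_job
shell$ lamboot hostfile
\end{lstlisting}
% Stupid emacs mode: $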

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Location of LAM's Session Directory}
\label{sec:misc-session-directory}
\index{session directory}

By default, LAM will create a temporary per-user session directory
with the following name:

\centerline{\file{<tmpdir>/lam-<username>@<hostname>[-<session\_suffix>]}}

\noindent Each of the components is described below:

\begin{description}
\item[\file{<tmpdir>}]: LAM will set the prefix used for the session
  directory based on the following search order:

  \begin{enumerate}
    \item The value of the \ienvvar{LAM\_\-MPI\_\-SESSION\_\-PREFIX}
      environment variable

    \item The value of the \ienvvar{TMPDIR} environment variable

    \item \file{/tmp/}
  \end{enumerate}
  
  It is important to note that (unlike
  \ienvvar{LAM\_\-MPI\_\-SESSION\_\-SUFFIX}) the environment variables
  for determining \file{<tmpdir>} must be set on each node (although
  they do not necessarily have to be set to the same value).
  \file{<tmpdir>} must exist before \icmd{lamboot} is run, or
  \icmd{lamboot} will fail.

\item[\file{<username>}]: The user's name on that host.

\item[\file{<hostname>}]: The hostname.
  
\item[\file{<session\_suffix>}]: LAM will set the suffix (if any) used
  for the session directory based on the following search order:

  \begin{enumerate}

    \item The value of the \ienvvar{LAM\_\-MPI\_\-SESSION\_\-SUFFIX}
      environment variable.
  
    \item If running under a supported batch system, a unique session
      ID (based on information from the batch system) will be used.
  \end{enumerate}
\end{description}
  
\ienvvar{LAM\_\-MPI\_\-SESSION\_\-SUFFIX} and the batch information
only need to be available on the node from which \icmd{lamboot} is
run.  \icmd{lamboot} will propagate the information to the other
nodes.
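
For example, to place the session directory under a scratch
filesystem before booting the LAM RTE (the directory and boot schema
file names are placeholders):

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ export LAM_MPI_SESSION_PREFIX=/scratch/lam-tmp
shell$ lamboot hostfile
\end{lstlisting}
% Stupid emacs mode: $

Remember that (unlike \ienvvar{LAM\_\-MPI\_\-SESSION\_\-SUFFIX})
\ienvvar{LAM\_\-MPI\_\-SESSION\_\-PREFIX} must be set on every node,
and that the directory must already exist on every node.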

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Signal Catching}
\index{signals}

LAM/MPI now catches the signals SEGV, BUS, FPE, and ILL.  The signal
handler terminates the application.  This is useful in batch jobs to
help ensure that \icmd{mpirun} returns if an application process dies.
To disable the catching of these signals, use the \cmdarg{-nsigs}
option to \icmd{mpirun}.
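
For example (the program name is a placeholder):

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ mpirun -nsigs C my_mpi_program
\end{lstlisting}
% Stupid emacs mode: $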

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{MPI Attributes}

\begin{discuss}
  Need to have discussion of built-in attributes here, such as
  MPI\_\-UNIVERSE\_\-SIZE, etc.  Should specifically mention that
  MPI\_\-UNIVERSE\_\-SIZE is fixed at \mpifunc{MPI\_\-INIT} time (at
  least it is as of this writing -- who knows what it will be when we
  release 7.1? :-).

  This whole section is for 7.1.
\end{discuss}