File: BGQ.md

# BGQ

This page describes how to build mpich from the
[main](https://github.com/pmodels/mpich/tree/main)
branch of the MPICH git [repository](https://github.com/pmodels/mpich).

## Blue Gene/Q Build Instructions

A bgq toolchain must be specified to cross compile the mpich source. The
toolchain can be specified explicitly by setting the `CC`, `CXX`, and other
environment variables to the desired compilers, or configure can detect
and use the cross compilers found in the `PATH` environment variable.

### Allow Configure To Determine The Compilers

Configuring is simpler when only `$PATH` is set; however, the
generated mpi compile scripts such as `${prefix}/bin/mpicc` will
not contain the path to the cross compiler. Users of `mpicc` must
therefore also have the cross compiler in their `$PATH` for the
compile script to work.

The current compiler search order in mpich is specified in various m4
macro files. For example, the C compiler search path is specified in the
[confdb/aclocal_cc.m4](https://github.com/pmodels/mpich/blob/main/confdb/aclocal_cc.m4)
file.

```
AC_PROG_CC([icc pgcc xlc xlC pathcc gcc clang cc])
```

On bgq, configure therefore searches the user's `$PATH` for the
following compilers, in this order, and uses the first one found. Take
care when modifying `$PATH`, especially when it includes multiple
toolchains.

1.  `powerpc64-bgq-linux-icc`
2.  `powerpc64-bgq-linux-pgcc`
3.  `powerpc64-bgq-linux-xlc`
4.  `powerpc64-bgq-linux-xlC`
5.  `powerpc64-bgq-linux-pathcc`
6.  `powerpc64-bgq-linux-gcc`
7.  `powerpc64-bgq-linux-clang`
8.  `powerpc64-bgq-linux-cc`
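
To see which compiler configure is likely to pick up, the search order listed above can be approximated by hand. The sketch below is not the configure logic itself; it only walks the same names in the same order and prints the first one found in `$PATH`. The function name `first_bgq_cc` is hypothetical.

```shell
#!/bin/sh
# Print the first bgq cross compiler found in $PATH, following the
# search order listed above. Returns non-zero if none is found.
# NOTE: first_bgq_cc is a hypothetical helper, not part of mpich.
first_bgq_cc() {
    for name in icc pgcc xlc xlC pathcc gcc clang cc; do
        if command -v "powerpc64-bgq-linux-$name" >/dev/null 2>&1; then
            command -v "powerpc64-bgq-linux-$name"
            return 0
        fi
    done
    return 1
}

first_bgq_cc || echo "no bgq cross compiler found in PATH" >&2
```

Running this after each change to `$PATH` is a quick way to confirm that the intended toolchain wins when several are present.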

#### BGQ GNU Toolchain

To use the gnu toolchain installed with the BGQ system software, edit
the user's `$PATH` as below.

##### GCC Version 4.4.6

This is the default toolchain installed with the bgq system software.

```
export PATH=$PATH:/bgsys/drivers/V1R2M1/ppc64/gnu-linux/bin
```

##### GCC Version 4.7.2

The 4.7.2 version of the gnu toolchain is unsupported by IBM; however,
instructions for building and installing a 4.7.2 toolchain are provided
with the system software. See the
`/bgsys/drivers/V1R2M1/ppc64/toolchain-4.7.2/README.toolchain` file for
more information.

The user's `$PATH` must be similarly modified once the 4.7.2 toolchain is
installed.

```
export PATH=$PATH:${GNU_TOOLCHAIN_4_7_2}/gnu-linux-4.7.2/bin
```

#### BGQ XL Toolchain

Beginning with V1R2M1 appropriately named symlinks to the bg xl
compilers are provided with the installed system software.

```
$ ls -lah /bgsys/drivers/ppcfloor/gnu-linux/powerpc64-bgq-linux/bin/*xl*
lrwxrwxrwx 1 root root 35 May 20 16:19 powerpc64-bgq-linux-xlc -> /opt/ibmcmp/vac/bg/12.1/bin/bgxlc_r
lrwxrwxrwx 1 root root 37 May 20 16:19 powerpc64-bgq-linux-xlC -> /opt/ibmcmp/vacpp/bg/12.1/bin/bgxlC_r
lrwxrwxrwx 1 root root 35 May 20 16:19 powerpc64-bgq-linux-xlf -> /opt/ibmcmp/xlf/bg/14.1/bin/bgxlf_r
lrwxrwxrwx 1 root root 39 May 20 16:19 powerpc64-bgq-linux-xlf2003 -> /opt/ibmcmp/xlf/bg/14.1/bin/bgxlf2003_r
lrwxrwxrwx 1 root root 39 May 20 16:19 powerpc64-bgq-linux-xlf2008 -> /opt/ibmcmp/xlf/bg/14.1/bin/bgxlf2008_r
lrwxrwxrwx 1 root root 37 May 20 16:19 powerpc64-bgq-linux-xlf90 -> /opt/ibmcmp/xlf/bg/14.1/bin/bgxlf90_r
lrwxrwxrwx 1 root root 37 May 20 16:19 powerpc64-bgq-linux-xlf95 -> /opt/ibmcmp/xlf/bg/14.1/bin/bgxlf95_r
```

To use the bg xl compilers, the user's path must be modified as below.

```
export PATH=$PATH:/bgsys/drivers/ppcfloor/gnu-linux/powerpc64-bgq-linux/bin
```

Prior to V1R2M1 the symlinks to the bg xl compilers are not provided.
The solution is to simply create the symlinks in some other directory
and modify the user's $PATH accordingly.

```
$ ls -lah ${HOME}/bgxl/powerpc64-bgq-linux/bin | grep ^l
lrwxrwxrwx 1 johndoe johndoe   35 Jul 18 07:30 powerpc64-bgq-linux-xlc -> /opt/ibmcmp/vac/bg/12.1/bin/bgxlc_r
lrwxrwxrwx 1 johndoe johndoe   37 Jul 18 07:30 powerpc64-bgq-linux-xlC -> /opt/ibmcmp/vacpp/bg/12.1/bin/bgxlC_r
lrwxrwxrwx 1 johndoe johndoe   35 Jul 18 07:30 powerpc64-bgq-linux-xlf -> /opt/ibmcmp/xlf/bg/14.1/bin/bgxlf_r
lrwxrwxrwx 1 johndoe johndoe   39 Jul 18 07:30 powerpc64-bgq-linux-xlf2003 -> /opt/ibmcmp/xlf/bg/14.1/bin/bgxlf2003_r
lrwxrwxrwx 1 johndoe johndoe   39 Jul 18 07:30 powerpc64-bgq-linux-xlf2008 -> /opt/ibmcmp/xlf/bg/14.1/bin/bgxlf2008_r
lrwxrwxrwx 1 johndoe johndoe   37 Jul 18 07:30 powerpc64-bgq-linux-xlf90 -> /opt/ibmcmp/xlf/bg/14.1/bin/bgxlf90_r
lrwxrwxrwx 1 johndoe johndoe   37 Jul 18 07:30 powerpc64-bgq-linux-xlf95 -> /opt/ibmcmp/xlf/bg/14.1/bin/bgxlf95_r

$ export PATH=$PATH:${HOME}/bgxl/powerpc64-bgq-linux/bin
```
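
The symlinks in the listing above could be created with something like the following sketch. The link targets are copied from that listing and may differ on a given installation; `BGXL_BIN` is a hypothetical variable that defaults to the directory used in the example.

```shell
#!/bin/sh
# Create powerpc64-bgq-linux-* symlinks to the bg xl compilers in a
# user-owned directory, then append that directory to $PATH.
# ASSUMPTION: BGXL_BIN is a hypothetical variable; the targets below are
# taken from the listing above and may not match every installation.
BGXL_BIN="${BGXL_BIN:-${HOME}/bgxl/powerpc64-bgq-linux/bin}"
mkdir -p "$BGXL_BIN"

# C and C++ compilers.
ln -sf /opt/ibmcmp/vac/bg/12.1/bin/bgxlc_r   "$BGXL_BIN/powerpc64-bgq-linux-xlc"
ln -sf /opt/ibmcmp/vacpp/bg/12.1/bin/bgxlC_r "$BGXL_BIN/powerpc64-bgq-linux-xlC"

# Fortran compilers share one directory and one naming pattern.
for f in xlf xlf2003 xlf2008 xlf90 xlf95; do
    ln -sf "/opt/ibmcmp/xlf/bg/14.1/bin/bg${f}_r" "$BGXL_BIN/powerpc64-bgq-linux-$f"
done

export PATH="$PATH:$BGXL_BIN"
```

Because `ln -sf` replaces an existing link, the script can be rerun safely after a compiler upgrade changes the target paths.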

#### BGQ Clang Toolchain

A bgq version of the clang/llvm compilers, provided by ANL, can be
installed and similarly referenced.

See <http://www.alcf.anl.gov/user-guides/bgclang-compiler> and
<https://trac.alcf.anl.gov/projects/llvm-bgq> for more information.

```
export PATH=$PATH:/home/projects/llvm/wbin/bgclang/bin
```

### Override Configure Settings With The Compiler Environment Variables

The relevant compiler environment variables can be directly specified on
the configure command. This will override any compiler settings
determined by configure.

This approach will specify the full path to the compilers in the
generated `${prefix}/bin/mpicc` script. Users will **not** be required
to have the bgq cross compilers in their `$PATH`.

```
$ CC=/soft/compilers/ibmcmp-feb2014/vac/bg/12.1/bin/bgxlc_r                 \
CXX=/soft/compilers/ibmcmp-feb2014/vacpp/bg/12.1/bin/bgxlC_r                \
F77=/soft/compilers/ibmcmp-feb2014/xlf/bg/14.1/bin/bgxlf_r                  \
FC=/soft/compilers/ibmcmp-feb2014/xlf/bg/14.1/bin/bgxlf90_r                 \
AR=/bgsys/drivers/V1R2M1/ppc64/gnu-linux/powerpc64-bgq-linux/bin/ar         \
LD=/bgsys/drivers/V1R2M1/ppc64/gnu-linux/powerpc64-bgq-linux/bin/ld         \
RANLIB=/bgsys/drivers/V1R2M1/ppc64/gnu-linux/powerpc64-bgq-linux/bin/ranlib \
configure --host=powerpc64-bgq-linux --with-device=pamid
```

### Required

Specify the bgq cross compile host and the pamid device.

```
--host=powerpc64-bgq-linux
--with-device=pamid
```

Customize the ROMIO file system.

The latest versions of mpich include a ROMIO implementation tuned for
GPFS and further optimized for the Blue Gene environment.

```
--with-file-system=gpfs:BGQ
```

In mpich 3.1 and earlier versions the GPFS adio is unavailable; those
versions instead include a unique Blue Gene adio implementation.

```
--with-file-system=bg+bglockless
```

### Optional

#### Customize The Required BGQ System Software Libraries

The latest installed bgq system software is used by default. The
location of the bgq system software can also be specified with the
configure option below or the `BGQ_INSTALL_DIR` environment
variable.

```
--with-bgq-install-dir=/bgsys/drivers/V1R2M0/ppc64
```

A pami installation outside of the bgq system software directory may be
specified using the `--with-pami` configure option(s). For example:

```
--with-pami=/bgsys/drivers/V1R2M0/ppc64/comm/sys
--with-pami-include=/bgsys/drivers/V1R2M0/ppc64/comm/sys/include
--with-pami-lib=/bgsys/drivers/V1R2M0/ppc64/comm/sys/lib
```

#### Customize The BGQ Cross Compile Settings

A different cross compile settings file for bgq pamid can be specified
using the `--with-cross-file` configure option. The option below
specifies the default cross file for a bgq pamid configuration.

```
--with-cross-file=src/mpid/pamid/cross/bgq8
```

#### Disable rpath

When shared libraries are installed it is recommended to also *disable*
the "wrapper rpath" configure option in order to take advantage of a
shared library load optimization on the bgq io nodes.

```
--disable-wrapper-rpath
```

When a million processes each individually read from the filesystem the
performance of the shared library load will be poor. The io node shared
library optimization is a way to "stage" shared libraries on a bgq io
node ramfs directory that is, in the absence of rpath information,
searched first by the bgq loader. Any rpath information will be searched
before this io node ramfs location and will result in a query all the
way down to the filesystem.

Shared libraries can be added to the io node ramfs directory by
packaging the libraries into a `*.tar.gz` file and copying that file
into the `/bgsys/linux/bgfs` directory.
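
A minimal sketch of that packaging step follows. The helper name `stage_libs`, its arguments, and the archive name `mpich-libs.tar.gz` are all hypothetical, and the layout the io node expects inside the archive is not described here, so treat this only as a starting point and check the system documentation.

```shell
#!/bin/sh
# Package the shared libraries from one directory into a *.tar.gz file
# and copy that file into the io node staging directory.
# ASSUMPTIONS: stage_libs, its arguments, and the default archive name
# are hypothetical; the archive layout expected by the io node is not
# specified on this page.
stage_libs() {
    libdir=$1                         # e.g. ${prefix}/lib
    dest=${2:-/bgsys/linux/bgfs}      # staging directory named above
    archive=${3:-mpich-libs.tar.gz}   # hypothetical archive name
    tmp=$(mktemp -d) || return 1
    # Collect only shared objects; fails if the directory has none.
    ( cd "$libdir" && tar -czf "$tmp/$archive" ./*.so* ) || return 1
    cp "$tmp/$archive" "$dest/" && rm -rf "$tmp"
}

# Example (not run here): stage_libs "${prefix}/lib"
```

Writing to `/bgsys/linux/bgfs` normally requires administrator rights, which is why the destination is a parameter rather than hard-coded.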

#### Enable Common "no debug" And "performance" Options

The *xl.ndebug* and *xl.legacy.ndebug* mpich versions installed with the
bgq system software use the following options to eliminate debug and
other error checks that would cause performance degradations.

```
--enable-fast=nochkmsg,notiming,O3
--with-assert-level=0
--disable-error-messages
--disable-debuginfo
```

#### Enable Fine Grain Locking

The *gcc*, *xl*, and *xl.ndebug* mpich versions installed with the bgq
system software use the following options to enable fine grain locking
and synchronous progress mode.

```
--enable-thread-cs=per-object
--with-atomic-primitives
--enable-handle-allocation=tls
--enable-refcount=lock-free
--disable-predefined-refcount
```

#### Specify Compiler Options

The following compiler flags are used when compiling the mpich library
to be included in the bgq system software installation and are provided
here for guidance.

##### GNU Compiler Options

```
MPICHLIB_CXXFLAGS="-Wall -Wno-unused-function -Wno-unused-label -Wno-unused-variable -fno-strict-aliasing"
MPICHLIB_CFLAGS="${MPICHLIB_CXXFLAGS} -Wno-implicit-function-declaration"
MPICHLIB_FFLAGS="${MPICHLIB_CXXFLAGS}"
MPICHLIB_F90FLAGS="${MPICHLIB_CXXFLAGS}"
```

##### XL Compiler Options

```
MPICHLIB_CXXFLAGS="-qhot -qinline=800 -qflag=i:i -qsaveopt -qsuppress=1506-236"
MPICHLIB_CFLAGS="${MPICHLIB_CXXFLAGS}"
MPICHLIB_FFLAGS="${MPICHLIB_CXXFLAGS}"
MPICHLIB_F90FLAGS="${MPICHLIB_CXXFLAGS}"
```
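
As a rough sketch of how the required and optional settings in this section combine, the script below assembles a single configure command line and prints it rather than running it. `SRC`, `PREFIX`, and the use of `--prefix` are assumptions for illustration; every other option is taken from the sections above.

```shell
#!/bin/sh
# Assemble (and only print) one configure command line from the
# options described on this page.
# ASSUMPTION: SRC and PREFIX are hypothetical locations.
SRC="${SRC:-$HOME/mpich}"
PREFIX="${PREFIX:-$HOME/mpich/install}"

# Required: bgq cross compile, pamid device, GPFS-tuned ROMIO.
set -- --host=powerpc64-bgq-linux --with-device=pamid --with-file-system=gpfs:BGQ

# Optional: drop rpath so the io node shared library staging is used.
set -- "$@" --disable-wrapper-rpath

# Optional: common "no debug" and "performance" options.
set -- "$@" --enable-fast=nochkmsg,notiming,O3 --with-assert-level=0 \
            --disable-error-messages --disable-debuginfo

CONFIGURE_CMD="$SRC/configure --prefix=$PREFIX $*"
echo "$CONFIGURE_CMD"
```

Printing the command first makes it easy to review the option set before committing to a long cross-compile build.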

## Blue Gene/Q Mpich Testsuite Instructions

From a filesystem location that is accessible to the Blue Gene/Q io
nodes, for example `/bgusr/johndoe`, invoke the configure script in the
`test/mpi` directory of the mpich source.

```
$ mkdir /bgusr/johndoe/testsuite && cd /bgusr/johndoe/testsuite
$ /home/johndoe/mpich/test/mpi/configure --srcdir=/home/johndoe/mpich/test/mpi --with-mpi=/home/johndoe/mpich/install
```

The `--srcdir` configure option specifies the location of the testsuite
source; the `--with-mpi` configure option specifies which mpi
installation to use when compiling the tests. Other configure options,
such as `--disable-spawn`, may be specified to skip unsupported
functions.

Once configured, the tests can be compiled and executed using the 
`make testing` makefile rule. Specific make variables will need to be
specified depending on how the jobs are to be launched on a Blue Gene/Q
system.

The `MPIEXEC` variable is needed to specify the job launch mechanism,
which on Blue Gene/Q is the `runjob` command. For more information on
the `runjob` command see chapter 6, **"Submitting jobs"** in the **[IBM
System Blue Gene Solution: Blue Gene/Q System
Administration](http://www.redbooks.ibm.com/redbooks/pdfs/sg247869.pdf)**
redbook.

The `MPITEST_PROGRAM_WRAPPER` variable is needed to supply additional
information to the `runjob` command. This "wrapper" text is inserted
**after** the `$MPIEXEC` command and its arguments, such as the number
of processes in the job, and **before** the name of the test binary to
launch. At a minimum, the `runjob` command needs the compute block and
the `:` separator character. Other `runjob` options, such as
`--timeout`, can be specified as well, although they are not required
to launch the job.
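
The ordering described above can be pictured with a small sketch that only builds and prints the resulting command line. `NP_ARGS` and `TEST_BINARY` are hypothetical stand-ins for what the testsuite supplies for each test, and the block name is a placeholder.

```shell
#!/bin/sh
# Show where the wrapper text lands in the launch command.
# ASSUMPTIONS: NP_ARGS and TEST_BINARY are hypothetical placeholders for
# values the testsuite fills in; the block name is only an example.
MPIEXEC=runjob
MPITEST_PROGRAM_WRAPPER=" --block R00-M1-N06 : "
NP_ARGS="-n 4"
TEST_BINARY="./a.out"

# Order: launcher, its arguments, the wrapper text, then the test binary.
LAUNCH_CMD="$MPIEXEC $NP_ARGS$MPITEST_PROGRAM_WRAPPER$TEST_BINARY"
echo "$LAUNCH_CMD"
```

The leading and trailing spaces inside the wrapper value matter here, since the text is spliced directly between the launcher arguments and the binary name.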

### `bg_console`

Before testing with `runjob`, and directly launching the jobs on a Blue
Gene/Q system, the compute block must be allocated. Typically this is
done using the `bg_console` command shell. For more information on
`bg_console` see section **"Creating and booting I/O blocks and compute
blocks"** in the **[IBM System Blue Gene Solution: Blue Gene/Q System
Administration](http://www.redbooks.ibm.com/redbooks/pdfs/sg247869.pdf)**
redbook.

To begin testing, change to the directory where the configure command
was run (`/bgusr/johndoe/testsuite` in this example), locate the block
that was booted (`R00-M1-N06` in this example), and invoke the following
command:

```
$ cd /bgusr/johndoe/testsuite
$ make testing MPITEST_PROGRAM_WRAPPER=" --block R00-M1-N06 : " MPIEXEC=runjob
```

### Cobalt

Before testing with `runjob`, and directly launching the jobs on a Blue
Gene/Q system, the job must be launched in "interactive" mode. This will
fork a new shell with several important environment variables set.

```
$ qsub -A aurora_app -t 120 -n 32 --mode interactive
Wait for job 294154 to start...
Opening interactive session to VST-02230-13331-32
```

Additional `runjob` options may be needed if the Blue Gene/Q installation
has changed the default behavior of the trace loggers, for example:

```
--verbose ibm.runjob=0 --verbose 0
```

To begin testing, change to the directory where the configure command
was run (`/bgusr/johndoe/testsuite` in this example) and invoke the
following command:

```
$ cd /bgusr/johndoe/testsuite
$ make testing MPITEST_PROGRAM_WRAPPER=" --block $COBALT_PARTNAME --timeout 60 --verbose ibm.runjob=0 --verbose 0 : " MPIEXEC=runjob
```

## Blue Gene/Q development instructions

The product release branches in the
[mpich-ibm.git](http://git.mpich.org/mpich-ibm.git) git repository are
based on the mpich2 1.5 release, and for esoteric historical reasons,
the code in the repository is located in a *mpich2* subdirectory that
does not exist in the original mpich source. This extra directory makes
a simple *git cherry-pick* of a commit from a Blue Gene/Q release branch
onto another mpich branch challenging.

### How To Migrate Commits From A Previous Blue Gene/Q Release Branch

#### Use `git format-patch` To Create Patch Files For Each Commit

For example:

```
git checkout BGQ/IBM_V1R2M0
Checking out files: 100% (8574/8574), done.
Branch BGQ/IBM_V1R2M0 set up to track remote branch BGQ/IBM_V1R2M0 from origin.
Switched to a new branch 'BGQ/IBM_V1R2M0'

git format-patch HEAD~4
0001-CPS-92XKPE-remove-fortran-interface-for-MPIX_Pset_io.patch
0002-CPS-92XKPE-Do-not-use-the-MPIX_Pset_io_node-function.patch
0003-CPS-97VH5U-do-not-disable-short-synchronous-sends.patch
0004-CPS-97RGJN-PAMID-only-fix-for-multi-threaded-MPI_Ibs.patch
```

#### Use `git am` To Apply Each Commit

You may need to edit the commit message into an acceptable format using
`git commit --amend`.

  - If the patch contains the leading `mpich2/` directory then this
    directory must be removed as the patch is applied by using the
    `-p2` option; for example:

```
git am -p2 0001-CPS-92XKPE-remove-fortran-interface-for-MPIX_Pset_io.patch
```

  - The "summary" line of the commit message must not contain any IBM
    "breadcrumbs" such as "Issue 1234", "CPS WXYZ", or "D12345". These
    breadcrumbs need to be moved to the body of the commit message and
    prepended with the "(ibm)" namespace. It is good form to add the
    original commit as well. This helps when tracing the history of the
    code change via gitweb, etc. For example,

```
git log -n1 b68401e3c6ba3bbd2cc0626dac1604242a20f989 # The original commit to be migrated
commit b68401e3c6ba3bbd2cc0626dac1604242a20f989
Author: Michael Blocksome <blocksom@us.ibm.com>
Date:   Mon Apr 29 13:14:35 2013 -0500

    CPS 92XKPE: remove fortran interface for MPIX_Pset_io_node()
    
    The MPIX_Pset_io_node() function has been deprecated.

git commit --amend
git log -n1
commit ee5e30e4ed5cddd10e2ebf72087174284d8d590b
Author: Michael Blocksome <blocksom@us.ibm.com>
Date:   Mon Apr 29 13:14:35 2013 -0500

    Remove fortran interface for MPIX_Pset_io_node()
    
    The MPIX_Pset_io_node() function has been deprecated.

    (ibm) CPS 92XKPE
    (ibm) b68401e3c6ba3bbd2cc0626dac1604242a20f989
```

#### Use `git apply` To Repair Any Merge Conflicts

If `git am` fails to apply a patch, the patch must be applied manually.
The `git am` command places the current patch in the
`.git/rebase-apply/` directory in a file named `0001`. Use the
`git apply` command on this patch to create "reject" files that
can be used to manually repair the files:

```
git apply .git/rebase-apply/0001 -p2 --reject
    # edit edit edit
git add FIXED_FILES
git am --resolved
```

### How To Migrate Commits From MPICH Master To Blue Gene/Q Release Branch

The process is much the same as above:

  - `git format-patch` to get the changes in question
  - `git am` to apply the changes, but use the `--directory=mpich2` flag
    to indicate the Blue Gene/Q release branch tree lives one directory
    lower. No need for the `-p2` flag.
  - if there are conflicts, generate rejects with 
    `git apply --directory=mpich2 .git/rebase-apply/0001 --reject`
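
The steps above can be tried out safely in a throwaway repository. The sketch below builds one under a temporary directory, with a `release` branch whose files live under `mpich2/` and a flat branch carrying the commit to migrate. The repository, branch, and file names are all hypothetical; only the `git format-patch` and `git am --directory=mpich2` usage mirrors the list above.

```shell
#!/bin/sh
# Replay a commit from a flat tree onto a branch whose sources live one
# directory lower, under mpich2/.
# ASSUMPTION: the repository, branch, and file names are hypothetical.
set -e
work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"
git config user.name "John Doe"
git config user.email "johndoe@example.com"

# Branch "release": sources live under mpich2/.
mkdir mpich2
echo "v1" > mpich2/README
git add mpich2/README
git commit -q -m "Release layout"
git branch release

# Current branch: flat layout, plus the commit we want to migrate.
git mv mpich2/README README
git commit -q -m "Flat layout"
echo "v2" > README
git commit -q -a -m "Update README"
git format-patch -q -1 -o "$work/patches" HEAD

# Apply on the release branch, one directory lower; no -p2 is needed.
git checkout -q release
git am -q --directory=mpich2 "$work"/patches/0001-*.patch
cat mpich2/README
```

If `git am` stops on a conflict in a real migration, the `git apply --directory=mpich2 ... --reject` step from the list above takes over from there.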