1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118
|
Serial and MPI Builds
=====================
h5py is available for both serial and MPI builds of HDF5, provided by
python3-h5py-serial and python3-h5py-mpi, respectively. The serial
build is provided to avoid the library overheads associated with the
MPI build, when only serial use is required.
In order to provide flexibility, both can be installed at the same
time. The choice of which build to use is determined at runtime.
In general you would want a serial job to use HDF5-serial and an MPI
job to use HDF5-mpi. The installation is set up so that if the job is
run in a serial process, then the serial build is invoked
(h5py/_debian_h5py_serial).
If the job is run in an MPI process (openmpi or mpich), then the MPI
build is invoked (h5py/_debian_h5py_mpi). The test for MPI uses the
environment variables OMPI_COMM_WORLD_SIZE and MPI_LOCALNRANKS, set
automatically at runtime by OpenMPI and MPICH (mpirun), respectively.
Selection between serial and mpi builds of h5py is automatic by
default (performed in h5py/__init__.py). In normal usage you need
only specify "import h5py" as usual without having to specify if you
need the serial or mpi version.
You can check at runtime if mpi is active from h5py.get_config().mpi.
$ python3 -c "import h5py; print(h5py.get_config().mpi)"
should return "False" (serial mode)
$ mpirun -n 1 python3 -c "import h5py; print(h5py.get_config().mpi)"
should return "True" (MPI mode)
Enforcing use of h5py-mpi in a serial job
=========================================
There may be particular circumstances in which you want to access the
MPI build of HDF5 even from a serial job. This might be the case if
you have an app that accesses the HDF5 MPI API directly, and so is
built against libhdf5-*mpi.so not libhdf5-serial.so. If the same app
also uses h5py for convenience, then it should use the equivalent
build, i.e. h5py-mpi not h5py-serial, even if it might be run as a
serial job. Or you may have reason to access h5py's MPI API (file
handling with mpio) even from a serial job.
In this case you have three options for enforcing the use of the MPI
build in a serial job.
1) You could invoke as a single process MPI job
$ mpirun -n 1 python3 myscript.py
(technically this is not a serial job as such, but an MPI job with one
process)
2) You can force the use of the MPI build by setting
the environment variable H5PY_ALWAYS_USE_MPI.
3) Alternatively, You may import the MPI build directly, using
import h5py._debian_h5py_mpi as h5py
instead of
import h5py
Option 2 is probably the most explicit, but the other alternatives
will work.
Obviously python3-h5py-mpi must be installed for any of these to be
effective.
It is possible to install python3-h5py-mpi without installing
python3-h5py-serial. In this case h5py-mpi will be loaded in serial
jobs as a fallback, and no further action or environment variable
needs to be set.
Enforcing use of h5py-serial in an MPI job
==========================================
Normally h5py-serial is expected to be used in serial jobs, and
h5py-mpi is expected to be used in MPI jobs. But it is possible to
force the use of one in the runtime context of the other.
The fallback mechanism means that if the expected variant is not
found, then the other variant is loaded instead.
This fallback mechanism means that if python3-h5py-serial is installed
while python3-h5py-mpi is not installed, then MPI jobs will load
h5py-serial. The fallback works both ways: if python3-h5py-mpi is
installed while python3-h5py-serial is not installed, then serial jobs
will load h5py-mpi.
There may be particular circumstances in which you want to access the
serial build of HDF5 even from a MPI job. This might be the case if
you have an app that accesses the HDF5 API directly, and is
built against libhdf5-serial.so instead of libhdf5-*mpi.so not. If the same app
also uses h5py for convenience, then arguably it should use the equivalent
build, i.e. h5py-serial not h5py-mpi, even if it might be run as an
MPI job.
In this case you have two options for enforcing the use of the serial
build in a MPI job.
1) You can force the use of the serial build by setting
the environment variable H5PY_NEVER_USE_MPI.
2) You may import the serial build directly, using
import h5py._debian_h5py_serial as h5py
instead of
import h5py
Obviously python3-h5py-serial must be installed for any of these to be
effective.
If both H5PY_ALWAYS_USE_MPI and H5PY_NEVER_USE_MPI are set, then
H5PY_ALWAYS_USE_MPI takes precedence: h5py-mpi will be loaded.
It is possible to install python3-h5py-serial without installing
python3-h5py-mpi. In this case h5py-serial will be loaded in MPI
jobs as a fallback, and no further action or environment variable
needs to be set.
|