# Multiple OpenMP Runtimes

## Context

OpenMP is an API specification for parallel programming. There are many
implementations of it, tied to a compiler most of the time:

-  `libgomp` for GCC (GNU Compiler Collection),
-  `libomp` for Clang (LLVM C/C++ Compiler),
-  `libiomp` for ICC (Intel C/C++ Compiler),
-  `vcomp` for MSVC (Microsoft Visual Studio C/C++ Compiler).

In general, it is not advised to have different OpenMP runtime libraries (or
even different copies of the same library) loaded at the same time in a
program: this is considered undefined behavior. Fortunately, in most
situations it is not as bad as it sounds.

However, this situation is frequent in the Python ecosystem, since you can
install packages compiled with different compilers (hence linked to different
OpenMP implementations) and import them together in a Python program.

A typical example is installing NumPy from Anaconda, which is linked against
MKL (Intel's math library), alongside another package that uses
multi-threading with OpenMP directly in a compiled extension, as is the case in
Scikit-learn (via Cython `prange`), LightGBM and XGBoost (via pragmas in the
C++ source code).

In our experience, **most OpenMP libraries can seamlessly coexist in the same
program**. For instance, on Linux, we have never observed any issue between
`libgomp` and `libiomp`, which is the most common mix (NumPy with MKL + a
package compiled with GCC, the most widely used C compiler on that platform).

## Incompatibility between Intel OpenMP and LLVM OpenMP under Linux

The only unrecoverable incompatibility we encountered happens when loading a
mix of compiled extensions linked with **`libomp` (LLVM/Clang) and `libiomp`
(ICC), on Linux**. It manifests as crashes or deadlocks, and can happen even
with the simplest OpenMP calls, such as getting the maximum number of threads
that will be used in a subsequent parallel region. A possible explanation is
that `libomp` is actually a fork of `libiomp`, which could cause symbol name
collisions, for instance. Using `threadpoolctl` may crash your program in such
a setting.
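
To check whether a process is in this situation, one option is to look at which
shared libraries are currently mapped into it. The following is only a sketch
under stated assumptions: it is not part of `threadpoolctl`, the helper names
(`openmp_runtimes`, `loaded_library_paths`, `has_llvm_intel_mix`) are
hypothetical, and it identifies runtimes purely by filename prefix. It reads
`/proc/self/maps` (Linux only) and makes no OpenMP call itself. It can only
report on libraries that are already loaded, so it should run after the
relevant packages have been imported.

```python
import os

# Filename prefixes of the OpenMP runtimes discussed in this document.
OPENMP_PREFIXES = ("libgomp", "libiomp", "libomp")


def openmp_runtimes(paths):
    """Group library paths by the OpenMP runtime prefix of their filename."""
    found = {}
    for path in paths:
        name = os.path.basename(path)
        for prefix in OPENMP_PREFIXES:
            if name.startswith(prefix):
                found.setdefault(prefix, set()).add(path)
    return found


def loaded_library_paths(maps_file="/proc/self/maps"):
    """Return the file paths mapped into the current process (Linux only)."""
    paths = set()
    try:
        with open(maps_file) as f:
            for line in f:
                # Fields: address, perms, offset, dev, inode, [pathname]
                fields = line.split(None, 5)
                if len(fields) == 6 and fields[5].startswith("/"):
                    paths.add(fields[5].strip())
    except OSError:
        pass  # not on Linux, or /proc is unavailable
    return paths


def has_llvm_intel_mix(runtimes):
    """True if both the LLVM and the Intel OpenMP runtimes were found."""
    return "libomp" in runtimes and "libiomp" in runtimes


runtimes = openmp_runtimes(loaded_library_paths())
for prefix, libs in sorted(runtimes.items()):
    print(prefix, sorted(libs))
if has_llvm_intel_mix(runtimes):
    print("warning: both libomp and libiomp are loaded in this process")
```

An empty result on a platform other than Linux says nothing about which
runtimes are loaded; it only means the memory map could not be read.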

**Fortunately, this problem is very rare**: at the time of writing, all major
binary distributions of Python packages for Linux use either GCC or ICC to
build the Python scientific packages. This problem would therefore only happen
if some packagers decided to start shipping Python packages built with
LLVM/Clang instead of GCC.

Surprisingly, we have never encountered this kind of issue on macOS, where this
mix is the most frequent (Clang being the default C compiler on macOS).

## Workarounds for the Intel OpenMP and LLVM OpenMP case

As far as we know, the only workaround consists in making sure that only one of
the two incompatible OpenMP libraries is loaded. For example:

- Tell MKL (used by NumPy) to use the GNU OpenMP runtime instead of the Intel
  OpenMP runtime by setting the following environment variable:

      export MKL_THREADING_LAYER=GNU

- Install a build of NumPy and SciPy linked against OpenBLAS instead of MKL.
  This can be done for instance by installing NumPy and SciPy from PyPI:

      pip install numpy scipy

  from the conda-forge conda channel:

      conda install -c conda-forge numpy scipy

  or from the default conda channel:

      conda install numpy scipy blas[build=openblas]

- Re-build your OpenMP-enabled extensions from source with GCC (or ICC) instead
  of Clang if you want to keep on using NumPy/SciPy linked against MKL with the
  default `libiomp`-based threading layer.
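
When exporting the variable in the shell is not convenient, the first
workaround can also be applied from inside the Python program. The sketch below
assumes that the variable only needs to be in the environment before NumPy (and
hence MKL) is first imported. Exactly when MKL reads it is not covered here, so
setting it at the very top of the entry script is the conservative choice. The
helper name `request_gnu_threading_layer` is hypothetical.

```python
import os
import sys


def request_gnu_threading_layer():
    """Set MKL_THREADING_LAYER=GNU unless NumPy has already been imported.

    Returns True if the variable was set, False if it was too late to be
    sure it would have any effect.
    """
    if "numpy" in sys.modules:
        # MKL may already have selected its threading layer.
        return False
    os.environ["MKL_THREADING_LAYER"] = "GNU"
    return True


# Call this before any import that may load MKL (NumPy, SciPy, ...).
if not request_gnu_threading_layer():
    print("NumPy is already imported; set MKL_THREADING_LAYER earlier")
```

This has no effect on a NumPy build that is not linked against MKL.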

## References

The above incompatibility has been reported upstream to the LLVM and Intel
developers on the following public issue trackers/forums along with a minimal
reproducer written in C:

- https://bugs.llvm.org/show_bug.cgi?id=43565
- https://software.intel.com/en-us/forums/intel-c-compiler/topic/827607