Package: rocblas / 5.5.1+dfsg-7
Metadata
Package | Version | Patches format |
---|---|---|
rocblas | 5.5.1+dfsg-7 | 3.0 (quilt) |
Patch series
Patch | File delta | Description |
---|---|---|
0001 use generic blas for reference.patch | (download) |
clients/CMakeLists.txt |
22 2 + 20 - 0 ! |
use generic blas for reference The upstream project typically uses either the AOCL BLIS library or the Netlib BLAS library as the reference implementation in the test suite on Linux. However, the OpenBLAS library is used by upstream on Windows. It would be nice to use OpenBLAS on Debian for performance reasons (as the test suite is heavily CPU-bound), but the Netlib implementation seems to be more reliable for achieving a full suite of passing tests. |
0002 remove use of pip and virtualenv.patch | (download) |
CMakeLists.txt |
27 2 + 25 - 0 ! |
remove use of pip and virtualenv The upstream project creates a virtualenv and uses pip to install the Python dependencies during a build. In the Debian build, all the Python dependencies are already provided by packages, so there's no need for all that complexity. When contributed upstream, this functionality was guarded behind the cmake option -DBUILD_WITH_PIP=OFF. Tensile_ROOT can also be passed from d/rules (if necessary) so this patch can be dropped with ROCm 5.7. |
0003 use local mathjax.patch | (download) |
docs/source/conf.py |
3 3 + 0 - 0 ! |
use local mathjax The sphinx.ext.mathjax extension defaults to loading mathjax from a CDN, which results in the lintian warning 'privacy-breach-generic'. Use a local copy of mathjax to prevent that problem. |
0004 mark known bugs.patch | (download) |
clients/gtest/known_bugs.yaml |
2 2 + 0 - 0 ! |
mark known bugs In ROCm 5.5, the FP16 High-Precision Accumulate checks are also offset from the correct answer by margins slightly greater than those allowed. |
0005 make openmp optional.patch | (download) |
clients/common/blis_interface.cpp |
2 2 + 0 - 0 ! |
make openmp optional |
0006 move tensile library into versioned subdir.patch | (download) |
library/src/tensile_host.cpp |
10 9 + 1 - 0 ! |
move tensile library into versioned subdir The Tensile library contains optimized kernels that are loaded at runtime by rocblas, and thus must be a part of the library package. |
0007 remove references to dfsg violating kernels.patch | (download) |
library/src/blas3/Tensile/Logic/asm_full/aldebaran/aldebaran_Cijk_Ailk_Bjlk_DB.yaml |
169126 0 + 169126 - 0 ! |
remove references to dfsg-violating kernels The DGEMM_Aldebaran_PKFixedAtomic512Latest and DGEMM_Aldebaran_PKFixedAtomic512_104 kernels were removed for DFSG reasons, and references to those kernels must be removed to fix the build. This will result in a performance drop on MI200 GPUs because the tuned assembly kernels will be replaced with fallback implementations for these problems. This problem has been reported upstream and they intend to supply a better fix. |
0008 ensure replacementkernels cov3 dir exists.patch | (download) |
tensile/Tensile/ReplacementKernels-cov3/README.txt |
1 1 + 0 - 0 ! |
ensure replacementkernels-cov3 dir exists All files in this directory were removed for DFSG violations, but the directory itself is still required for the build. |
0009 hide kernel symbols.patch | (download) |
library/src/blas2/rocblas_trsv_kernels.cpp |
2 1 + 1 - 0 ! |
hide kernel symbols If not marked as static, the rocblas kernels would be weak public symbols. They are not intended to be visible, but are not affected by -fvisibility=hidden and cannot be entirely hidden except by being marked as static. Applied-Upstream: https://github.com/ROCmSoftwarePlatform/rocBLAS/commit/c311f3ce684368091acae744c924bcddea4add33 |
0010 fix sample includes.patch | (download) |
clients/samples/example_c_dgeam.c |
2 1 + 1 - 0 ! |
fix sample includes |
0011 disable stdc extension in header.patch | (download) |
library/include/internal/rocblas-types.h |
3 0 + 3 - 0 ! |
disable stdc extension in header The request for any STDC extension should not be controlled by a header or else the behaviour of the program will change depending on the order of the includes. This define is being removed upstream in ROCm 6.0. Bug: https://github.com/ROCmSoftwarePlatform/rocBLAS/issues/1301 |
0012 expand isa compatibility.patch | (download) |
library/src/handle.cpp |
26 26 + 0 - 0 ! |
expand isa compatibility This is not an ideal solution, but there are a number of ISAs that are subsets of gfx900, gfx1010 and gfx1030. The simplest way to get rocBLAS and Tensile to load the compatible kernels when running on architectures compatible with those ISAs is to simply report the GPU as being of the supported type. There is no way this patch would be accepted upstream as it is expected that they will implement a better solution... eventually. |
0013 disable rotg nan check.patch | (download) |
clients/gtest/blas1_gtest.yaml |
18 15 + 3 - 0 ! |
[patch] refactor rotg_test code (#1632) * refactor rotg_test code * for rotg use alpha and beta in place of rotga and rotgb * correct rotg initialization Bug: https://github.com/ROCmSoftwarePlatform/rocBLAS/issues/1287 |
0014 spellcheck.patch | (download) |
clients/include/rocblas_common.yaml |
2 1 + 1 - 0 ! |
spellcheck All fixes have been forwarded. Some were also fixed upstream in https://github.com/ROCmSoftwarePlatform/rocBLAS/commit/53c8ce8d3eb2eee9c7ca6711522efbf882de1646 |
0015 move rocsblas test data to share.patch | (download) |
clients/gtest/rocblas_gtest_main.cpp |
23 22 + 1 - 0 ! |
move rocblas test data to share The rocblas_test.data file is a binary file containing arguments to test with rocblas functions (e.g., various combinations of matrix sizes and other similar options). It is created by rocblas_gentest.py and is architecture-independent (as it always uses network byte order). |
0016 disable replacement kernels.patch | (download) |
library/src/blas3/Tensile/Logic/archive/vega20_Cijk_Ailk_Bjlk_DB.yaml |
172 86 + 86 - 0 ! |
disable replacement kernels The replacement kernels were removed for DFSG reasons. Attempting to use them in rocBLAS anyway will cause non-deterministic errors at build-time and run-time. The upstream project is committed to eliminating the closed-source kernels, so this should be a non-issue in the near future. Bug-Debian: http://bugs.debian.org/1042036 |
0017 print kernel name for missing attribute error.patch | (download) |
tensile/Tensile/KernelWriter.py |
2 2 + 0 - 0 ! |
print kernel name for missing attribute error |
0018 verbose tensile source kernel build.patch | (download) |
tensile/Tensile/TensileCreateLibrary.py |
9 2 + 7 - 0 ! |
verbose tensile source kernel build The build of the Tensile source kernels takes quite a long time, so it may time out on slower machines if there is no output for too long. The verbose flag should add some output at the start of the build for each offload architecture, which should help prevent a timeout. |
0019 remove x86 intrinsics.patch | (download) |
clients/include/rocblas_math.hpp |
1 0 + 1 - 0 ! |
remove x86 intrinsics The x86 intrinsics don't seem to be used. |
0020 msgpack names.patch | (download) |
tensile/Tensile/Source/lib/CMakeLists.txt |
2 1 + 1 - 0 ! |
[patch] fix for newer windows vcpkg msgpack (#1827) |
0021 msgpack cxx support.patch | (download) |
tensile/Tensile/Source/lib/CMakeLists.txt |
4 3 + 1 - 0 ! |
[patch] another vcpkg version package name fix (#1836) * more vcpkg package options |
0022 reserved identifiers.patch | (download) |
library/include/internal/rocblas-auxiliary.h |
8 4 + 4 - 0 ! |
[patch] fix reserved identifiers in include guards (#1600) The include guards have been changed to the filename in uppercase letters with all non-alphanumeric symbols replaced by underscore. This include guard pattern matches the guard that is used for the generated file rocsparse-export.h. The C and C++ standards reserve all identifiers that begin with an underscore followed by a capital letter [C99 7.1.3] [C++11 17.6.4.3.2]. |
0023 remove mf16c flag.patch | (download) |
clients/benchmarks/CMakeLists.txt |
4 0 + 4 - 0 ! |
[patch] remove mf16c flag as f16 intrinsics _cvtss_sh, _cvtsh_ss no longer used Bug: https://github.com/ROCm/rocBLAS/issues/1422 Bug-Debian: https://bugs.debian.org/1075724 |
0024 use xnack specialized assembly kernels with gfx90a.patch | (download) |
CMakeLists.txt |
5 4 + 1 - 0 ! |
use xnack-specialized assembly kernels with gfx90a This change passes the xnack-specialized targets gfx90a:xnack- and gfx90a:xnack+ for the Tensile architectures when rocBLAS is built for the non-specialized gfx90a target. This helps to reduce the library binary size without affecting the assembly kernels in Tensile. |
0025 spelling.patch | (download) |
clients/benchmarks/client.cpp |
2 1 + 1 - 0 ! |
fix spelling |
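
Note on patch 0003 (use local mathjax): the idea can be sketched as a Sphinx conf.py fragment. `mathjax_path` is the configuration value read by `sphinx.ext.mathjax`. The file location and the `config` query string are assumptions about a MathJax 2 copy installed by Debian's libjs-mathjax package, and this fragment is an illustration, not the contents of the patch.

```python
# Sketch of a Sphinx conf.py fragment (illustrative, not the actual patch).
extensions = ["sphinx.ext.mathjax"]

# Assumption: a system-wide MathJax copy lives at this path
# (as installed by Debian's libjs-mathjax package).
LOCAL_MATHJAX = "/usr/share/javascript/mathjax/MathJax.js"

# Point Sphinx at the local copy so the generated HTML does not
# load scripts from a CDN when a reader opens the documentation.
mathjax_path = "file://" + LOCAL_MATHJAX + "?config=TeX-AMS-MML_HTMLorMML"
```

With a setting like this, the built HTML refers only to files on the local system, which is what avoids the 'privacy-breach-generic' lintian warning.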
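
Note on patch 0015 (test data in share): a file written in a fixed byte order reads back identically on any host, which is why such data can be treated as architecture-independent. The record layout below (two 32-bit integers and one double) is hypothetical and is not the real rocblas_test.data format; it only illustrates the byte-order point using Python's `struct` module.

```python
import struct

# Hypothetical record: M and N as 32-bit ints, alpha as a 64-bit float.
# The "!" prefix selects network (big-endian) byte order with standard
# sizes and no padding, independent of the host CPU.
RECORD = struct.Struct("!iid")


def encode(m: int, n: int, alpha: float) -> bytes:
    """Serialize one test-argument record in network byte order."""
    return RECORD.pack(m, n, alpha)


def decode(raw: bytes) -> tuple:
    """Parse a record written by encode() on any architecture."""
    return RECORD.unpack(raw)


if __name__ == "__main__":
    raw = encode(1024, 512, 1.5)
    print(raw.hex(), decode(raw))
```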
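
Note on patch 0022 (reserved identifiers): the naming rule in the description (filename in uppercase, every non-alphanumeric character replaced by an underscore) can be sketched as a small helper. The function names are hypothetical, and `starts_reserved` checks only the one rule quoted in the description (a leading underscore followed by a capital letter), not the full set of reserved names in the C and C++ standards.

```python
import re


def include_guard(filename: str) -> str:
    """Derive a guard name: uppercase, non-alphanumerics become '_'."""
    return re.sub(r"[^A-Za-z0-9]", "_", filename).upper()


def starts_reserved(identifier: str) -> bool:
    """True if the name begins with '_' followed by a capital letter."""
    return re.match(r"_[A-Z]", identifier) is not None


if __name__ == "__main__":
    guard = include_guard("rocblas-auxiliary.h")
    print(guard, starts_reserved(guard))
```

A guard derived this way starts with a letter whenever the filename does, so it stays outside the reserved pattern.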
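
Note on patch 0024 (xnack-specialized kernels): the target substitution described there can be sketched as a list transformation. The actual patch edits CMakeLists.txt; this Python function and its name are hypothetical and only model the logic of replacing the generic gfx90a target with its two xnack-specialized variants.

```python
def expand_tensile_targets(targets: list[str]) -> list[str]:
    """Replace generic gfx90a with its xnack-specialized variants."""
    expanded = []
    for target in targets:
        if target == "gfx90a":
            # The generic target is swapped for both specializations.
            expanded.extend(["gfx90a:xnack-", "gfx90a:xnack+"])
        else:
            # All other targets pass through unchanged.
            expanded.append(target)
    return expanded


if __name__ == "__main__":
    print(expand_tensile_targets(["gfx900", "gfx90a", "gfx1030"]))
```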