.. _usage_label:

*****************
Using hipSOLVER
*****************

Once installed, hipSOLVER can be used just like any other library with a C API. The header file must be included
in the user code, and the shared library becomes both a link-time and a run-time dependency of the user application. The
user code can be ported, with no changes, to any system with hipSOLVER installed, regardless of the backend library.

For more details on how to use the API methods, see the code samples on
`hipSOLVER's GitHub page <https://github.com/ROCmSoftwarePlatform/hipSOLVER/tree/develop/clients/samples>`_, or
the documentation of the corresponding backend libraries.

.. toctree::
   :maxdepth: 4

.. contents:: Table of contents
   :local:
   :backlinks: top


.. _porting:

Porting cuSOLVER applications to hipSOLVER
============================================

hipSOLVER is designed to make it easy for users of cuSOLVER to port their existing applications to hipSOLVER, and provides two
separate but interchangeable APIs in order to facilitate a two-stage transition process. Users are encouraged to start with hipSOLVER's
:ref:`compatibility API <library_compat>`, which uses the `hipsolverDn` prefix and has method signatures that are fully consistent with
cusolverDn functions. However, the compatibility API may introduce some performance drawbacks, especially when using
the rocSOLVER backend. So, as a second stage, it is recommended to begin the switch to hipSOLVER's :ref:`regular API <library_api>`,
which uses the `hipsolver` prefix and introduces minor adjustments to the API in order to get the best performance out of the rocSOLVER
backend. In most cases, switching to the regular API is as simple as removing `Dn` from the `hipsolverDn` prefix.
(No matter which API is used, a hipSOLVER application can be executed, without modifications to the code, on systems with either cuSOLVER or
rocSOLVER installed. However, using the regular API ensures the best performance out of both backends.)


.. _compat_api_differences:

Some considerations when using the compatibility hipSOLVER API
===============================================================

hipSOLVER's compatibility API is intended as a 1:1 translation of the cuSOLVER API, but not all functionality is equally supported in
rocSOLVER. Keep in mind the following considerations when using the compatibility API.


Different minimum array lengths
--------------------------------

- Currently, the following methods require larger arrays than the minimum required by cuSOLVER.

  * :ref:`hipsolverDnXgesvdaStridedBatched <compat_gesvda_strided_batched>` requires `U` to be of length `ldu * min(m,n)` at
    minimum, and `S` to be of length `min(m,n)` at minimum.
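As a minimal sketch of the sizes stated above, the helper below computes the minimum lengths of `U` and `S`. The names
`GesvdaMinLengths` and `gesvda_min_lengths` are hypothetical and not part of hipSOLVER, and the sketch assumes the stated
lengths apply to each matrix in the batch, which the text above does not spell out.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper type (not part of hipSOLVER): minimum element counts
// for the U and S arrays of hipsolverDnXgesvdaStridedBatched.
struct GesvdaMinLengths
{
    std::size_t u; // ldu * min(m, n)
    std::size_t s; // min(m, n)
};

// Computes the minimum lengths quoted in the text from the matrix
// dimensions m, n and the leading dimension ldu of U.
constexpr GesvdaMinLengths gesvda_min_lengths(int m, int n, int ldu)
{
    return GesvdaMinLengths{
        static_cast<std::size_t>(ldu) * static_cast<std::size_t>(std::min(m, n)),
        static_cast<std::size_t>(std::min(m, n))};
}
```

For example, a 6-by-4 problem with `ldu = 8` gives a minimum of 32 elements for `U` and 4 elements for `S`.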


Arguments not referenced by rocSOLVER
--------------------------------------

- Unlike cuSOLVER, rocSOLVER functions do not provide information on invalid arguments in the `info` parameter, though they
  will provide info on singularities and algorithm convergence. Hence, when using the rocSOLVER backend, `info` will always
  return a value >= 0. In those cases where a rocSOLVER function does not accept `info` as an argument, hipSOLVER will
  set it to zero.

- The `niters` argument of :ref:`hipsolverDnXXgels <compat_gels>` and :ref:`hipsolverDnXXgesv <compat_gesv>` is not referenced
  by the rocSOLVER backend; there is no iterative refinement currently implemented in rocSOLVER.

- The `hRnrmF` argument of :ref:`hipsolverDnXgesvdaStridedBatched <compat_gesvda_strided_batched>` is not referenced by the
  rocSOLVER backend.
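The behavior of `info` described above can be summarized with a small classification helper, applied to a copy of `info`
that has already been transferred to the host. This is only an illustrative sketch; `InfoKind`, `classify_info`, and
`possible_with_rocsolver` are hypothetical names, not hipSOLVER symbols.

```cpp
// Hypothetical categories for a host copy of the `info` output value.
enum class InfoKind
{
    success,          // info == 0
    numerical_issue,  // info > 0: singularity or lack of convergence
    invalid_argument  // info < 0: only reported by the cuSOLVER backend
};

// Maps an `info` value to one of the categories above.
constexpr InfoKind classify_info(int info)
{
    return info == 0 ? InfoKind::success
           : info > 0 ? InfoKind::numerical_issue
                      : InfoKind::invalid_argument;
}

// With the rocSOLVER backend, `info` is never negative.
constexpr bool possible_with_rocsolver(int info)
{
    return info >= 0;
}
```

Code that relies on negative `info` values to detect bad arguments will therefore never take that path with the rocSOLVER
backend; checking the returned `hipsolverStatus_t` is the portable alternative.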


.. _porting_issues:

Possible performance implications of the compatibility API
------------------------------------------------------------

- To calculate the workspace required by function `gesvd` in rocSOLVER, the values of `jobu` and `jobv` are needed; however,
  the function :ref:`hipsolverDnXgesvd_bufferSize <compat_gesvd_bufferSize>` does not accept these arguments. So, when using
  the rocSOLVER backend, `hipsolverDnXgesvd_bufferSize` has to internally calculate the workspace for all possible values of `jobu` and `jobv`,
  and return the maximum.

  (As a result, `hipsolverDnXgesvd_bufferSize` is slower than `hipsolverXgesvd_bufferSize`, and its returned workspace size could be slightly larger than
  what is actually needed.)

- To properly use a user-provided workspace, rocSOLVER requires both the allocated pointer and its size. However, the function
  :ref:`hipsolverDnXgetrf <compat_getrf>` does not accept `lwork` as an argument. Consequently, when using the rocSOLVER backend,
  `hipsolverDnXgetrf` has to internally call `hipsolverDnXgetrf_bufferSize` to determine the size of the workspace.

  (In practice, `hipsolverDnXgetrf_bufferSize` will be called twice: once by the user before allocating the workspace, and once
  by hipSOLVER internally when executing the `hipsolverDnXgetrf` function. `hipsolverDnXgetrf` could be slightly slower than `hipsolverXgetrf`
  because of the extra call to the bufferSize helper.)

- The functions :ref:`hipsolverDnXgetrs <compat_getrs>`, :ref:`hipsolverDnXpotrs <compat_potrs>`, :ref:`hipsolverDnXpotrsBatched <compat_potrs_batched>`, and
  :ref:`hipsolverDnXpotrfBatched <compat_potrf_batched>` do not accept `work` and `lwork` as arguments. However, this functionality does require a non-zero workspace
  in rocSOLVER. As a result, when using the rocSOLVER backend, these functions will switch to the automatic workspace management model (see :ref:`here <mem_model>`).

  (Users must keep in mind that even if the compatibility API does not have bufferSize helpers for the mentioned functions, these functions do require
  workspace when using rocSOLVER, and it will be automatically managed. This may imply device memory reallocations with corresponding overheads).

- The function :ref:`hipsolverDnXgesvdaStridedBatched <compat_gesvda_strided_batched>` must apply a transpose operation to `V` in order to match the output of
  cuSOLVER, requiring an additional function call and extra workspace.
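The "query every combination and return the maximum" strategy described for `hipsolverDnXgesvd_bufferSize` can be sketched
as follows. This is not hipSOLVER's actual implementation: `max_workspace_over_jobs` is a hypothetical name, `fake_query` is
a stand-in for a real backend workspace query, and the option characters `'A'`, `'S'`, `'O'`, `'N'` are assumed to follow the
LAPACK-style `gesvd` convention, where `jobu` and `jobv` cannot both be `'O'`.

```cpp
#include <algorithm>
#include <cstddef>

// LAPACK-style job options assumed for both jobu and jobv.
constexpr char kJobOptions[] = {'A', 'S', 'O', 'N'};

// Returns the largest workspace size reported by `query(jobu, jobv)` over all
// valid combinations of job options. `query` is any callable returning the
// workspace size for one combination.
template <typename Query>
constexpr std::size_t max_workspace_over_jobs(Query query)
{
    std::size_t worst = 0;
    for(char jobu : kJobOptions)
    {
        for(char jobv : kJobOptions)
        {
            // LAPACK-style gesvd does not allow both outputs to overwrite A.
            if(jobu == 'O' && jobv == 'O')
                continue;
            worst = std::max(worst, query(jobu, jobv));
        }
    }
    return worst;
}

// Stand-in for a backend query (made-up sizes, for illustration only).
constexpr std::size_t fake_query(char jobu, char jobv)
{
    return (jobu == 'A' ? 100u : 10u) + (jobv == 'A' ? 50u : 5u);
}
```

The sketch makes the cost visible: one query per combination instead of a single query, and a result sized for the most
demanding combination even when the caller later uses a cheaper one.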


.. _api_differences:

Some considerations when using the regular hipSOLVER API
==========================================================

hipSOLVER's regular API is similar to cuSOLVER; however, due to differences in the implementation and design between
cuSOLVER and rocSOLVER, some minor adjustments were introduced to ensure the best performance out of both backends.


Different signatures and additional API methods
------------------------------------------------

- The methods to obtain the size of the workspace needed by functions `gels` and `gesv` in cuSOLVER require `dwork` as
  an argument; however, it is never used and can be null. On the rocSOLVER side, `dwork` is not needed to calculate the
  workspace size. In consequence:

  * :ref:`hipsolverXXgels_bufferSize <gels_bufferSize>` does not require `dwork` as an argument, and
  * :ref:`hipsolverXXgesv_bufferSize <gesv_bufferSize>` does not require `dwork` as an argument.

  (These wrappers pass `dwork = nullptr` when calling cuSOLVER).

- To calculate the workspace required by function `gesvd` in rocSOLVER, the values of `jobu` and `jobv` are needed. As a result,

  * :ref:`hipsolverXgesvd_bufferSize <gesvd_bufferSize>` requires `jobu` and `jobv` as arguments.

  (These arguments are ignored when the wrapper calls cuSOLVER, as they are not needed).

- To properly use a user-provided workspace, rocSOLVER requires both the allocated pointer and its size. Consequently:

  * :ref:`hipsolverXgetrf <getrf>` requires `lwork` as an argument.

  (`lwork` is ignored when the wrapper calls cuSOLVER, as it is not needed).

- All rocSOLVER functions called by hipSOLVER require a workspace. To allow the user to specify one,

  * :ref:`hipsolverXgetrs <getrs>` requires `work` and `lwork` as arguments,
  * :ref:`hipsolverXpotrfBatched <potrf_batched>` requires `work` and `lwork` as arguments,
  * :ref:`hipsolverXpotrs <potrs>` requires `work` and `lwork` as arguments, and
  * :ref:`hipsolverXpotrsBatched <potrs_batched>` requires `work` and `lwork` as arguments.

  (These arguments are ignored when these wrappers call cuSOLVER, as they are not needed).

  In order to support these changes, the regular API adds the following functions as well:

  * :ref:`hipsolverXgetrs_bufferSize <getrs_bufferSize>`
  * :ref:`hipsolverXpotrfBatched_bufferSize <potrf_batched_bufferSize>`
  * :ref:`hipsolverXpotrs_bufferSize <potrs_bufferSize>`
  * :ref:`hipsolverXpotrsBatched_bufferSize <potrs_batched_bufferSize>`

  (These methods return `lwork = 0` when using the cuSOLVER backend, as the corresponding functions
  in cuSOLVER do not need workspace).
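The following sketch puts the regular-API signatures described above together for a small LU solve: `hipsolverDgetrf`
receives `lwork`, and `hipsolverDgetrs` receives `work` and `lwork`, with the size obtained from
`hipsolverDgetrs_bufferSize`. The matrix values are illustrative, error checking is mostly omitted for brevity, and
allocating `sizeof(double) * lwork` bytes is an assumption chosen to be safe whether the backend reports the size in
elements or in bytes. It requires a system with HIP and hipSOLVER installed.

```cpp
#include <hip/hip_runtime.h>
#include <hipsolver/hipsolver.h>

#include <algorithm>
#include <cstdio>
#include <vector>

int main()
{
    const int n = 3, nrhs = 1, lda = n, ldb = n;

    // Column-major 3x3 system matrix and right-hand side (illustrative values).
    std::vector<double> hA = {4, 2, 1, 2, 5, 3, 1, 3, 6};
    std::vector<double> hB = {1, 2, 3};

    hipsolverHandle_t handle;
    if(hipsolverCreate(&handle) != HIPSOLVER_STATUS_SUCCESS)
        return 1;

    double *dA = nullptr, *dB = nullptr, *dWork = nullptr;
    int *   dIpiv = nullptr, *dInfo = nullptr;
    hipMalloc(&dA, sizeof(double) * hA.size());
    hipMalloc(&dB, sizeof(double) * hB.size());
    hipMalloc(&dIpiv, sizeof(int) * n);
    hipMalloc(&dInfo, sizeof(int));
    hipMemcpy(dA, hA.data(), sizeof(double) * hA.size(), hipMemcpyHostToDevice);
    hipMemcpy(dB, hB.data(), sizeof(double) * hB.size(), hipMemcpyHostToDevice);

    // Query both workspace sizes and share one buffer between the two calls.
    int lworkGetrf = 0, lworkGetrs = 0;
    hipsolverDgetrf_bufferSize(handle, n, n, dA, lda, &lworkGetrf);
    hipsolverDgetrs_bufferSize(
        handle, HIPSOLVER_OP_N, n, nrhs, dA, lda, dIpiv, dB, ldb, &lworkGetrs);
    const int lwork = std::max(lworkGetrf, lworkGetrs);
    if(lwork > 0)
        hipMalloc(&dWork, sizeof(double) * lwork);

    // Regular API: getrf takes lwork; getrs takes work and lwork.
    int hInfo = 0;
    hipsolverDgetrf(handle, n, n, dA, lda, dWork, lwork, dIpiv, dInfo);
    hipMemcpy(&hInfo, dInfo, sizeof(int), hipMemcpyDeviceToHost);
    if(hInfo == 0)
    {
        hipsolverDgetrs(
            handle, HIPSOLVER_OP_N, n, nrhs, dA, lda, dIpiv, dB, ldb, dWork, lwork, dInfo);
        hipMemcpy(hB.data(), dB, sizeof(double) * hB.size(), hipMemcpyDeviceToHost);
        for(double x : hB)
            std::printf("%f\n", x);
    }
    else
    {
        std::printf("getrf reported info = %d\n", hInfo);
    }

    hipFree(dWork);
    hipFree(dInfo);
    hipFree(dIpiv);
    hipFree(dB);
    hipFree(dA);
    hipsolverDestroy(handle);
    return hInfo == 0 ? 0 : 1;
}
```

With the cuSOLVER backend the `getrs` helper returns `lwork = 0`, so the shared buffer is sized by `getrf` alone and the
same source runs unchanged on both backends.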


Arguments not referenced by rocSOLVER
--------------------------------------

- Unlike cuSOLVER, rocSOLVER functions do not provide information on invalid arguments in the `info` parameter, though they
  will provide info on singularities and algorithm convergence. Hence, when using the rocSOLVER backend, `info` will always
  return a value >= 0. In those cases where a rocSOLVER function does not accept `info` as an argument, hipSOLVER will
  set it to zero.

- The `niters` argument of :ref:`hipsolverXXgels <gels>` and :ref:`hipsolverXXgesv <gesv>` is not referenced by the rocSOLVER
  backend; there is no iterative refinement currently implemented in rocSOLVER.


.. _mem_model:

Using rocSOLVER's memory model
---------------------------------

Most hipSOLVER functions take a workspace pointer and size as arguments, allowing the user to manage the device memory used
internally by the backends. rocSOLVER, however, can maintain the device workspace automatically by default
(see `rocSOLVER's memory model <https://rocsolver.readthedocs.io/en/master/userguide_memory.html>`_ for more details). In order to take
advantage of this feature, users may pass a null pointer for the `work` argument or a zero size for the `lwork` argument of any function
when using the rocSOLVER backend, and the workspace will be automatically managed behind the scenes. It is recommended, however, to use
a consistent strategy for workspace management, as performance issues may arise if the internal workspace is made to flip-flop between
user-provided and automatically allocated workspaces.

.. warning::
    This feature should not be used with the cuSOLVER backend; hipSOLVER does not guarantee a defined behavior when passing
    a null workspace to cuSOLVER functions that require one.
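As a brief sketch of the automatic model, the wrapper below calls `hipsolverDgetrf` with a null `work` pointer and a zero
`lwork`. The wrapper name `getrf_auto_workspace` is hypothetical, and the sketch is only meaningful with the rocSOLVER
backend, as the warning above explains.

```cpp
#include <hipsolver/hipsolver.h>

// Hypothetical wrapper: LU-factorize an n-by-n device matrix dA while letting
// rocSOLVER manage the device workspace (rocSOLVER backend only).
hipsolverStatus_t getrf_auto_workspace(
    hipsolverHandle_t handle, int n, double* dA, int lda, int* dIpiv, int* dInfo)
{
    // work = nullptr and lwork = 0 request automatic workspace management.
    return hipsolverDgetrf(handle, n, n, dA, lda, nullptr, 0, dIpiv, dInfo);
}
```

An application that adopts this pattern would ideally use it for every call, rather than mixing it with user-provided
workspaces, in line with the recommendation above.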


Using rocSOLVER's in-place functions
--------------------------------------

The solvers `gesv` and `gels` in cuSOLVER are out-of-place in the sense that the solution vectors `X` do not overwrite the
input matrix `B`. In rocSOLVER this is not the case; when `hipsolverXXgels` or `hipsolverXXgesv` call rocSOLVER, some data
movements must be done internally to restore `B` and copy the results back to `X`. These copies could introduce noticeable
overhead depending on the size of the matrices. To avoid this potential problem, users can pass `X = B` to `hipsolverXXgels`
or `hipsolverXXgesv` when using the rocSOLVER backend; in this case, no data movements will be required, and the solution
vectors can be retrieved using either `B` or `X`.

.. warning::
    This feature should not be used with the cuSOLVER backend; hipSOLVER does not guarantee a defined behavior when passing
    `X = B` to the mentioned functions in cuSOLVER.
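A sketch of the in-place pattern with `hipsolverDDgesv` is shown below: the same device pointer and leading dimension are
passed for both `B` and `X`. The wrapper name `gesv_in_place` is hypothetical, the sketch targets the rocSOLVER backend
only (see the warning above), and it assumes the `size_t`-based workspace size of the `gesv` helpers is expressed in bytes.

```cpp
#include <hip/hip_runtime.h>
#include <hipsolver/hipsolver.h>

#include <cstddef>

// Hypothetical wrapper: solve A * X = B on the device, overwriting dB with the
// solution by passing X = B (rocSOLVER backend only).
hipsolverStatus_t gesv_in_place(hipsolverHandle_t handle,
                                int               n,
                                int               nrhs,
                                double*           dA,
                                int               lda,
                                int*              dIpiv,
                                double*           dB,
                                int               ldb,
                                int*              dInfo)
{
    // Query the workspace size using the same aliasing as the solve itself.
    std::size_t       lwork = 0;
    hipsolverStatus_t status
        = hipsolverDDgesv_bufferSize(handle, n, nrhs, dA, lda, dIpiv, dB, ldb, dB, ldb, &lwork);
    if(status != HIPSOLVER_STATUS_SUCCESS)
        return status;

    void* dWork = nullptr;
    if(lwork > 0 && hipMalloc(&dWork, lwork) != hipSuccess)
        return HIPSOLVER_STATUS_ALLOC_FAILED;

    // niters is not referenced by the rocSOLVER backend, but must be passed.
    int niters = 0;
    status     = hipsolverDDgesv(
        handle, n, nrhs, dA, lda, dIpiv, dB, ldb, dB, ldb, dWork, lwork, &niters, dInfo);

    // Make sure the solve has finished before releasing the workspace.
    hipDeviceSynchronize();
    hipFree(dWork);
    return status;
}
```

After the call returns, the solution vectors can be read from `dB`, and no internal copies between `B` and `X` are needed.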