1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464
|
/**
********************************************************************************
@page MAGMA-sparse Sparse Overview
The MAGMA-sparse Package
=================================
The MAGMA-sparse package in the MAGMA software stack contains
sparse BLAS routines as well as functions to handle the complete iterative
solution process of a sparse linear system of equations.
A typical application includes:
- an interface passing the linear system to MAGMA
- choosing the desired data structures for the respective sparse BLAS functions
- sending the data structures to the device
- choosing solver, eigensolver, and preconditioner
- solving the respective system on the device
- passing back the results
For each of these steps, multiple options are offered by the MAGMA software
stack.
To start, in C or C++, include the magma_v2.h and magmasparse.h headers.
#include <magma_v2.h>
#include <magmasparse.h>
Sparse Data Structures {#datastructures}
=================================
For a more generic programming approach, the sparse data structures
(matrices and vectors) are stored in data structures containing all information
necessary to access the sparse-BLAS via wrappers:
struct magma_z_matrix
{
magma_storage_t storage_type; // matrix format - CSR, ELL, SELL-P
magma_location_t memory_location; // CPU or DEV
magma_symmetry_t sym; // opt: indicate symmetry
magma_diagorder_t diagorder_type; // opt: only needed for factorization matrices
magma_uplo_t fill_mode; // fill mode full/lower/upper
magma_int_t num_rows; // number of rows
magma_int_t num_cols; // number of columns
magma_int_t nnz; // opt: number of nonzeros
magma_int_t max_nnz_row; // opt: max number of nonzeros in one row
magma_int_t diameter; // opt: max distance of entry from main diagonal
union {
magmaDoubleComplex *val; // array containing values in CPU case
magmaDoubleComplex_ptr dval; // array containing values in DEV case
};
union {
magmaDoubleComplex *diag; // opt: diagonal entries in CPU case
magmaDoubleComplex_ptr ddiag; // opt: diagonal entries in DEV case
};
union {
magma_index_t *row; // opt: row pointer CPU case
magmaIndex_ptr drow; // opt: row pointer DEV case
};
union {
magma_index_t *rowidx; // opt: array containing row indices CPU case
magmaIndex_ptr drowidx; // opt: array containing row indices DEV case
};
union {
magma_index_t *col; // opt: array containing col indices CPU case
magmaIndex_ptr dcol; // opt: array containing col indices DEV case
};
union {
magma_index_t *list; // opt: linked list pointing to next element
magmaIndex_ptr dlist; // opt: linked list pointing to next element
};
magma_index_t *blockinfo; // opt: for BCSR format CPU case
magma_int_t blocksize; // opt: info for SELL-P/BCSR
magma_int_t numblocks; // opt: info for SELL-P/BCSR
magma_int_t alignment; // opt: info for SELL-P/BCSR
magma_order_t major; // opt: row/col major for dense matrices
magma_int_t ld; // opt: leading dimension for dense
} magma_z_matrix;
The purpose of the unions (e.g. for the val array) is to support different
hardware platforms, where the magmaXXX_ptr is adapted to the respective
device characteristics.
For sparse matrices, the main formats are CSR, ELL and the MAGMA-specific SELL-P.
Generally, the sparse-BLAS routines provide for these the best performance.
Without specifying the storage type, the memory location or the dimension of the
matrix, the sparse matrix vector product can then be used via the wrapper:
magma_z_spmv(
magmaDoubleComplex alpha,
magma_z_matrix A,
magma_z_matrix x,
magmaDoubleComplex beta,
magma_z_matrix y,
magma_queue_t queue );
Sparse I/O {#io}
=================================
A sparse matrix stored in mtx format can be read from disk via the function:
magma_z_csr_mtx(
magma_z_matrix *A,
const char *filename,
magma_queue_t queue );
A sparse linear system of dimension mxn present in main memory can be
passed to MAGMA via the function:
magma_zcsrset(
magma_int_t m,
magma_int_t n,
magma_index_t *row,
magma_index_t *col,
magmaDoubleComplex *val,
magma_z_matrix *A,
magma_queue_t queue );
where row, col, val contain the matrix in CSR format.
If the matrix is already present in DEV memory, the corresponding function is
magma_zcsrset_gpu(
magma_int_t m,
magma_int_t n,
magmaIndex_ptr row,
magmaIndex_ptr col,
magmaDoubleComplex_ptr val,
magma_z_matrix *A,
magma_queue_t queue );
Similarly, matrices handled in MAGMA can be returned via the functions
magma_zcsrget(
magma_z_matrix A,
magma_int_t *m,
magma_int_t *n,
magma_index_t **row,
magma_index_t **col,
magmaDoubleComplex **val,
magma_queue_t queue );
magma_zcsrget_gpu(
magma_z_matrix A,
magma_int_t *m,
magma_int_t *n,
magmaIndex_ptr *row,
magmaIndex_ptr *col,
magmaDoubleComplex_ptr *val,
magma_queue_t queue );
respectively
write_z_csrtomtx(
magma_z_matrix A,
const char *filename,
magma_queue_t queue );
Additionally, MAGMA contains routines to generate stencil discretizations
of different kind.
Vectors are handled as dense matrices (Magma_DENSE) and
can be initialized inside MAGMA via
magma_z_vinit(
magma_z_matrix *x,
magma_location_t memory_location,
magma_int_t num_rows,
magma_int_t num_cols,
magmaDoubleComplex values,
magma_queue_t queue );
where ''memory_location'' sets the location (Magma_CPU or Magma_DEV).
Also, vectors can be read from file via
magma_z_vread(
magma_z_matrix *x,
magma_int_t length,
char * filename,
magma_queue_t queue );
or - in case of a block of sparse vectors stored as CSR matrix - via
magma_z_vspread(
magma_z_matrix *x,
const char * filename,
magma_queue_t queue );
or passed from/to main memory:
magma_zvset(
magma_int_t m,
magma_int_t n,
magmaDoubleComplex *val,
magma_z_matrix *b,
magma_queue_t queue );
magma_zvget(
magma_z_matrix b
magma_int_t *m,
magma_int_t *n,
magmaDoubleComplex **val,
magma_queue_t queue );
Matrix Formats {#formats}
=================================
To convert a matrix from one into another format, the CPU-based routine
magma_z_mconvert(
magma_z_matrix A,
magma_z_matrix *B,
magma_storage_t old_format,
magma_storage_t new_format,
magma_queue_t queue );
can be used where old_format and new_format determine the specific conversion.
Memory Handling {#memorhandling}
=================================
All iterative solvers and eigensolvers included in the MAGMA-sparse package
work on the device. Hence, it is required to send the respective data structures
to the device for solving, and back to access the solution. The functions
magma_z_mtransfer(
magma_z_matrix A,
magma_z_matrix *B,
magma_location_t src,
magma_location_t dst,
magma_queue_t queue );
magma_z_vtransfer(
magma_z_matrix x,
magma_z_matrix *y,
magma_location_t src,
magma_location_t dst,
magma_queue_t queue );
allow any data copy operation - from host to device, device to device,
device to host or host to host.
Linear algebra objects can be deallocated via
magma_z_mfree(
magma_z_matrix *A,
magma_queue_t queue );
Iterative Solvers {#sparsesolvers}
=================================
The MAGMA-sparse package contains a variety of linear solvers, eigensolvers,
and preconditioners. The standard procedure to call a solver is to
pass the linear algebra objects (located on the device) and a structure called
magma_z_solver_par (respectively and magma_z_precond_par)
controlling the iterative solver and collecting information during the execution:
struct magma_z_solver_par
{
magma_solver_type solver; // solver type
magma_int_t version; // sometimes there are different versions
double atol; // absolute residual stopping criterion
double rtol; // relative residual stopping criterion
magma_int_t maxiter; // upper iteration limit
magma_int_t restart; // for GMRES
magma_ortho_t ortho; // for GMRES
magma_int_t numiter; // feedback: number of needed iterations
double init_res; // feedback: initial residual
double final_res; // feedback: final residual
double iter_res; // feedback: iteratively computed residual
real_Double_t runtime; // feedback: runtime needed
real_Double_t *res_vec; // feedback: array containing residuals
real_Double_t *timing; // feedback: detailed timing
magma_int_t verbose; // print residual ever 'verbose' iterations
magma_int_t num_eigenvalues; // number of EV for eigensolvers
magma_int_t ev_length; // needed for framework
double *eigenvalues; // feedback: array containing eigenvalues
magmaDoubleComplex_ptr eigenvectors; // feedback: array containing eigenvectors on DEV
magma_int_t info; // feedback: did the solver converge etc.
//---------------------------------
// the input for verbose is:
// 0 = production mode
// k>0 = convergence and timing is monitored in *res_vec and *timeing every
// k-th iteration
//
// the output of info is:
// 0 = convergence (stopping criterion met)
// -1 = no convergence
// -2 = convergence but stopping criterion not met within maxiter
//--------------------------------
} magma_z_solver_par;
These entities can either be initialized manually, or via the function
magma_zsolverinfo_init(
magma_z_solver_par *solver_par,
magma_z_preconditioner *precond_par,
magma_queue_t queue );
setting them to some default values.
For eigensolvers, the workspace needed for the eigenvectors has to be allocated
consistent with the matrix dimension, which requires additionally calling
magma_zeigensolverinfo_init(
magma_z_solver_par *solver_par,
magma_queue_t queue );
after setting solver_par.ev_length and solver_par.num_eigenvalues to the
correct numbers.
For the preconditioner configuration, a similar structure is used:
typedef struct magma_z_preconditioner
{
magma_solver_type solver;
magma_solver_type trisolver;
magma_int_t levels;
magma_int_t sweeps;
magma_int_t pattern;
magma_int_t bsize;
magma_int_t offset;
magma_precision format;
double atol;
double rtol;
magma_int_t maxiter;
magma_int_t restart;
magma_int_t numiter;
magma_int_t spmv_count;
double init_res;
double final_res;
real_Double_t runtime; // feedback: preconditioner runtime
real_Double_t setuptime; // feedback: preconditioner setup time
magma_z_matrix M;
magma_z_matrix L;
magma_z_matrix LT;
magma_z_matrix U;
magma_z_matrix UT;
magma_z_matrix LD;
magma_z_matrix UD;
magma_z_matrix d;
magma_z_matrix d2;
magma_z_matrix work1;
magma_z_matrix work2;
magma_int_t* int_array_1;
magma_int_t* int_array_2;
cusparseSolveAnalysisInfo_t cuinfo; // for cuSPARSE ILU
cusparseSolveAnalysisInfo_t cuinfoL; // for cuSPARSE ILU
cusparseSolveAnalysisInfo_t cuinfoLT; // for cuSPARSE ILU
cusparseSolveAnalysisInfo_t cuinfoU; // for cuSPARSE ILU
cusparseSolveAnalysisInfo_t cuinfoUT; // for cuSPARSE ILU
} magma_z_preconditioner;
An easy way to access the data collected during a solver execution is given
by the function
magma_zsolverinfo(
magma_z_solver_par *solver_par,
magma_z_preconditioner *precond_par,
magma_queue_t queue );
After completion,
magma_zsolverinfo_free(
magma_z_solver_par *solver_par,
magma_z_preconditioner *precond,
magma_queue_t queue );
deallocates all memory used within the solver and preconditioner structure.
A solver can be called via the wrapper
magma_z_solver(
magma_z_matrix A, magma_z_matrix b,
magma_z_matrix *x, magma_zopts *zopts,
magma_queue_t queue );
where zopts is a structure containing both, the solver and the preconditioner
information:
struct magma_zopts{
magma_z_solver_par solver_par;
magma_z_preconditioner precond_par;
magma_storage_t input_format;
int blocksize;
int alignment;
magma_storage_t output_format;
magma_location_t input_location;
magma_location_t output_location;
magma_scale_t scaling;
}magma_zopts;
All entities of this structure can be initialized from command line by calling
magma_zparse_opts(
int argc,
char** argv,
magma_zopts *opts,
int *matrices,
magma_queue_t queue );
Example {#sparseexample}
=================================
Especially when using MAGMA-sparse for the first time, the easiest way to get
familiar with the package is to use and modify one of the predefined testers.
In the following example we assume to have an application coded in C and
running in double precision that at some instance requires solving a linear
system of the form Ax=b, where A and b are generated within the application.
Furthermore, we assume that the matrix A is in CSR format and stored in
row/col/val, and the vector b is stored in valb. These entities are passed to
MAGMA-sparse. In the end, we want the solution passed back to the application in
the vector valx.
// initialize MAGMA-sparse and create some LA objects
magma_dopts dopts;
magma_queue_t queue;
magma_queue_create( 0, &queue );
magma_d_matrix A, A_d, x, x_d, b, b_d;
// pass linear system to MAGMA-sparse
magma_dcsrset( m, n, row, col, val, &A, queue );
magma_dvset( m, n, valb, &b, queue );
// copy the linear system to the device
magma_d_vtransfer( b, &b_d, Magma_CPU, Magma_DEV, queue );
magma_d_mtransfer( A, &A_d, Magma_CPU, Magma_DEV, queue );
// allocate solution vector - on device
magma_d_vinit( &x_d, Magma_DEV, A.num_cols, 1, one, queue );
// configure solver
dopts.solver_par.solver = Magma_PGMRES; dopts.solver_par.restart = 30;
dopts.solver_par.rtol = 1e-10;
magma_dsolverinfo_init( &dopts.solver_par, &dopts.precond_par, queue );
// configure the preconditioner
dopts.precond_par.solver = Magma_ILU; dopts.precond_par.levels = 0;
dopts.precond_par.trisolver = Magma_CUSOLVE;
magma_d_precondsetup( A, b, &dopts.solver_par, &dopts.precond_par, queue );
// solve the linear system
magma_d_solver( A_d, b_d, &x_d, &dopts, queue );
// copy the solution vector back to the host and pass it back to the application
magma_d_mtransfer( x_d, &x, Magma_DEV, Magma_CPU, queue );
magma_dvget( x, &m, &n, &valx, queue );
// clean up the memory
magma_dsolverinfo_free( &dopts.solver_par, &dopts.precond_par, queue );
magma_d_mfree(&A_d, queue );
magma_d_mfree(&A, queue );
magma_d_mfree(&x_d, queue );
magma_d_mfree(&b_d, queue );
magma_d_mfree(&b, queue );
// finalize MAGMA
magma_queue_destroy( queue );
magma_finalize();
*/
|