File: setup.tex

package info (click to toggle)
spooles 2.2-16
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid
  • size: 19,760 kB
  • sloc: ansic: 146,836; sh: 7,571; csh: 3,615; makefile: 1,970; perl: 74
file content (416 lines) | stat: -rw-r--r-- 17,163 bytes parent folder | download | duplicates (7)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
\chapter{Setting up the linear system}
\label{section:setup-linear-system}
\par
Our typical user is interested in solving $A X = Y$,
where $A$ is square, large and sparse,
and $X$ and $Y$ are dense matrices with one or more columns.
{\bf SPOOLES} is a very large sophisticated library with 
a commensurate learning curve to master its functionality.
But what is the bare minimum a user has to know to obtain a
solution to their linear system?
\begin{itemize}
\item
They need to construct an {\tt InpMtx} object that holds
the entries of $A$. ({\tt InpMtx} stands for 
{\tt Inp}ut {\tt m}a{\tt t}ri{\tt x},
for it is an easy to use object that one uses to input, 
assemble, sort and manipulate entries in a sparse matrix.)
\item
They need to construct a {\tt DenseMtx} object that holds
the entries of $Y$.
\item
They need to construct a {\tt DenseMtx} object to hold
the entries of $X$.
\end{itemize}
These two objects encapsulate the minimal interface to the 
{\bf SPOOLES} library.
the application program needs to know how to construct
the {\tt InpMtx} and {\tt DenseMtx} objects, either directly inside
an application program, or by reading in a custom matrix file.
This is what we now describe.
\par
\section{Constructing an {\tt InpMtx} object}
\label{subsection:construct-InpMtx}
\par
The {\tt InpMtx} object is more of an ``Input'' object 
than a ``Matrix'' object.
It descended from an out-of-core assembly code that assembled and
sorted entries of a sparse matrix.
Simplicity and functionality are its goals, at some expense of 
efficiency in storage and computation.
{\it Note: all indices are zero-based as in C, not 1-based as in
FORTRAN.}
\par
The {\tt InpMtx} object is simplest understood as a ``bag'' of
triples $\langle r(i,j),c(i,j),a_{i,j}\rangle$, 
where $r()$ and $c()$ are some
functions that define the first and second coordinates.
Each {\tt InpMtx} object has a ``coordinate type'', one of
\begin{itemize}
\item
{\tt INPMTX\_BY\_ROWS}, where $r(i,j) = i$, $c(i,j) = j$.
\item
{\tt INPMTX\_BY\_COLUMNS}, where $r(i,j) = j$, $c(i,j) = i$.
\item
{\tt INPMTX\_BY\_CHEVRONS}, 
where $r(i,j) = \min(i,j)$, $c(i,j) = j - i$.
\end{itemize}
Rows and columns are self-explanatory, the first coordinate
$r(i,j)$ is either the row or column of $a_{i,j}$.
The $j$-th ``chevron'' is composed of the diagonal entry $a_{j,j}$,
entries in the $j$-th row of the upper triangle,
and entries in the $j$-th column of the lower triangle.
It is the natural data structure for the assembly of the matrix
entries into the ``fronts'' used to factor the matrix.
\par
% ``Entries'' of the {\tt InpMtx} object can be one of three types.
The {\tt InpMtx} object can hold one of three types of entries as
``indices only'' (no entries are present),
real entries, or complex entries.
The type is specified by the {\tt inputMode}
parameter to the {\tt InpMtx\_init()} method.
\begin{itemize}
\item
{\tt INPMTX\_INDICES\_ONLY} where the triples
$langle r(i,j),c(i,j),-\rangle$ are really only pairs,
i.e., no numerical values are present.
This mode is useful for assembling graphs.
\item
{\tt SPOOLES\_REAL} where $a_{i,j}$ is a real number,
a {\tt double} value.
\item
{\tt SPOOLES\_COMPLEX} where $a_{i,j}$ is a complex number,
really two consecutive {\tt double} values.
\end{itemize}
``Coodinate type'' and ``input mode'' (equivalently, the type of
entries) are the two parameters that must be specified when
initializing an {\tt InpMtx} object.
\begin{verbatim}
InpMtx   *mtxA = InpMtx_new() ;
InpMtx_init(mtxA, coordType, inputMode, 0, 0) ;
\end{verbatim}
Every object in the {\bf SPOOLES} library is initialized
via an {\tt {\it ObjectName}\_new()} method, which allocates space
for the object and sets its fields to default values.
If you wish to use an {\it automatic} variable, then one must
explicitly set the default fields, as follows.
\begin{verbatim}
InpMtx   mtxA ;
InpMtx_setDefaultFields(&mtxA) ;
InpMtx_init(&mtxA, coordType, inputMode, 0, 0) ;
\end{verbatim}
Only the coordinate type and input mode are necessary.
The fourth and fifth arguments are upper bounds on the number of
entries and vectors for the object. (More on vectors in just a
moment.) The user does not need to know values for the number of
entries or vectors, for the object resizes itself as necessary 
as information is placed into it.
\par
``Vectors'' is one way that the entries can be stored.
There are actually three ways, specified by the 
{\tt storageMode} field of the {\tt InpMtx} object.
\begin{itemize}
\item
{\tt INPMTX\_RAW\_DATA}, where the pairs or triples are stored in
unordered form.
\item
{\tt INPMTX\_SORTED}, where the pairs or triples are stored in
ascending lexicographic order of the first two coordinates.
\item
{\tt INPMTX\_BY\_VECTORS}, where the pairs or triples are sorted
and stored in vectors defined by their first coordinate.
\end{itemize}
The storage mode can be changed via a call to
{\tt InpMtx\_changeStorageMode()}.
\par
The user does not really need to know about this 
``storage mode''. Fill the {\tt InpMtx} object with data in any way
at all (we will describe this shortly). 
The wrapper method will check that the data is in the form it needs. 
If is isn't, the object will be transformed as necessary.
The ``sort'' operation is really ``sort-and-compress'', the pairs
or triples are sorted into ascending order, and then the list is
scanned duplicates are ``merged'' together, i.e., if real or
complex entries are present, they are added together.
(This allows us to assemble finite element matrices.)
The knowledgeable user can change the storage mode as necessary,
and thus avoiding expensive sorts when possible.
For example, after reading in the matrix data from the CSAR-Nastran 
file,
the entries are already in sorted form, and the explicit sort
can be avoided.
\par
Now let us see how we ``input'' information into the {\tt InpMtx}
object.
There are several input methods, e.g., single entries, rows,
columns, and submatrices, and each input method has three types of
input, e.g, indices only, real entries, or complex entries.
Here are the prototypes below.
\begin{itemize}
\item Input methods for ``indices only'' mode.
\begin{verbatim}
void InpMtx_inputEntry ( InpMtx *mtxA, int row, int col ) ;
void InpMtx_inputRow ( InpMtx *mtxA, int row, int rowsize, int rowind[] ) ;
void InpMtx_inputColumn ( InpMtx *mtxA, int col, int colsize, int colind[] ) ;
void InpMtx_inputMatrix ( InpMtx *mtxA, int nrow, int ncol, int rowstride, 
                          int colstride, int rowind[], colind[] ) ;
\end{verbatim}
\item Input methods for real entries.
\begin{verbatim}
void InpMtx_inputRealEntry ( InpMtx *mtxA, int row, int col, double value ) ;
void InpMtx_inputRealRow ( InpMtx *mtxA, int row, int rowsize, 
                           int rowind[], double rowent[] ) ;
void InpMtx_inputRealColumn ( InpMtx *mtxA, int col, int colsize, 
                              int colind[], double colent[] ) ;
void InpMtx_inputRealMatrix ( InpMtx *mtxA, int nrow, int ncol, int rowstride, 
                    int colstride, int rowind[], colind[], double mtxent[] ) ;
\end{verbatim}
\item Input methods for complex entries.
\begin{verbatim}
void InpMtx_inputComplexEntry ( InpMtx *mtxA, int row, int col, 
                                double real, double imag ) ;
void InpMtx_inputComplexRow ( InpMtx *mtxA, int row, int rowsize, 
                               int rowind[], double rowent[] ) ;
void InpMtx_inputComplexColumn ( InpMtx *mtxA, int col, int colsize, 
                                 int colind[], double colent[] ) ;
void InpMtx_inputComplexMatrix ( InpMtx *mtxA, int nrow, int ncol, int rowstride, 
              int colstride, int rowind[], colind[], double mtxent[] ) ;
\end{verbatim}
\end{itemize}
The {\tt rowind[]} row indices and {\tt colind[]} column indices 
are precisely that. 
Don't worry about what coordinate type the {\tt InpMtx} object has,
the translation from row and column indices into the particular
coordinate is done inside the input methods.
\par
Let us look at a particular example, where we have a 
${\tt n1} \times {\tt n2}$ grid and we want to have a
$\left \lbrack \begin{array}{ccc}
   & -1 &    \\
-1 &  4 & -1 \\
   & -1 & 
\end{array} \right \rbrack$ 
5-point operator at each grid point.
Note, this matrix is symmetric, so we need input only the upper
triangle (or the lower triangle) of the matrix.
\begin{verbatim}
mtxA = InpMtx_new() ;
InpMtx_init(mtxA, INPMTX_BY_ROWS, SPOOLES_REAL, 0, 0) ;
for ( ii = 0 ; ii < n1 ; ii++ ) {
   for ( jj = 0 ; jj < n2 ; jj++ ) {
      ij = ii + jj*n1 ;
      indices[0] = ij ;
      entries[0] = 4.0 ;
      count = 1 ;
      if ( ii < n1 ) {
         indices[count] = ij + 1 ;
         entries[count] = -1.0 ;
         count++ ;
      }
      if ( jj < n2 ) {
         indices[count] = ij + n1 ;
         entries[count] = -1.0 ;
         count++ ;
      }
      InpMtx_inputRealRow(mtxA, ij, count, indices, entries) ;
   }
}
InpMtx_changeStorageMode(mtxA, INPMTX_BY_VECTORS) ;
\end{verbatim}
The process begins by allocating an {\tt InpMtx} object {\tt mtxA}
using the {\tt InpMtx\_new()} method,
initializing it with the {\tt InpMtx\_init()} method, 
and filling it with matrix entries with the {\tt
InpMtx\_inputRealRow()} method.
The last method, {\tt InpMtx\_changeStorageMode()},
``assembles'' the data (not really necessary because the entries
are disjoint, 
``sorts'' the data (again not necessary since the entries were
input in ascending order,
and creates a vector structure inside the {\tt InpMtx} object that
allows easy access to each individual row.
\par
We could have input all the entries and treated it as a
nonsymmetric matrix, but that would not be efficient with respect
to storage or factorization cost.
Alternatively, we could have input all the entries and called the 
{\tt InpMtx\_dropLowerTriangle()} method to drop the lower
triangular entries.
\par
\section{Constructing an {\tt DenseMtx} object}
\label{subsection:construct-DenseMtx}
\par
The {\tt DenseMtx} stores a real or complex dense matrix.
It is not just an array of numbers, it also has row indices and
column indices.
This allows it to exist in a distributed MPI environment where each
processors has only a submatrix of the matrix.
Here is how to initialize a {\tt DenseMtx} object.
\begin{verbatim}
int        type, rowid, colid, nrow, ncol, inc1, inc2 ;
DenseMtx   *mtx = DenseMtx_new() ;
DenseMtx_init(mtx, type, rowid, colid, nrow, ncol, inc1, inc2) ;
\end{verbatim}
\begin{itemize}
\item
The {\tt type} is either {\tt SPOOLES\_REAL} 
or {\tt SPOOLES\_COMPLEX}.
\item
The {\tt rowid} and {\tt colid} values are used to identify a 
{\tt DenseMtx} as a submatrix of a larger matrix.
Any values are suitable.
\item
{\tt nrow} and {\tt ncol} are the number of rows and columns in the
matrix, respectively.
\item
The entries of the matrix can be stored in either row major or
column major form.
For row major, use {\tt inc1 = ncol} and {\tt inc2 = 1}.
For column major, use {\tt inc1 = 1} and {\tt inc2 = nrow}.
Note, all solve and matrix-matrix multiply methods require that the
{\tt DenseMtx} object be column major.
\end{itemize}
For example, here is the call to initialize a {\tt DenseMtx} object 
to have real entries, 100 rows and 5 columns, entries column major.
\begin{verbatim}
DenseMtx_init(mtx, SPOOLES_REAL, 0, 0, 100, 5, 1, 100) ;
\end{verbatim}
During the initialization, 
the row indices are set to $0, 1, \dots, \mathtt{nrow - 1}$
and the column indices are set to $0, 1, \dots, \mathtt{ncol - 1}$.
The entries are {\bf not} initialized.
Zero the entries with a call to {\tt DenseMtx\_zero()}.
(This is crucial when loading a sparse right hand side into
the {\tt DenseMtx} object.)
\par
Once we have the {\tt DenseMtx} object initialized, we want to be
able to access the row indices, the column indices and the entries.
We do this through instance methods.
\begin{verbatim}
void DenseMtx_rowIndices ( DenseMtx *mtx, int *pnrow, int *prowind ) ;
void DenseMtx_columnIndices ( DenseMtx *mtx, int *pncol, int *pcolind ) ;
double * DenseMtx_entries ( DenseMtx *mtx ) ;
\end{verbatim}
We would use them as follows.
\begin{verbatim}
double   *entries ;
int      ncol, nrow, *colind, *rowind ;

DenseMtx_rowIndices(mtx, &nrow, &rowind) ;
DenseMtx_columnIndices(mtx, &ncol, &colind) ;
entries = DenseMtx_entries(mtx) ;
\end{verbatim}
We can now fill the indices or the entries.
The location of the {\tt (irow,jcol)} entry is found
at {\tt offset = irow*inc1 + jcol*inc2}.
The row and column increments can be found as follows.
\begin{verbatim}
int inc1 = DenseMtx_rowIncrement(mtx) ;
int inc2 = DenseMtx_columnIncrement(mtx) ;
\end{verbatim}
\par
To avoid dealing with row and column increments, 
we can retrieve and set values of a particular entry.
\begin{verbatim}
double   value, real, imag ;
int      irow, jcol ;

DenseMtx_realEntry(mtx, irow, jcol, &value) ;
DenseMtx_complexEntry(mtx, irow, jcol, &real, &imag) ;
DenseMtx_setRealEntry(mtx, irow, jcol, value + 10.) ;
DenseMtx_setComplexEntry(mtx, irow, jcol, real + 1., imag + 2.) ;
\end{verbatim}
As a real example, consider the ${\tt n1} \times {\tt n2}$ grid
from the previous subsection, where we assembled a finite
difference matrix. 
Assume that the right hand side is zero except for points where
{\tt (n1-1,0:n2-1)}, where a unit load is applied.
Here is the code to generate the {\tt DenseMtx} object.
\begin{verbatim}
mtxY = DenseMtx_new();
DenseMtx_init(mtxY, SPOOLES_REAL, 0, 0, n1*n2, 1, 1, n1*n2) ;
DenseMtx_zero(mtxY) ;
ii = n1 - 1 ;
for ( jj = 0 ; jj < n2 ; jj++ ) {
   ij = ii + jj*n1 ;
   DenseMtx_setRealEntry(mtxY, ij, 1, 1.0) ;
}
\end{verbatim}
Do not forget to zero the entries in {\tt mtxY} before setting any
entries.
\par
\section{IO for the {\tt InpMtx} and {\tt DenseMtx} objects}
\label{subsection:IO}
\par
The three driver programs that we describe in the next sections
read $A$ and $Y$ from files and write $X$ to a file. 
So the first thing we know is that the {\tt InpMtx} and {\tt
DenseMtx} objects can read and write themselves from and to files.
This convention is supported by most of the objects in the 
{\bf SPOOLES} library. 
In fact, there is a common {\it protocol} that is followed.
Let us take a look at the common IO methods for the {\tt InpMtx}.
\begin{itemize}
\item
{\tt int InpMtx\_readFromFile ( InpMtx *obj, char *filename ) ;}
\item
{\tt int InpMtx\_readFromFormattedFile ( InpMtx *obj, FILE *fp ) ;}
\item
{\tt int InpMtx\_readFromBinaryFile ( InpMtx *obj, FILE *fp ) ;}
\item
{\tt int InpMtx\_writeToFile ( InpMtx *obj, char *filename ) ;}
\item
{\tt int InpMtx\_writeToFormattedFile ( InpMtx *obj, FILE *fp ) ;}
\item
{\tt int InpMtx\_writeToBinaryFile ( InpMtx *obj, FILE *fp ) ;}
\item
{\tt int InpMtx\_writeForHumanEye ( InpMtx *obj, FILE *fp ) ;}
\end{itemize}
There are corresponding methods for the {\tt DenseMtx} object,
just replace ``{\tt Inp}'' by ``{\tt Dense}'' in the above prototypes.
\par
Two methods take as input {\tt char *} file names. Each object can
be archived in its own file with a particular suffix.
For example, {\tt InpMtx} objects can be read from and written to
files of the form {\tt *.inpmtxf} for a formatted file and
{\tt *.inpmtxb} for a binary file.
For a {\tt DenseMtx} object, the file names are
{\tt *.densemtxf} and {\tt *.densemtxb}.
The {\tt InpMtx\_readFromFile()} method looks at the {\tt filename}
argument, and calls the binary or formatted read methods, depending on
the suffix of {\tt filename}. 
A normal return code is {\tt 1}.
If the suffix does not match either {\tt *.inpmtxf} or {\tt *.inpmtxb},
an error message is printed and the return code is {\tt 0}.
Something similar works for writing an {\tt InpMtx} object to a
file using {\tt InpMtx\_writeToFile()}, except if {\tt filename}'s
suffix does not match, the {\tt InpMtx\_writeForHumanEye()} method
is called.
\par
Here are three approaches to link $A$ and $Y$ from an application
code to the {\tt InpMtx} and {\tt DenseMtx} objects demanded by the
{\bf SPOOLES} application.
\begin{itemize}
\item
An application could take the simple approach of creating an {\tt
InpMtx} and {\tt DenseMtx} object to hold $A$ and $Y$, write them
to a file, and then call a totally separate code that functions
much like our drivers, reading in $A$ and $Y$, computing $X$ and
writing $X$ to a file, which is then read in by the application code.
\item
A second approach, one that was taken during the first integration 
of the {\bf SPOOLES} library into CSAR-Nastran, was to have the
CSAR-Nastran code generate two files for $A$ and $Y$ in CSAR-Nastran 
format.
(This way CSAR-Nastran did not need to know any of the {\bf SPOOLES}
interface.)
Two custom routines were written to read in the entries of $A$ and 
$Y$ from the CSAR-Nastran files and construct {\tt InpMtx} and {\tt
DenseMtx} objects.
The wrapper routines we describe in the next three chapters were
called to solve for $X$ which was then written to a CSAR-Nastran file.
\item
A third approach would be to generate the {\tt InpMtx} and {\tt
DenseMtx} objects in the application program, and then call the
wrapper methods to solve for $X$, i.e., no IO would be necessary.
\end{itemize}