File: drivers.tex

package info (click to toggle)
spooles 2.2-16
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid
  • size: 19,760 kB
  • sloc: ansic: 146,836; sh: 7,571; csh: 3,615; makefile: 1,970; perl: 74
file content (440 lines) | stat: -rw-r--r-- 15,358 bytes parent folder | download | duplicates (7)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
\par
\section{Driver programs for the multithreaded functions}
\label{section:MT:drivers}
\par
%=======================================================================
\begin{enumerate}
%-----------------------------------------------------------------------
\item
\begin{verbatim}
allInOneMT msglvl msgFile type symmetryflag pivotingflag 
           matrixFileName rhsFileName seed nthread
\end{verbatim}
This driver program reads in a matrix $A$ and right hand side $B$,
generates the graph for $A$ and orders the matrix,
factors $A$ and solves the linear system $AX = B$ for $X$
using multithreaded factors and solves.
Use the script file {\tt do\_gridMT} for testing.
\par
\begin{itemize}
\item
The {\tt msglvl} parameter determines the amount of output.
Use {\tt msglvl = 1} for just timing output.
\item
The {\tt msgFile} parameter determines the message file --- if {\tt
msgFile} is {\tt stdout}, then the message file is {\it stdout},
otherwise a file is opened with {\it append} status to receive any
output data.
\item
The {\tt type} parameter specifies a real or complex linear system.
\begin{itemize}
\item
{\tt type = 1 (SPOOLES\_REAL)} for real,
\item
{\tt type = 2 (SPOOLES\_COMPLEX)} for complex.
\end{itemize}
\item
The {\tt symmetryflag} parameter specifies the symmetry of the matrix.
\begin{itemize}
\item
{\tt type = 0 (SPOOLES\_SYMMETRIC)} for $A$ real or complex symmetric,
\item
{\tt type = 1 (SPOOLES\_HERMITIAN)} for $A$ complex Hermitian,
\item
{\tt type = 2 (SPOOLES\_NONSYMMETRIC)}
\end{itemize}
for $A$ real or complex nonsymmetric.
\item
The {\tt pivotingflag} parameter signals whether pivoting for
stability will be enabled or not.
\begin{itemize}
\item
If {\tt pivotingflag = 0 (SPOOLES\_NO\_PIVOTING)},
no pivoting will be done.
\item
If {\tt pivotingflag = 1 (SPOOLES\_PIVOTING)},
pivoting will be done to ensure that all
entries in $U$ and $L$ have magnitude less than {\tt tau}.
\end{itemize}
\item
The {\tt matrixFileName} parameter is the name of the files where
the matrix entries are read from.
The file has the following structure.
\begin{verbatim}
neqns neqns nent
irow jcol entry
...  ...  ...
\end{verbatim}
where {\tt neqns} is the global number of equations and {\tt nent}
is the number of entries in this file.
There follows {\tt nent} lines, each containing a row index, a
column index and one or two floating point numbers, one if real,
two if complex.
\item
The {\tt rhsFileName} parameter is the name of the files where
the right hand side entries are read from.
The file has the following structure.
\begin{verbatim}
nrow nrhs
irow entry ... entry
...  ...   ... ...
\end{verbatim}
where {\tt nrow} is the number of rows in this file
and {\tt nrhs} is the number of rigght and sides.
There follows {\tt nrow} lines, each containing a row index
and either {\tt nrhs} or {\tt 2*nrhs} floating point numbers, 
the first if real, the second if complex.
\item
The {\tt seed} parameter is a random number seed.
\item
The {\tt nthread} parameter is the number of threads.
\end{itemize}
%-----------------------------------------------------------------------
\item
\begin{verbatim}
patchAndGoMT msglvl msgFile type symmetryflag patchAndGoFlag fudge toosmall 
             storeids storevalues matrixFileName rhsFileName seed nthread
\end{verbatim}
This driver program is used to test the ``patch-and-go''
functionality for a factorization without pivoting.
When small diagonal pivot elements are found, 
one of three actions are taken.
See the {\tt PatchAndGoInfo} object for more information.
\par
The program reads in a matrix $A$ and right hand side $B$,
generates the graph for $A$ and orders the matrix,
factors $A$ and solves the linear system $AX = B$ for $X$
using multithreaded factors and solves.
Use the script file {\tt do\_patchAndGo} for testing.
\par
\begin{itemize}
\item
The {\tt msglvl} parameter determines the amount of output.
Use {\tt msglvl = 1} for just timing output.
\item
The {\tt msgFile} parameter determines the message file --- if {\tt
msgFile} is {\tt stdout}, then the message file is {\it stdout},
otherwise a file is opened with {\it append} status to receive any
output data.
\item
The {\tt type} parameter specifies a real or complex linear system.
\begin{itemize}
\item
{\tt type = 1 (SPOOLES\_REAL)} for real,
\item
{\tt type = 2 (SPOOLES\_COMPLEX)} for complex.
\end{itemize}
\item
The {\tt symmetryflag} parameter specifies the symmetry of the matrix.
\begin{itemize}
\item
{\tt type = 0 (SPOOLES\_SYMMETRIC)} for $A$ real or complex symmetric,
\item
{\tt type = 1 (SPOOLES\_HERMITIAN)} for $A$ complex Hermitian,
\item
{\tt type = 2 (SPOOLES\_NONSYMMETRIC)}
\end{itemize}
for $A$ real or complex nonsymmetric.
\item
The {\tt patchAndGoFlag} specifies the ``patch-and-go'' strategy.
\begin{itemize}
\item
{\tt patchAndGoFlag = 0} --- if a zero pivot is detected, stop
computing the factorization, set the error flag and return.
\item
{\tt patchAndGoFlag = 1} --- if a small or zero pivot is detected, 
set the diagonal entry to 1 and the offdiagonal entries to zero.
\item
{\tt patchAndGoFlag = 2} --- if a small or zero pivot is detected, 
perturb the diagonal entry.
\end{itemize}
\item
The {\tt fudge} parameter is used to perturb a diagonal entry.
\item
The {\tt toosmall} parameter is judge when a diagonal entry is small. 
\item
If {\tt storeids = 1}, then the locations where action was taken is
stored in an {\tt IV} object.
\item
If {\tt storevalues = 1}, then the perturbations are
stored in an {\tt DV} object.
\item
The {\tt matrixFileName} parameter is the name of the files where
the matrix entries are read from.
The file has the following structure.
\begin{verbatim}
neqns neqns nent
irow jcol entry
...  ...  ...
\end{verbatim}
where {\tt neqns} is the global number of equations and {\tt nent}
is the number of entries in this file.
There follows {\tt nent} lines, each containing a row index, a
column index and one or two floating point numbers, one if real,
two if complex.
\item
The {\tt rhsFileName} parameter is the name of the files where
the right hand side entries are read from.
The file has the following structure.
\begin{verbatim}
nrow nrhs
irow entry ... entry
...  ...   ... ...
\end{verbatim}
where {\tt nrow} is the number of rows in this file
and {\tt nrhs} is the number of rigght and sides.
There follows {\tt nrow} lines, each containing a row index
and either {\tt nrhs} or {\tt 2*nrhs} floating point numbers, 
the first if real, the second if complex.
\item
The {\tt seed} parameter is a random number seed.
\item
The {\tt nthread} parameter is the number of threads.
\end{itemize}
%-----------------------------------------------------------------------
\item
\begin{verbatim}
testMMM msglvl msgFile dataType symflag storageMode transpose
        nrow ncol nitem nrhs seed alphaReal alphaImag nthread
\end{verbatim}
This driver program generates $A$, a 
${\tt nrow} \times {\tt ncol}$
matrix using {\tt nitem} input entries, $X$ and $Y$,
${\tt nrow} \times {\tt nrhs}$ matrices,
is filled with random numbers.
It then computes $Y + \alpha*A*X$, $Y + \alpha*A^T*X$ or
$Y + \alpha*A^H*X$.
The program's output is a file which when sent into Matlab,
outputs the error in the computation.
\par
\begin{itemize}
\item
The {\tt msglvl} parameter determines the amount of output ---
taking {\tt msglvl >= 3} means the {\tt InpMtx} object is written
to the message file.
\item
The {\tt msgFile} parameter determines the message file --- if {\tt
msgFile} is {\tt stdout}, then the message file is {\it stdout},
otherwise a file is opened with {\it append} status to receive any
output data.
\item
{\tt dataType} is the type of entries,
{\tt 0} for real, {\tt 1} for complex.
\item
{\tt symflag} is the symmetry flag, {\tt 0} for symmetric, 
{\tt 1} for Hermitian, {\tt 2} for nonsymmetric.
\item
{\tt storageMode} is the storage mode for the entries, 
{\tt 1} for by rows, {\tt 2} for by columns, 
{\tt 3} for by chevrons. 
\item
{\tt transpose} determines the equation,
{\tt 0} for $Y + \alpha*A*X$, 
{\tt 1} for $Y + \alpha*A^T*X$ or
{\tt 2} for $Y + \alpha*A^H*X$.
\item
{\tt nrowA} is the number of rows in $A$
\item
{\tt ncolA} is the number of columns in $A$
\item
{\tt nitem} is the number of matrix entries that are 
assembled into the matrix.
\item
{\tt nrhs} is the number of columns in $X$ and $Y$.
\item
The {\tt seed} parameter is a random number seed used to fill the
matrix entries with random numbers.
\item
{\tt alphaReal} and {\tt alphaImag} form the scalar in the multiply.
\item
{\tt nthread} is the number of threads to use.
\end{itemize}
%-----------------------------------------------------------------------
\item
\begin{verbatim}
testGridMT msglvl msgFile n1 n2 n3 maxzeros maxsize seed type
         symmetryflag sparsityflag pivotingflag tau droptol
         nrhs nthread maptype cutoff lookahead
\end{verbatim}
This driver program tests the serial {\tt FrontMtx\_MT\_factor()}
and {\tt FrontMtx\_MT\_solve()} methods for the linear system $AX = B$.
The factorization and solve are done in parallel.
Use the script file {\tt do\_gridMT} for testing.
\par
\begin{itemize}
\item
The {\tt msglvl} parameter determines the amount of output.
Use {\tt msglvl = 1} for just timing output.
\item
The {\tt msgFile} parameter determines the message file --- if {\tt
msgFile} is {\tt stdout}, then the message file is {\it stdout},
otherwise a file is opened with {\it append} status to receive any
output data.
\item
{\tt n1} is the number of points in the first grid direction.
\item
{\tt n2} is the number of points in the second grid direction.
\item
{\tt n3} is the number of points in the third grid direction.
\item
{\tt maxzeros} is used to merge small fronts together into larger
fronts.
Look at the {\tt ETree} object for
the {\tt ETree\_mergeFronts\{One,All,Any\}()} methods.
\item
{\tt maxsize} is used to split large fronts into smaller
fronts.
See the {\tt ETree\_splitFronts()} method.
\item
The {\tt seed} parameter is a random number seed.
\item
The {\tt type} parameter specifies a real or complex linear system.
\begin{itemize}
\item
{\tt type = 1 (SPOOLES\_REAL)} for real,
\item
{\tt type = 2 (SPOOLES\_COMPLEX)} for complex.
\end{itemize}
\item
The {\tt symmetryflag} parameter specifies the symmetry of the matrix.
\begin{itemize}
\item
{\tt type = 0 (SPOOLES\_SYMMETRIC)} for $A$ real or complex symmetric,
\item
{\tt type = 1 (SPOOLES\_HERMITIAN)} for $A$ complex Hermitian,
\item
{\tt type = 2 (SPOOLES\_NONSYMMETRIC)}
\end{itemize}
for $A$ real or complex nonsymmetric.
\item
The {\tt sparsityflag} parameter signals a direct or approximate
factorization.
\begin{itemize}
\item
{\tt sparsityflag = 0 (FRONTMTX\_DENSE\_FRONTS)} implies a direct
factorization, the fronts will be stored as dense submatrices.
\item
{\tt sparsityflag = 1 (FRONTMTX\_SPARSE\_FRONTS)} implies an
approximate factorization.
The fronts will be stored as sparse submatrices, where
the entries in the triangular factors will be
subjected to a drop tolerance test --- if the magnitude of an entry
is {\tt droptol} or larger, it will be stored, otherwise it will be
dropped.
\end{itemize}
\item
The {\tt pivotingflag} parameter signals whether pivoting for
stability will be enabled or not.
\begin{itemize}
\item
If {\tt pivotingflag = 0 (SPOOLES\_NO\_PIVOTING)},
no pivoting will be done.
\item
If {\tt pivotingflag = 1 (SPOOLES\_PIVOTING)},
pivoting will be done to ensure that all
entries in $U$ and $L$ have magnitude less than {\tt tau}.
\end{itemize}
\item
The {\tt tau} parameter is an upper bound on the magnitude of the
entries in $L$ and $U$ when pivoting is enabled.
\item
The {\tt droptol} parameter is a lower bound on the magnitude of the
entries in $L$ and $U$ when the approximate factorization is enabled.
\item
The {\tt nrhs} parameter is the number of right hand sides to solve
as one block.
\item
The {\tt nthread} parameter is the number of threads.
\item
The {\tt maptype} parameter determines the type of map from fronts
to processes to be used during the factorization
\begin{itemize}
\item 1 -- wrap map
\item 2 -- balanced map
\item 3 -- subtree-subset map
\item 4 -- domain decomposition map
\item 5 -- improved domain decomposition map
\end{itemize}
See the {\tt ETree} methods for constructing maps.
\item
The {\tt cutoff} parameter is used for domain decomposition maps.
We try to construct domains (each domain is owned by a single
thread) that contain $0 \le {\tt cutoff} \le 1$ of the rows and
columns of the matrix.
Try to choose {\tt cutoff} to be {\tt 1/nthread}
or {\tt 1/(2*nthread)}.
\item
The {\tt lookahead} parameter controls the degree that a thread
will look past a stalled front in order to do some useful work.
{\t lookahead = 0} implies a thread will not look ahead, while
{\tt lookahead = k} implies a thread will look {\tt k} ancestors up
the front tree to find useful work.
Bewarned, while a thread is doing useful work further up the tree,
the stalled front may be ready, so large values of lookahead can be
detrimental to a fast computation.
In addition, a positive value of {\tt lookahead} means a larger
storage footprint taken by the factorization.
\end{itemize}
%-----------------------------------------------------------------------
\item
\begin{verbatim}
testQRgridMT msglvl msgFile n1 n2 n3 seed nrhs type
             nthread maptype cutoff
\end{verbatim}
This driver program tests the serial {\tt FrontMtx\_QR\_factor()}
and {\tt FrontMtx\_QR\_solve()} methods for the least squares problem
$\min_X \| F - A X \|_F$.
The factorization and solve are done in parallel.
\par
\begin{itemize}
\item
The {\tt msglvl} parameter determines the amount of output.
Use {\tt msglvl = 1} for just timing output.
\item
The {\tt msgFile} parameter determines the message file --- if {\tt
msgFile} is {\tt stdout}, then the message file is {\it stdout},
otherwise a file is opened with {\it append} status to receive any
output data.
\item
{\tt n1} is the number of points in the first grid direction.
\item
{\tt n2} is the number of points in the second grid direction.
\item
{\tt n3} is the number of points in the third grid direction.
\item
The {\tt seed} parameter is a random number seed.
\item
The {\tt nrhs} parameter is the number of right hand sides to solve
as one block.
\item
The {\tt type} parameter specifies a real or complex linear system.
\begin{itemize}
\item
{\tt type = 1 (SPOOLES\_REAL)} for real,
\item
{\tt type = 2 (SPOOLES\_COMPLEX)} for complex.
\end{itemize}
\item
The {\tt nthread} parameter is the number of threads.
\item
The {\tt maptype} parameter determines the type of map from fronts
to processes to be used during the factorization
\begin{itemize}
\item 1 -- wrap map
\item 2 -- balanced map
\item 3 -- subtree-subset map
\item 4 -- domain decomposition map
\item 5 -- improved domain decomposition map
\end{itemize}
See the {\tt ETree} methods for constructing maps.
\item
The {\tt cutoff} parameter is used for domain decomposition maps.
We try to construct domains (each domain is owned by a single
thread) that contain $0 \le {\tt cutoff} \le 1$ of the rows and
columns of the matrix.
Try to choose {\tt cutoff} to be {\tt 1/nthread}
or {\tt 1/(2*nthread)}.
\end{itemize}
%-----------------------------------------------------------------------
%-----------------------------------------------------------------------
\end{enumerate}