1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249
|
\vfill \eject
\par
\section{Serial Solution of $A X = Y$ using an $QR$ factorization}
\label{section:QR-serial}
\par
Let us review the steps is solving $A X = Y$ using a $QR$
factorization.
\begin{itemize}
\item
{\bf communicate} the data for the problem as $A$, $X$ and $Y$.
\item
{\bf reorder} as
${\widetilde A} {\widetilde X} = Y$, where
${\widetilde A} = A P^T$ and
${\widetilde X} = P X$.
and $P$ is a permutation matrix.
\item
{\bf factor} $ {\widetilde A} = Q R$,
where $Q$ is orthogonal and $R$ is upper triangular.
\item
{\bf solve} $R^T R (P X) = A^T Y$ (if real)
or {\bf solve} $R^H R (P X) = A^H Y$ (if complex).
\end{itemize}
\par
A complete listing of a sample program
is found in Section~\ref{section:QR-serial-driver}.
We will now begin to
work our way through the program to illustrate the use
of {\bf SPOOLES} to solve a system of linear equations.
\par
\subsection{Reading the input parameters}
\label{subsection:QR:input-data}
\par
The input parameters are identical to those of the serial $LU$
driver program described in
Section~\ref{subsection:serial:input-data}
with the exception that the {\tt symmetryflag} is not present.
\par
\subsection{Communicating the data for the problem}
\label{subsection:QR:communicating-data}
\par
This step is identical to the serial code, as described in
Section~\ref{subsection:serial:communicating-data}
\par
\subsection{Reordering the linear system}
\label{subsection:QR:reordering}
For the $LU$ factorization of $A$, we used the graph of $A + A^T$.
For the $QR$ factorization of $A$, we need the graph of $A^TA$.
The only difference between the two orderings is how we create
the {\tt IVL} object for the graph.
For the $QR$ factorization, we use
{\tt InpMtx\_adjForATA()}, as we see below.
\begin{verbatim}
adjIVL = InpMtx_adjForATA(mtxA) ;
nedges = IVL_tsize(adjIVL) ;
graph = Graph_new() ;
Graph_init2(graph, 0, neqns, 0, nedges, neqns, nedges, adjIVL,
NULL, NULL) ;
frontETree = orderViaMMD(graph, seed, msglvl, msgFile) ;
\end{verbatim}
The minimum degree method is the simplest of the ordering methods
provided in the {\bf SPOOLES} library.
For more information on ordering, please see the user document
{\it ``Ordering Sparse Matrices and Transforming Front Trees''}.
\par
\subsection{Non-numeric work}
\label{subsection:QR:non-numeric}
\par
The next phase is to obtain the permutation matrix $P$, (stored
implicitly in a permutation vector), and apply it to the matrix $A$.
This is done by the following code fragment.
\begin{verbatim}
oldToNewIV = ETree_oldToNewVtxPerm(frontETree) ;
oldToNew = IV_entries(oldToNewIV) ;
newToOldIV = ETree_newToOldVtxPerm(frontETree) ;
newToOld = IV_entries(newToOldIV) ;
InpMtx_permute(mtxA, NULL, oldToNew)) ;
InpMtx_changeStorageMode(mtxA, INPMTX_BY_VECTORS) ;
\end{verbatim}
The {\tt oldToNewIV} and {\tt newToOldIV} variables are {\tt IV}
objects that represent an integer vector.
The {\tt oldToNew} and {\tt newToOld} variables are pointers to
{\tt int}, which point to the base address of the {\tt int} vector
in an {\tt IV} object.
Once we have the permutation vector, we apply it to the front tree,
by the {\tt ETree\_permuteVertices()} method.
We need $A P^T$, so we permute the {\tt InpMtx}
object using a {\tt NULL} pointer for the row permutation (which
means do not permute the rows) and the {\tt oldToNew} vector for
the column permutation.
At this point the {\tt InpMtx} object holds $AP^T$ in the form
required by the factorization.
\par
The final steps are to compute the symbolic factorization,
which is stored in an {\tt IVL} object, and to permute the vertices
in the front tree.
The symbolic factorization differs slightly from the $LU$ case.
\begin{verbatim}
symbfacIVL = SymbFac_initFromGraph(frontETree, graph) ;
IVL_overwrite(symbfacIVL, oldToNewIV) ;
IVL_sortUp(symbfacIVL) ;
ETree_permuteVertices(frontETree, oldToNewIV) ;
\end{verbatim}
We do not have the $A^TA$ matrix object, so we constuct the
symbolic factorization using the front tree and the {\tt Graph} object.
Note, at this point in time, both the graph and front tree are in
terms of the original ordering, so after the {\tt IVL} object is
created, its vertices must be mapped into the new permutation and
sorted into ascending order.
Then the vertices in the front tree are mapped into the new ordering.
\par
\subsection{The Matrix Factorization}
\label{subsection:QR:factor}
\par
The numeric factorization step begins by initializing the {\tt
FrontMtx} object with the {\tt frontETree} and {\tt symbacIVL}
objects created in early steps.
The {\tt FrontMtx} object holds the actual factorization.
The code segment for the initialization is found below.
\begin{verbatim}
frontmtx = FrontMtx_new() ;
mtxmanager = SubMtxManager_new() ;
SubMtxManager_init(mtxmanager, NO_LOCK, 0) ;
if ( type == SPOOLES_REAL ) {
FrontMtx_init(frontmtx, frontETree, symbfacIVL, type,
SPOOLES_SYMMETRIC, FRONTMTX_DENSE_FRONTS,
SPOOLES_NO_PIVOTING, NO_LOCK, 0, NULL,
mtxmanager, msglvl, msgFile) ;
} else {
FrontMtx_init(frontmtx, frontETree, symbfacIVL, type,
SPOOLES_HERMITIAN, FRONTMTX_DENSE_FRONTS,
SPOOLES_NO_PIVOTING, NO_LOCK, 0, NULL,
mtxmanager, msglvl, msgFile) ;
}
\end{verbatim}
This differs little from the initialization in
Section~\ref{subsection:serial:factor}, except that the matrix type
is symmetric or Hermitian, and no pivoting is used for stability.
\par
The numeric factorization is performed by the
{\tt FrontMtx\_QR\_factor()} method.
The code segment from the sample program for the numerical
factorization step is found below.
\begin{verbatim}
chvmanager = ChvManager_new() ;
ChvManager_init(chvmanager, NO_LOCK, 1) ;
DVzero(10, cpus) ;
facops = 0.0 ;
FrontMtx_QR_factor(frontmtx, mtxA, chvmanager, cpus, &facops, msglvl, msgFile) ;
ChvManager_free(chvmanager) ;
\end{verbatim}
Working storage used during the factorization is found in the form
of block {\it chevrons}, in a {\tt Chv} object, which hold the partial
frontal matrix for a front.
Much as with the {\tt SubMtx} object, the {\tt FrontMtx} object does
not concern itself with managing working storage, instead it relies
on a {\tt ChvManager} object to manage the {\tt Chv} objects.
On return {\tt facops} contains the number of floating point
operations performed during the factorization.
\par
The factorization is performed using a one-dimensional
decomposition of the factor matrices.
Keeping the factor matrices in this form severely limits the amount
of parallelism for the forward and backsolves.
We perform a post-processing step to convert the one-dimensional
data structures to submatrices of a two-dimensional block
decomposition of the factor matrices.
The following code fragment performs this operation.
\begin{verbatim}
FrontMtx_postProcess(frontmtx, msglvl, msgFile) ;
\end{verbatim}
\par
\subsection{Solving the linear system}
\label{subsection:QR:solve}
\par
The following code fragment solves the linear system
$R^T R {\widehat X} = {\widehat A}^T Y$ if real
or
$R^H R {\widehat X} = {\widehat A}^H Y$ if complex.
\begin{verbatim}
mtxX = DenseMtx_new() ;
DenseMtx_init(mtxX, type, 0, 0, neqns, nrhs, 1, neqns) ;
FrontMtx_QR_solve(frontmtx, mtxA, mtxX, mtxB, mtxmanager,
cpus, msglvl, msgFile) ;
\end{verbatim}
Last, we permute the rows of ${\tt widehat X}$ back into $X$.
\begin{verbatim}
DenseMtx_permuteRows(mtxX, newToOldIV) ;
\end{verbatim}
\par
\subsection{Sample Matrix and Right Hand Side Files}
\label{subsection:QR:input-files}
\par
Immediately below are two sample files:
{\tt qr.matrix.input} holds the matrix input
and {\tt qr.rhs.input} holds the right hand side.
This simple example is an $8 \times 6$ matrix $A$
and a single right hand side.
The solution is the vector of all ones.
Note how the indices are zero-based as for C, instead of one-based
as for Fortran.
\begin{center}
\begin{tabular}{|l|}
\multicolumn{1}{c}{\tt matrix.input} \\ \hline
\begin{minipage}[t]{0.5 in}
\begin{verbatim}
8 6 18
0 1 1.0
0 3 2.0
1 2 3.0
1 3 1.0
1 5 1.0
2 0 1.0
2 2 2.0
3 0 3.0
3 2 4.0
3 4 2.0
4 3 1.0
5 1 2.0
5 4 3.0
5 5 1.0
6 0 2.0
6 3 3.0
7 1 1.0
7 4 3.0
\end{verbatim}
\end{minipage}
\\ \hline
\end{tabular}
\qquad
\begin{tabular}{|l|}
\multicolumn{1}{c}{\tt rhs.input} \\ \hline
\begin{minipage}[t]{0.5 in}
\begin{verbatim}
8 1
0 3.0
1 5.0
2 3.0
3 9.0
4 1.0
5 6.0
6 5.0
7 4.0
\end{verbatim}
\end{minipage}
\\ \hline
\end{tabular}
\end{center}
\par
|