1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176
|
--- cinterface.tex.orig Thu Mar 10 17:02:44 2005
+++ cinterface.tex Thu Mar 10 17:05:51 2005
@@ -1,3 +1,10 @@
+\documentclass{article}
+\oddsidemargin 0.25in
+\textwidth 6.0in
+\usepackage{fancyvrb}
+
+\begin{document}
+
%
% cinterface.tex
%
@@ -100,13 +107,13 @@
two dimensional arrays, {\tt ORDER}. The standard dictates the following
enumerated types:
-\begin{verbatim}
+\begin{Verbatim}[fontsize=\small,fontfamily=tt,fontshape=rm]
enum CBLAS_ORDER {CblasRowMajor=101, CblasColMajor=102};
enum CBLAS_TRANSPOSE {CblasNoTrans=111, CblasTrans=112, CblasConjTrans=113};
enum CBLAS_UPLO {CblasUpper=121, CblasLower=122};
enum CBLAS_DIAG {CblasNonUnit=131, CblasUnit=132};
enum CBLAS_SIDE {CblasLeft=141, CblasRight=142};
-\end{verbatim}
+\end{Verbatim}
\subsubsection{Considered methods}
{\it
@@ -227,22 +234,22 @@
{\it
\begin{quotation}
This is the area where use of a structure is most desired. Again, the most
-common suggestion is a structure such as
+common suggestion is a structure such as\\
\verb+struct NON_PORTABLE_COMPLEX {float r; float i;};+.
If one is willing to use this structure throughout one's code, then this
provides a natural and convenient mechanism. If, however, the programmer has
utilized a different structure for complex, this ease of use breaks down. Then,
something like the following code fragment is required:
-\begin{verbatim}
+\begin{Verbatim}[fontsize=\small,fontfamily=tt,fontshape=rm]
NON_PORTABLE_COMPLEX ctmp;
float cdot[2];
ctmp = cblas_cdotc(n, x, 1, y, 1);
cdot[0] = ctmp.r;
cdot[1] = ctmp.i;
-\end{verbatim}
-which is certainly much less convenient than:
+\end{Verbatim}
+which is certainly much less convenient than:\\
\verb+cblas_cdotc_sub(n, x, 1, y, 1, cdot)+.
It should also be noted that the primary reason for having a function instead
@@ -348,13 +355,13 @@
and column $j$,
we must allocate $m$ pointers, and assign them in a section of code such as:
-\begin{verbatim}
+\begin{Verbatim}[fontsize=\small,fontfamily=tt,fontshape=rm]
float **A, **subA;
subA = malloc(m*sizeof(float*));
for (k=0; k != m; k++) subA[k] = A[i+k] + j;
cblas_rout(... subA ...);
-\end{verbatim}
+\end{Verbatim}
The same operation must be done if we wish to use a row or column as a vector.
This is not only an inconvenience, but can add up to a non-negligible
@@ -381,35 +388,35 @@
\noindent
{\bf Example 1: making a library call with a C 2D array:}
-\begin{verbatim}
+\begin{Verbatim}[fontsize=\small,fontfamily=tt,fontshape=rm]
double A[50][25]; /* standard C 2D array */
cblas_rout(CblasRowMajor, ... &A[i][j], 25, ...);
-\end{verbatim}
-
+\end{Verbatim}
+
\noindent
{\bf Example 2: Legal use of pointer to pointer style programming and the CBLAS}
-\begin{verbatim}
+\begin{Verbatim}[fontsize=\small,fontfamily=tt,fontshape=rm]
double **A, *p;
- A = malloc(M);
+ A = malloc(M*sizeof(double *));
p = malloc(M*N*sizeof(double));
for (i=0; i < M; i++) A[i] = &p[i*N];
cblas_rout(CblasRowMajor, ... &A[i][j], N, ...);
-\end{verbatim}
+\end{Verbatim}
\noindent
{\bf Example 3: Illegal use of pointer to pointer style programming and the CBLAS}
-\begin{verbatim}
+\begin{Verbatim}[fontsize=\small,fontfamily=tt,fontshape=rm]
double **A, *p;
- A = malloc(M);
+ A = malloc(M*sizeof(double *));
p = malloc(M*N*sizeof(double));
for (i=0; i < M; i++) A[i] = malloc(N*sizeof(double));
cblas_rout(CblasRowMajor, ... &A[i][j], N, ...);
-\end{verbatim}
+\end{Verbatim}
Note that Example 3 is illegal because the rows of A have no guaranteed stride.
@@ -506,7 +513,7 @@
%{\footnotesize {\bf Current Status:}\\
%First vote taken on section~\ref{sec-cblash}, passed 9/0/1 with 16 eligible voters, 8/98.}\\
-\begin{verbatim}
+\begin{Verbatim}[fontsize=\small,fontfamily=tt,fontshape=rm]
#ifndef CBLAS_H
#define CBLAS_H
#include <stddef.h>
@@ -1072,7 +1079,7 @@
void *C, const int ldc);
#endif
-\end{verbatim}
+\end{Verbatim}
\subsection{Using Fortran 77 BLAS to support row-major BLAS operations}
\label{app-ArrayStore}
@@ -1541,7 +1548,7 @@
rows contain the subdiagonals.
{\samepage
-\begin{verbatim}
+\begin{Verbatim}[fontsize=\small,fontfamily=tt,fontshape=rm]
------ Super diagonal KU
----------- Super diagonal 2
------------ Super diagonal 1
@@ -1549,7 +1556,7 @@
------------ Sub diagonal 1
----------- Sub diagonal 2
------ Sub diagonal KL
-\end{verbatim}
+\end{Verbatim}
}
If we have a row-major storage, and thus a row-oriented algorithm, we will
@@ -1559,7 +1566,7 @@
to line up with the main diagonal along rows.
{\samepage
-\begin{verbatim}
+\begin{Verbatim}[fontsize=\small,fontfamily=tt,fontshape=rm]
KL D KU
| | | |
| | | | |
@@ -1567,7 +1574,7 @@
| | | | | |
| | | | |
| | | |
-\end{verbatim}
+\end{Verbatim}
}
Now, let us contrast these two storage schemes. Both store
@@ -1675,3 +1682,4 @@
With these ideas in mind, the analysis for the dense routines may be used
unchanged for packed.
\clearpage
+\end{document}
|