File: Design-issues.Rnw

package info (click to toggle)
rmatrix 1.7-4-1
links: PTS, VCS
area: main
in suites: forky, sid
size: 12,096 kB
sloc: ansic: 97,203; makefile: 280; sh: 165
file content (237 lines) | stat: -rw-r--r-- 8,678 bytes
parent folder | download | duplicates (4)
\documentclass{article}
%
\usepackage{myVignette}
\usepackage[authoryear,round]{natbib}
\bibliographystyle{plainnat}
\newcommand{\noFootnote}[1]{{\small (\textit{#1})}}
\newcommand{\myOp}[1]{{$\left\langle\ensuremath{#1}\right\rangle$}}
%%                    vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
%%\VignetteIndexEntry{Design Issues in Matrix package Development}
%%\VignetteDepends{Matrix,utils}
\SweaveOpts{engine=R,eps=FALSE,pdf=TRUE,width=5,height=3,strip.white=true,keep.source=TRUE}
%								          ^^^^^^^^^^^^^^^^
\title{Design Issues in Matrix package Development}
\author{Martin Maechler and Douglas Bates\\R Core Development Team
  \\\email{maechler@stat.math.ethz.ch}, \email{bates@r-project.org}}
\date{Spring 2008; Aug~2022 ({\tiny typeset on \tiny\today})}
%
\begin{document}
\maketitle
\begin{abstract}
This is a (\textbf{currently very incomplete}) write-up of the many smaller and
larger design decisions we have made in organizing functionalities in the
Matrix package.

Classes: There's a rich hierarchy of matrix classes, which you can
visualize as a set of trees whose inner (and ``upper'') nodes are
\emph{virtual} classes and only the leaves are non-virtual ``actual'' classes.

Functions and Methods:

- setAs()

- others

\end{abstract}
%% Note: These are explained in '?RweaveLatex' :
<<preliminaries, echo=FALSE>>=
options(width=75)
library(utils) # for R_DEFAULT_PACKAGES=NULL
library(Matrix)
@

\section{The Matrix class structures}
\label{sec:classes}

Take Martin's DSC 2007 talk to depict the Matrix class hierarchy;
available from {\small
  \url{https://stat.ethz.ch/~maechler/R/DSC-2007_MatrixClassHierarchies.pdf}} .
% ~/R/Meetings-Kurse-etc/2007-DSC/talk.tex Matrix-classes.Rnw

 --- --- --- %% \hrule[1pt]{\textwidth}

From far, there are \textbf{three} separate class hierarchies, and every \pkg{Matrix} package
matrix has an actual (or ``factual'') class inside these three hierarchies:
% ~/R/Meetings-Kurse-etc/2007-DSC/Matrix-classes.Rnw
More formally, we have three (\ 3 \ ) main ``class classifications'' for our Matrices, i.e.,\\
three ``orthogonal'' partitions of  ``Matrix space'', and every Matrix
object's class corresponds to an \emph{intersection} of these three partitions;
i.e., in R's S4 class system: We have three independent inheritance
schemes for every Matrix, and each such Matrix class is simply defined to
\texttt{contain} three \emph{virtual} classes (one from each partitioning
scheme), e.g,

The three partioning schemes are
\begin{enumerate}
\item Content \texttt{type}: Classes \code{dMatrix}, \code{lMatrix},
  \code{nMatrix},
  (\code{iMatrix}, \code{zMatrix}) for entries of type \textbf{d}ouble,
  \textbf{l}ogical, patter\textbf{n} (and not yet \textbf{i}nteger and
  complex) Matrices.

  \code{nMatrix} only stores the
  \emph{location} of non-zero matrix entries (where as logical Matrices
  can also have \code{NA} entries!)

\item structure: general, triangular, symmetric, diagonal Matrices

\item sparsity: \code{denseMatrix}, \code{sparseMatrix}

\end{enumerate}

For example in the most used sparseMatrix class, \code{"dgCMatrix"},
the three initial letters \code{dgC} each codes for one of the three hierarchies:
\begin{description}
\item{d: } \textbf{d}ouble
\item{g: } \textbf{g}eneral
\item{C: } \textbf{C}sparseMatrix, where \textbf{C} is for \textbf{C}olumn-compressed.
\end{description}
Part of this is visible from printing \code{getClass("\emph{<classname>}")}:
<<dgC-ex>>=
getClass("dgCMatrix")
@

Another example is the \code{"nsTMatrix"} class, where \code{nsT} stands for
\begin{description}
\item{n: } \textbf{n} is for ``patter\textbf{n}'', boolean content where
  only the \emph{locations} of the non-zeros need to be stored.
\item{t: } \textbf{t}riangular matrix; either \textbf{U}pper, or \textbf{L}ower.
\item{T: } \textbf{T}sparseMatrix, where \textbf{T} is for \textbf{T}riplet,
  the simplest but least efficient way to store a sparse matrix.
\end{description}
From R itself, via \code{getClass(.)}:
<<dgC-ex>>=
getClass("ntTMatrix")
@


\subsection{Diagonal Matrices}
\label{ssec:diagMat}
The class of diagonal matrices is worth mentioning for several reasons.
First, we have wanted such a class, because \emph{multiplication}
methods are particularly simple with diagonal matrices.
The typical constructor is \Rfun{Diagonal} whereas the accessor
(as for traditional matrices), \Rfun{diag} simply returns the
\emph{vector} of diagonal entries:
<<diag-class>>=
(D4 <- Diagonal(4, 10*(1:4)))
str(D4)
diag(D4)
@
We can \emph{modify} the diagonal in the traditional way
(via method definition for \Rfun{diag<-}):
<<diag-2>>=
diag(D4) <- diag(D4) + 1:4
D4
@

Note that \textbf{unit-diagonal} matrices (the identity matrices of linear algebra)
with slot \code{diag = "U"} can have an empty \code{x} slot, very
analogously to the unit-diagonal triangular matrices:
<<unit-diag>>=
str(I3 <- Diagonal(3)) ## empty 'x' slot

getClass("diagonalMatrix") ## extending "sparseMatrix"
@
Originally, we had implemented diagonal matrices as \emph{dense} rather than sparse
matrices.  After several years it became clear that this had not been
helpful really both from a user and programmer point of view.
So now, indeed the \code{"diagonalMatrix"} class does also extend
\code{"sparseMatrix"}, i.e., is a subclass of it.
However, we do \emph{not} store explicitly
where the non-zero entries are, and the class does \emph{not} extend any of
the typical sparse matrix classes, \code{"CsparseMatrix"},
\code{"TsparseMatrix"}, or \code{"RsparseMatrix"}.
Rather, the \code{diag()}onal (vector) is the basic part of such a matrix,
and this is simply the \code{x} slot unless the \code{diag} slot is \code{"U"},
the unit-diagonal case, which is the identity matrix.

Further note, e.g., from the \code{?$\,$Diagonal} help page, that we provide
(low level) utility function
\code{.sparseDiagonal()} with wrappers
\code{.symDiagonal()} and \code{.trDiagonal()} which will provide diagonal
matrices inheriting from \code{"CsparseMatrix"} which may be advantageous
in \emph{some cases}, but less efficient in others, see the help page.


\section{Matrix Transformations}
\label{sec:trafos}

\subsection{Coercions between Matrix classes}
\label{ssec:coerce}

You may need to transform Matrix objects into specific shape (triangular,
symmetric), content type (double, logical, \dots) or storage structure
(dense or sparse).
Every useR should use \code{as(x, <superclass>)} to this end, where
\code{<superclass>} is a \emph{virtual} Matrix super class, such as
\code{"triangularMatrix"} \code{"dMatrix"}, or \code{"sparseMatrix"}.

In other words, the user should \emph{not} coerce directly to a specific
desired class such as \code{"dtCMatrix"}, even though that may
occasionally work as well.

Here is a set of rules to which the Matrix developers and the users
should typically adhere:
\begin{description}

\item[Rule~1]:  \code{as(M, "matrix")} should work for \textbf{all} Matrix
  objects \code{M}.

\item[Rule~2]:  \code{Matrix(x)} should also work for matrix like
objects \code{x} and always return a ``classed'' Matrix.

Applied to a \code{"matrix"} object \code{m}, \code{M. <- Matrix(m)} can be
considered a kind of inverse of \code{m <- as(M, "matrix")}.
For sparse matrices however, \code{M.} well be a
\code{CsparseMatrix}, and it is often ``more structured'' than \code{M},
e.g.,
<<Matrix-ex>>=
(M <- spMatrix(4,4, i=1:4, j=c(3:1,4), x=c(4,1,4,8))) # dgTMatrix
m <- as(M, "matrix")
(M. <- Matrix(m)) # dsCMatrix (i.e. *symmetric*)
@


\item[Rule~3]: All the following coercions to \emph{virtual} matrix
  classes should work:\\
  \begin{enumerate}
  \item \code{as(m, "dMatrix")}
  \item \code{as(m, "lMatrix")}
  \item \code{as(m, "nMatrix")}

  \item \code{as(m, "denseMatrix")}
  \item \code{as(m, "sparseMatrix")}

  \item \code{as(m, "generalMatrix")}
  \end{enumerate}
  whereas the next ones should work under some assumptions:

  \begin{enumerate}
  \item \code{as(m1, "triangularMatrix")} \\
       should work when \code{m1} is a triangular matrix, i.e. the upper or
       lower triangle of \code{m1} contains only zeros.

  \item \code{as(m2, "symmetricMatrix")}
       should work when \code{m2} is a symmetric matrix in the sense of
       \code{isSymmetric(m2)} returning \code{TRUE}.
       Note that this is typically equivalent to something like
       \code{isTRUE(all.equal(m2, t(m2)))}, i.e., the lower and upper
       triangle of the matrix have to be equal \emph{up to small
       numeric fuzz}.
  \end{enumerate}

\end{description}



\section{Session Info}

<<sessionInfo, results=tex>>=
toLatex(sessionInfo())
@

%not yet
%\bibliography{Matrix}

\end{document}