
38.6 Large dense linear systems

This module solves large dense least squares systems X c = y in which the n-by-p matrix X has n >> p (i.e., many more rows than columns). Such a matrix is called a “tall skinny” matrix. For some applications the entire matrix may not fit in memory at once, which rules out the standard SVD approach. The algorithms in this module therefore let the user construct smaller blocks of the matrix X and accumulate those blocks into the larger system one at a time, so the entire matrix X never needs to be stored in memory. The large linear least squares routines support data weights and Tikhonov regularization, and are designed to minimize the residual

\chi^2 = || y - Xc ||_W^2 + \lambda^2 || L c ||^2

where y is the n-by-1 observation vector, X is the n-by-p design matrix, c is the p-by-1 solution vector, W = diag(w_1,...,w_n) is the data weighting matrix, L is an m-by-p regularization matrix, \lambda is a regularization parameter, and ||r||_W^2 = r^T W r. In the discussion that follows, we will assume that the system has been converted into Tikhonov standard form,

\chi^2 = || y~ - X~ c~ ||^2 + \lambda^2 || c~ ||^2

and we will drop the tilde characters from the various parameters. For a discussion of the transformation to standard form see Regularized regression.

The basic idea is to partition the matrix X and observation vector y as

[ X_1 ] c = [ y_1 ]
[ X_2 ]     [ y_2 ]
[ X_3 ]     [ y_3 ]
[ ... ]     [ ... ]
[ X_k ]     [ y_k ]

into k blocks, where each block (X_i,y_i) may have any number of rows, but each X_i has p columns. The sections below describe the methods available for solving this partitioned system. The functions are declared in the header file gsl_multilarge.h.
