<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<!-- Copyright (C) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016 The GSL Team.

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with the
Invariant Sections being "GNU General Public License" and "Free Software
Needs Free Documentation", the Front-Cover text being "A GNU Manual",
and with the Back-Cover Text being (a) (see below). A copy of the
license is included in the section entitled "GNU Free Documentation
License".

(a) The Back-Cover Text is: "You have the freedom to copy and modify this
GNU Manual." -->
<!-- Created by GNU Texinfo 5.1, http://www.gnu.org/software/texinfo/ -->
<head>
<title>GNU Scientific Library &ndash; Reference Manual: Robust linear regression</title>

<meta name="description" content="GNU Scientific Library &ndash; Reference Manual: Robust linear regression">
<meta name="keywords" content="GNU Scientific Library &ndash; Reference Manual: Robust linear regression">
<meta name="resource-type" content="document">
<meta name="distribution" content="global">
<meta name="Generator" content="makeinfo">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<link href="index.html#Top" rel="start" title="Top">
<link href="Function-Index.html#Function-Index" rel="index" title="Function Index">
<link href="Least_002dSquares-Fitting.html#Least_002dSquares-Fitting" rel="up" title="Least-Squares Fitting">
<link href="Large-Dense-Linear-Systems.html#Large-Dense-Linear-Systems" rel="next" title="Large Dense Linear Systems">
<link href="Regularized-regression.html#Regularized-regression" rel="previous" title="Regularized regression">
<style type="text/css">
<!--
a.summary-letter {text-decoration: none}
blockquote.smallquotation {font-size: smaller}
div.display {margin-left: 3.2em}
div.example {margin-left: 3.2em}
div.indentedblock {margin-left: 3.2em}
div.lisp {margin-left: 3.2em}
div.smalldisplay {margin-left: 3.2em}
div.smallexample {margin-left: 3.2em}
div.smallindentedblock {margin-left: 3.2em; font-size: smaller}
div.smalllisp {margin-left: 3.2em}
kbd {font-style:oblique}
pre.display {font-family: inherit}
pre.format {font-family: inherit}
pre.menu-comment {font-family: serif}
pre.menu-preformatted {font-family: serif}
pre.smalldisplay {font-family: inherit; font-size: smaller}
pre.smallexample {font-size: smaller}
pre.smallformat {font-family: inherit; font-size: smaller}
pre.smalllisp {font-size: smaller}
span.nocodebreak {white-space:nowrap}
span.nolinebreak {white-space:nowrap}
span.roman {font-family:serif; font-weight:normal}
span.sansserif {font-family:sans-serif; font-weight:normal}
ul.no-bullet {list-style: none}
-->
</style>


</head>

<body lang="en" bgcolor="#FFFFFF" text="#000000" link="#0000FF" vlink="#800080" alink="#FF0000">
<a name="Robust-linear-regression"></a>
<div class="header">
<p>
Next: <a href="Large-Dense-Linear-Systems.html#Large-Dense-Linear-Systems" accesskey="n" rel="next">Large Dense Linear Systems</a>, Previous: <a href="Regularized-regression.html#Regularized-regression" accesskey="p" rel="previous">Regularized regression</a>, Up: <a href="Least_002dSquares-Fitting.html#Least_002dSquares-Fitting" accesskey="u" rel="up">Least-Squares Fitting</a> &nbsp; [<a href="Function-Index.html#Function-Index" title="Index" rel="index">Index</a>]</p>
</div>
<hr>
<a name="Robust-linear-regression-1"></a>
<h3 class="section">38.5 Robust linear regression</h3>
<a name="index-robust-regression"></a>
<a name="index-regression_002c-robust"></a>
<a name="index-least-squares_002c-robust"></a>

<p>Ordinary least squares (OLS) models are often heavily influenced by the presence of outliers.
Outliers are data points which do not follow the general trend of the other observations,
although there is strictly no precise definition of an outlier. Robust linear regression
refers to regression algorithms which are robust to outliers. The most common type of
robust regression is M-estimation. The general M-estimator minimizes the objective function
</p>
<div class="example">
<pre class="example">\sum_i \rho(e_i) = \sum_i \rho (y_i - Y(c, x_i))
</pre></div>

<p>where <em>e_i = y_i - Y(c, x_i)</em> is the residual of the ith data point, and
<em>\rho(e_i)</em> is a function which should have the following properties:
</p><ul class="no-bullet">
<li><!-- /@w --> <em>\rho(e) \ge 0</em>
</li><li><!-- /@w --> <em>\rho(0) = 0</em>
</li><li><!-- /@w --> <em>\rho(-e) = \rho(e)</em>
</li><li><!-- /@w --> <em>\rho(e_1) &gt; \rho(e_2)</em> for <em>|e_1| &gt; |e_2|</em>
</li></ul>
<p>The special case of ordinary least squares is given by <em>\rho(e_i) = e_i^2</em>.
Letting <em>\psi = \rho'</em> be the derivative of <em>\rho</em>, differentiating
the objective function with respect to the coefficients <em>c</em>
and setting the partial derivatives to zero produces the system of equations
</p>
<div class="example">
<pre class="example">\sum_i \psi(e_i) X_i = 0
</pre></div>

<p>where <em>X_i</em> is a vector containing row <em>i</em> of the design matrix <em>X</em>.
Next, we define a weight function <em>w(e) = \psi(e)/e</em>, and let
<em>w_i = w(e_i)</em>:
</p>
<div class="example">
<pre class="example">\sum_i w_i e_i X_i = 0
</pre></div>

<p>This system of equations is equivalent to solving a weighted ordinary least squares
problem, minimizing <em>\chi^2 = \sum_i w_i e_i^2</em>. The weights, however, depend
on the residuals <em>e_i</em>, which depend on the coefficients <em>c</em>, which in turn depend
on the weights. Therefore, an iterative solution is used, called Iteratively Reweighted
Least Squares (IRLS).
</p><ol>
<li> Compute initial estimates of the coefficients <em>c^{(0)}</em> using ordinary least squares

</li><li> For iteration <em>k</em>, form the residuals <em>e_i^{(k)} = (y_i - X_i c^{(k-1)})/(t \sigma^{(k)} \sqrt{1 - h_i})</em>,
where <em>t</em> is a tuning constant depending on the choice of <em>\psi</em>, and <em>h_i</em> are the
statistical leverages (diagonal elements of the matrix <em>X (X^T X)^{-1} X^T</em>). Including <em>t</em>
and <em>h_i</em> in the residual calculation has been shown to improve the convergence of the method.
The residual standard deviation is approximated as <em>\sigma^{(k)} = MAD / 0.6745</em>, where MAD is the
Median-Absolute-Deviation of the <em>n-p</em> largest residuals from the previous iteration.

</li><li> Compute new weights <em>w_i^{(k)} = \psi(e_i^{(k)})/e_i^{(k)}</em>.

</li><li> Compute new coefficients <em>c^{(k)}</em> by solving the weighted least squares problem with
weights <em>w_i^{(k)}</em>.

</li><li> Steps 2 through 4 are iterated until the coefficients converge or until some maximum iteration
limit is reached. Coefficients are tested for convergence using the criterion

<div class="example">
<pre class="example">|c_i^{(k)} - c_i^{(k-1)}| \le \epsilon \times \max(|c_i^{(k)}|, |c_i^{(k-1)}|)
</pre></div>

<p>for all <em>0 \le i &lt; p</em> where <em>\epsilon</em> is a small tolerance factor.
</p></li></ol>
<p>The key to this method lies in selecting the function <em>\psi(e_i)</em> to assign
smaller weights to large residuals, and larger weights to smaller residuals. As
the iteration proceeds, outliers are assigned smaller and smaller weights, eventually
having very little or no effect on the fitted model.
</p>
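<p>The iteration above can be sketched in plain C for the special case of a straight-line
model <em>y = c_0 + c_1 x</em> with the Huber weight function. This is an illustrative sketch
only, not the library implementation: the names <code>irls_line</code>, <code>wls_line</code>
and <code>median_abs</code> are hypothetical, the scale is taken from the plain median of all
<em>|e_i|</em> rather than from the <em>n-p</em> largest residuals, and the leverages are clamped
below 1 as an assumed safeguard against division by zero.
</p>
<div class="example">
<pre class="example">#include &lt;assert.h&gt;
#include &lt;math.h&gt;
#include &lt;stdlib.h&gt;

static int cmp_double(const void *a, const void *b)
{
  double x = *(const double *) a, y = *(const double *) b;
  return (x &gt; y) - (x &lt; y);
}

/* median of |e_i| (simplified MAD statistic, an assumption of this sketch) */
static double median_abs(const double *e, size_t n)
{
  double *tmp = malloc(n * sizeof *tmp), med;
  size_t i;
  assert(tmp != NULL);
  for (i = 0; i &lt; n; i++)
    tmp[i] = fabs(e[i]);
  qsort(tmp, n, sizeof *tmp, cmp_double);
  med = (n % 2) ? tmp[n / 2] : 0.5 * (tmp[n / 2 - 1] + tmp[n / 2]);
  free(tmp);
  return med;
}

/* closed-form weighted least squares for y = c[0] + c[1] x */
static void wls_line(const double *x, const double *y, const double *w,
                     size_t n, double c[2])
{
  double sw = 0, sx = 0, sy = 0, sxx = 0, sxy = 0;
  size_t i;
  for (i = 0; i &lt; n; i++) {
    sw += w[i];
    sx += w[i] * x[i];
    sy += w[i] * y[i];
    sxx += w[i] * x[i] * x[i];
    sxy += w[i] * x[i] * y[i];
  }
  c[1] = (sw * sxy - sx * sy) / (sw * sxx - sx * sx);
  c[0] = (sy - c[1] * sx) / sw;
}

/* returns the number of iterations used, or -1 if maxiter was reached */
int irls_line(const double *x, const double *y, size_t n, double c[2],
              double t, double eps, int maxiter)
{
  double *w = malloc(n * sizeof *w), *e = malloc(n * sizeof *e);
  double *h = malloc(n * sizeof *h);
  double xbar = 0, sxx = 0;
  size_t i;
  int k, result = -1;

  assert(n &gt; 2 &amp;&amp; w != NULL &amp;&amp; e != NULL &amp;&amp; h != NULL);
  for (i = 0; i &lt; n; i++) xbar += x[i] / n;
  for (i = 0; i &lt; n; i++) sxx += (x[i] - xbar) * (x[i] - xbar);
  assert(sxx &gt; 0.0);
  for (i = 0; i &lt; n; i++) {
    /* leverage h_i for a straight-line design matrix, clamped below 1 */
    h[i] = fmin(1.0 / n + (x[i] - xbar) * (x[i] - xbar) / sxx, 0.9999);
    w[i] = 1.0;
  }
  wls_line(x, y, w, n, c);                    /* step 1: OLS start */

  for (k = 1; k &lt;= maxiter; k++) {
    double prev0 = c[0], prev1 = c[1], sigma;
    for (i = 0; i &lt; n; i++) e[i] = y[i] - (c[0] + c[1] * x[i]);
    sigma = median_abs(e, n) / 0.6745;
    if (sigma == 0.0) { result = k; break; }  /* fit is already exact */
    for (i = 0; i &lt; n; i++) {                 /* steps 2 and 3 */
      double u = e[i] / (t * sigma * sqrt(1.0 - h[i]));
      w[i] = 1.0 / fmax(1.0, fabs(u));        /* Huber weight */
    }
    wls_line(x, y, w, n, c);                  /* step 4 */
    if (fabs(c[0] - prev0) &lt;= eps * fmax(fabs(c[0]), fabs(prev0)) &amp;&amp;
        fabs(c[1] - prev1) &lt;= eps * fmax(fabs(c[1]), fabs(prev1))) {
      result = k;                             /* step 5: converged */
      break;
    }
  }
  free(w); free(e); free(h);
  return result;
}
</pre></div>

<p>For data lying exactly on a line apart from a single gross outlier, calling
<code>irls_line (x, y, n, c, 1.345, 1e-8, 100)</code> should drive the weight of the outlier
toward zero over successive iterations, so that <code>c</code> approaches the coefficients of the
underlying line.
</p>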
<dl>
<dt><a name="index-gsl_005fmultifit_005frobust_005falloc"></a>Function: <em>gsl_multifit_robust_workspace *</em> <strong>gsl_multifit_robust_alloc</strong> <em>(const gsl_multifit_robust_type * <var>T</var>, const size_t <var>n</var>, const size_t <var>p</var>)</em></dt>
<dd><a name="index-gsl_005fmultifit_005frobust_005fworkspace"></a>
<p>This function allocates a workspace for fitting a model to <var>n</var>
observations using <var>p</var> parameters. The size of the workspace
is <em>O(np + p^2)</em>. The type <var>T</var> specifies the
function <em>\psi</em> and can be selected from the following choices.
</p><dl>
<dt><a name="index-gsl_005fmultifit_005frobust_005fdefault"></a>Robust type: <strong>gsl_multifit_robust_default</strong></dt>
<dd><p>This specifies the <code>gsl_multifit_robust_bisquare</code> type (see below) and is a good
general-purpose choice for robust regression.
</p></dd></dl>

<dl>
<dt><a name="index-gsl_005fmultifit_005frobust_005fbisquare"></a>Robust type: <strong>gsl_multifit_robust_bisquare</strong></dt>
<dd><p>This is Tukey&rsquo;s biweight (bisquare) function and is a good general-purpose choice for
robust regression. The weight function is given by
</p>
<div class="example">
<pre class="example">w(e) = (1 - e^2)^2   for |e| \le 1
w(e) = 0             otherwise
</pre></div>

<p>and the default tuning constant is <em>t = 4.685</em>.
</p></dd></dl>

<dl>
<dt><a name="index-gsl_005fmultifit_005frobust_005fcauchy"></a>Robust type: <strong>gsl_multifit_robust_cauchy</strong></dt>
<dd><p>This is Cauchy&rsquo;s function, also known as the Lorentzian function.
This function does not guarantee a unique solution,
meaning different choices of the coefficient vector <var>c</var>
could minimize the objective function. Therefore this option should
be used with care. The weight function is given by
</p>
<div class="example">
<pre class="example">w(e) = 1 / (1 + e^2)
</pre></div>

<p>and the default tuning constant is <em>t = 2.385</em>.
</p></dd></dl>

<dl>
<dt><a name="index-gsl_005fmultifit_005frobust_005ffair"></a>Robust type: <strong>gsl_multifit_robust_fair</strong></dt>
<dd><p>This is the fair <em>\rho</em> function, which guarantees a unique solution and
has continuous derivatives to three orders. The weight function is given by
</p>
<div class="example">
<pre class="example">w(e) = 1 / (1 + |e|)
</pre></div>

<p>and the default tuning constant is <em>t = 1.400</em>.
</p></dd></dl>

<dl>
<dt><a name="index-gsl_005fmultifit_005frobust_005fhuber"></a>Robust type: <strong>gsl_multifit_robust_huber</strong></dt>
<dd><p>This specifies Huber&rsquo;s <em>\rho</em> function, which is a parabola in the vicinity of zero and
increases linearly beyond a given threshold, <em>|e| &gt; t</em>. This function is also considered
an excellent general-purpose robust estimator; however, occasional difficulties can
be encountered due to the discontinuous first derivative of the <em>\psi</em> function.
The weight function is given by
</p>
<div class="example">
<pre class="example">w(e) = 1/max(1,|e|)
</pre></div>

<p>and the default tuning constant is <em>t = 1.345</em>.
</p></dd></dl>

<dl>
<dt><a name="index-gsl_005fmultifit_005frobust_005fols"></a>Robust type: <strong>gsl_multifit_robust_ols</strong></dt>
<dd><p>This specifies the ordinary least squares solution, which can be useful for quickly
checking the difference between the various robust and OLS solutions. The weight function is given by
</p>
<div class="example">
<pre class="example">w(e) = 1
</pre></div>

<p>and the default tuning constant is <em>t = 1</em>.
</p></dd></dl>

<dl>
<dt><a name="index-gsl_005fmultifit_005frobust_005fwelsch"></a>Robust type: <strong>gsl_multifit_robust_welsch</strong></dt>
<dd><p>This specifies the Welsch function, which can perform well in cases where the residuals have an
exponential distribution. The weight function is given by
</p>
<div class="example">
<pre class="example">w(e) = \exp(-e^2)
</pre></div>

<p>and the default tuning constant is <em>t = 2.985</em>.
</p></dd></dl>
</dd></dl>
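<p>The weight functions listed above can be collected into a small plain C helper for
experimentation. This is only a sketch: the enumeration <code>robust_kind</code> and the functions
<code>robust_default_tune</code> and <code>robust_weight</code> are hypothetical names, not part of the
library, and the bisquare weight is assumed to be zero outside <em>|e| &lt; 1</em>, as in the usual
definition of Tukey&rsquo;s function. The argument <em>e</em> is the residual after scaling.
</p>
<div class="example">
<pre class="example">#include &lt;math.h&gt;

typedef enum {
  ROBUST_BISQUARE, ROBUST_CAUCHY, ROBUST_FAIR,
  ROBUST_HUBER, ROBUST_OLS, ROBUST_WELSCH
} robust_kind;                /* hypothetical names for this sketch */

/* default tuning constant t quoted in the text for each type */
double robust_default_tune(robust_kind kind)
{
  switch (kind) {
    case ROBUST_BISQUARE: return 4.685;
    case ROBUST_CAUCHY:   return 2.385;
    case ROBUST_FAIR:     return 1.400;
    case ROBUST_HUBER:    return 1.345;
    case ROBUST_WELSCH:   return 2.985;
    default:              return 1.0;   /* ROBUST_OLS */
  }
}

/* weight w(e) for an already scaled residual e */
double robust_weight(robust_kind kind, double e)
{
  double a = fabs(e);
  switch (kind) {
    case ROBUST_BISQUARE:
      /* assumption: zero weight outside |e| &lt; 1 (usual bisquare) */
      return (a &lt; 1.0) ? (1.0 - e * e) * (1.0 - e * e) : 0.0;
    case ROBUST_CAUCHY: return 1.0 / (1.0 + e * e);
    case ROBUST_FAIR:   return 1.0 / (1.0 + a);
    case ROBUST_HUBER:  return 1.0 / fmax(1.0, a);
    case ROBUST_WELSCH: return exp(-e * e);
    default:            return 1.0;     /* ROBUST_OLS */
  }
}
</pre></div>

<p>Every type assigns weight 1 to a zero residual, and all except the OLS type assign
smaller weights as <em>|e|</em> grows, which is the mechanism by which outliers lose influence.
</p>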

<dl>
<dt><a name="index-gsl_005fmultifit_005frobust_005ffree"></a>Function: <em>void</em> <strong>gsl_multifit_robust_free</strong> <em>(gsl_multifit_robust_workspace * <var>w</var>)</em></dt>
<dd><p>This function frees the memory associated with the workspace <var>w</var>.
</p></dd></dl>

<dl>
<dt><a name="index-gsl_005fmultifit_005frobust_005fname"></a>Function: <em>const char *</em> <strong>gsl_multifit_robust_name</strong> <em>(const gsl_multifit_robust_workspace * <var>w</var>)</em></dt>
<dd><p>This function returns the name of the robust type <var>T</var> specified to <code>gsl_multifit_robust_alloc</code>.
</p></dd></dl>

<dl>
<dt><a name="index-gsl_005fmultifit_005frobust_005ftune"></a>Function: <em>int</em> <strong>gsl_multifit_robust_tune</strong> <em>(const double <var>tune</var>, gsl_multifit_robust_workspace * <var>w</var>)</em></dt>
<dd><p>This function sets the tuning constant <em>t</em> used to adjust the residuals at each iteration to <var>tune</var>.
Decreasing the tuning constant downweights large residuals more strongly, while increasing
the tuning constant downweights them less.
</p></dd></dl>
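<p>The effect of the tuning constant can be seen with a one-line calculation. The following
sketch uses the Huber weight and a hypothetical helper <code>huber_weight_tuned</code>; for
simplicity it omits the leverage factor used in the full algorithm.
</p>
<div class="example">
<pre class="example">#include &lt;math.h&gt;

/* Huber weight of a raw residual r, given scale sigma and tuning constant t */
double huber_weight_tuned(double r, double sigma, double t)
{
  double e = r / (t * sigma);          /* scaled residual */
  return 1.0 / fmax(1.0, fabs(e));
}
</pre></div>

<p>With <em>\sigma = 1</em> and a residual of 3, the default <em>t = 1.345</em> gives a weight of
about 0.448, while <em>t = 1</em> gives about 0.333, so the smaller constant downweights the
same residual more strongly.
</p>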

<dl>
<dt><a name="index-gsl_005fmultifit_005frobust_005fmaxiter"></a>Function: <em>int</em> <strong>gsl_multifit_robust_maxiter</strong> <em>(const size_t <var>maxiter</var>, gsl_multifit_robust_workspace * <var>w</var>)</em></dt>
<dd><p>This function sets the maximum number of iterations in the iteratively
reweighted least squares algorithm to <var>maxiter</var>. By default,
this value is set to 100 by <code>gsl_multifit_robust_alloc</code>.
</p></dd></dl>

<dl>
<dt><a name="index-gsl_005fmultifit_005frobust_005fweights"></a>Function: <em>int</em> <strong>gsl_multifit_robust_weights</strong> <em>(const gsl_vector * <var>r</var>, gsl_vector * <var>wts</var>, gsl_multifit_robust_workspace * <var>w</var>)</em></dt>
<dd><p>This function assigns weights to the vector <var>wts</var> using the residual vector <var>r</var> and
the previously specified weighting function. The output weights are given by <em>wts_i = w(r_i / (t \sigma))</em>,
where the weighting functions <em>w</em> are detailed in <code>gsl_multifit_robust_alloc</code>, <em>\sigma</em>
is an estimate of the residual standard deviation based on the Median-Absolute-Deviation, and <em>t</em>
is the tuning constant. This
function is useful if the user wishes to implement their own robust regression rather than using
the supplied <code>gsl_multifit_robust</code> routine below.
</p></dd></dl>

<dl>
<dt><a name="index-gsl_005fmultifit_005frobust"></a>Function: <em>int</em> <strong>gsl_multifit_robust</strong> <em>(const gsl_matrix * <var>X</var>, const gsl_vector * <var>y</var>, gsl_vector * <var>c</var>, gsl_matrix * <var>cov</var>, gsl_multifit_robust_workspace * <var>w</var>)</em></dt>
<dd><p>This function computes the best-fit parameters <var>c</var> of the model
<em>y = X c</em> for the observations <var>y</var> and the matrix of
predictor variables <var>X</var>, attempting to reduce the influence
of outliers using the algorithm outlined above.
The <em>p</em>-by-<em>p</em> variance-covariance matrix of the model parameters
<var>cov</var> is estimated as <em>\sigma^2 (X^T X)^{-1}</em>, where <em>\sigma</em> is
an approximation of the residual standard deviation using the theory of robust
regression. Special care must be taken when estimating <em>\sigma</em> and
other statistics such as <em>R^2</em>, and so these
are computed internally and are available by calling the function
<code>gsl_multifit_robust_statistics</code>.
</p>
<p>If the coefficients do not converge within the maximum iteration
limit, the function returns <code>GSL_EMAXITER</code>. In this case,
the current estimates of the coefficients and covariance matrix
are returned in <var>c</var> and <var>cov</var> and the internal fit statistics
are computed with these estimates.
</p></dd></dl>
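<p>A minimal usage sketch for a straight-line model <em>y = c_0 + c_1 x</em> might look as
follows. It assumes the GSL headers are installed and that the program is linked against the
GSL and CBLAS libraries; the wrapper name <code>robust_line_fit</code> and its argument list are
hypothetical. Allocation failures are left to the GSL error handler, and the data are assumed
to contain some scatter so that the scale estimate is nonzero.
</p>
<div class="example">
<pre class="example">#include &lt;gsl/gsl_errno.h&gt;
#include &lt;gsl/gsl_matrix.h&gt;
#include &lt;gsl/gsl_multifit.h&gt;
#include &lt;gsl/gsl_vector.h&gt;

/* Robustly fit y = c0 + c1 x; returns the status of gsl_multifit_robust. */
int robust_line_fit(const double *x, const double *y, size_t n,
                    double *c0, double *c1, double *sigma)
{
  const size_t p = 2;
  gsl_matrix *X = gsl_matrix_alloc(n, p);
  gsl_vector *yv = gsl_vector_alloc(n);
  gsl_vector *c = gsl_vector_alloc(p);
  gsl_matrix *cov = gsl_matrix_alloc(p, p);
  gsl_multifit_robust_workspace *w =
    gsl_multifit_robust_alloc(gsl_multifit_robust_bisquare, n, p);
  gsl_multifit_robust_stats stats;
  size_t i;
  int status;

  for (i = 0; i &lt; n; i++) {
    gsl_matrix_set(X, i, 0, 1.0);      /* constant column */
    gsl_matrix_set(X, i, 1, x[i]);     /* predictor column */
    gsl_vector_set(yv, i, y[i]);
  }

  status = gsl_multifit_robust(X, yv, c, cov, w);

  /* copy results out before the workspace is freed */
  stats = gsl_multifit_robust_statistics(w);
  *c0 = gsl_vector_get(c, 0);
  *c1 = gsl_vector_get(c, 1);
  *sigma = stats.sigma;

  gsl_multifit_robust_free(w);
  gsl_matrix_free(cov);
  gsl_vector_free(c);
  gsl_vector_free(yv);
  gsl_matrix_free(X);
  return status;
}
</pre></div>

<p>The caller should compare the return value with <code>GSL_SUCCESS</code>; a return of
<code>GSL_EMAXITER</code> means the values written to <code>c0</code> and <code>c1</code> are the
current, unconverged estimates.
</p>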

<dl>
<dt><a name="index-gsl_005fmultifit_005frobust_005fest"></a>Function: <em>int</em> <strong>gsl_multifit_robust_est</strong> <em>(const gsl_vector * <var>x</var>, const gsl_vector * <var>c</var>, const gsl_matrix * <var>cov</var>, double * <var>y</var>, double * <var>y_err</var>)</em></dt>
<dd><p>This function uses the best-fit robust regression coefficients
<var>c</var> and their covariance matrix
<var>cov</var> to compute the fitted function value
<var>y</var> and its standard deviation <var>y_err</var> for the model <em>y = x.c</em> 
at the point <var>x</var>.
</p></dd></dl>
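<p>The calculation can be illustrated in plain C. The sketch below assumes the usual linear
error propagation, <em>y_err = \sqrt{x^T cov x}</em>, which the text above does not spell out;
the function name <code>linear_est</code> and the row-by-row storage of <var>cov</var> are
choices made for this example only.
</p>
<div class="example">
<pre class="example">#include &lt;math.h&gt;
#include &lt;stddef.h&gt;

/* cov is a p-by-p matrix stored row by row */
void linear_est(const double *x, const double *c, const double *cov,
                size_t p, double *y, double *y_err)
{
  double var = 0.0;
  size_t i, j;
  *y = 0.0;
  for (i = 0; i &lt; p; i++) {
    *y += x[i] * c[i];                     /* y = x . c */
    for (j = 0; j &lt; p; j++)
      var += x[i] * cov[i * p + j] * x[j]; /* x^T cov x */
  }
  *y_err = sqrt(var);
}
</pre></div>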

<dl>
<dt><a name="index-gsl_005fmultifit_005frobust_005fresiduals"></a>Function: <em>int</em> <strong>gsl_multifit_robust_residuals</strong> <em>(const gsl_matrix * <var>X</var>, const gsl_vector * <var>y</var>, const gsl_vector * <var>c</var>, gsl_vector * <var>r</var>, gsl_multifit_robust_workspace * <var>w</var>)</em></dt>
<dd><p>This function computes the vector of studentized residuals
<em>r_i = {y_i - (X c)_i \over \sigma \sqrt{1 - h_i}}</em> for
the observations <var>y</var>, coefficients <var>c</var> and matrix of predictor
variables <var>X</var>. The routine <code>gsl_multifit_robust</code> must
first be called to compute the statistical leverages <em>h_i</em> of
the matrix <var>X</var> and residual standard deviation estimate <em>\sigma</em>.
</p></dd></dl>
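<p>The formula for the studentized residuals can be written out directly in plain C. In this
sketch the leverages <em>h_i</em> and the scale <em>\sigma</em> are supplied by the caller rather
than taken from a workspace, and the function name <code>studentized_residuals</code> and the
row-by-row storage of <var>X</var> are assumptions of the example.
</p>
<div class="example">
<pre class="example">#include &lt;math.h&gt;
#include &lt;stddef.h&gt;

/* X is n-by-p, stored row by row; h holds the n leverages */
void studentized_residuals(const double *X, const double *y, const double *c,
                           const double *h, double sigma,
                           size_t n, size_t p, double *r)
{
  size_t i, j;
  for (i = 0; i &lt; n; i++) {
    double fit = 0.0;
    for (j = 0; j &lt; p; j++)
      fit += X[i * p + j] * c[j];          /* (X c)_i */
    r[i] = (y[i] - fit) / (sigma * sqrt(1.0 - h[i]));
  }
}
</pre></div>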

<dl>
<dt><a name="index-gsl_005fmultifit_005frobust_005fstatistics"></a>Function: <em>gsl_multifit_robust_stats</em> <strong>gsl_multifit_robust_statistics</strong> <em>(const gsl_multifit_robust_workspace * <var>w</var>)</em></dt>
<dd><p>This function returns a structure containing relevant statistics from a robust regression. The function
<code>gsl_multifit_robust</code> must be called first to perform the regression and calculate these statistics.
The returned <code>gsl_multifit_robust_stats</code> structure contains the following fields.
</p><ul class="no-bullet">
<li><!-- /@w --> double <code>sigma_ols</code> This contains the standard deviation of the residuals as computed from ordinary least squares (OLS).

</li><li><!-- /@w --> double <code>sigma_mad</code> This contains an estimate of the standard deviation of the final residuals using the Median-Absolute-Deviation statistic.

</li><li><!-- /@w --> double <code>sigma_rob</code> This contains an estimate of the standard deviation of the final residuals from the theory of robust regression (see Street et al., 1988).

</li><li><!-- /@w --> double <code>sigma</code> This contains an estimate of the standard deviation of the final residuals by attempting to reconcile <code>sigma_rob</code> and <code>sigma_ols</code>
in a reasonable way.

</li><li><!-- /@w --> double <code>Rsq</code> This contains the <em>R^2</em> coefficient of determination statistic using the estimate <code>sigma</code>.

</li><li><!-- /@w --> double <code>adj_Rsq</code> This contains the adjusted <em>R^2</em> coefficient of determination statistic using the estimate <code>sigma</code>.

</li><li><!-- /@w --> double <code>rmse</code> This contains the root mean squared error of the final residuals.

</li><li><!-- /@w --> double <code>sse</code> This contains the residual sum of squares taking into account the robust covariance matrix.

</li><li><!-- /@w --> size_t <code>dof</code> This contains the number of degrees of freedom <em>n - p</em>.

</li><li><!-- /@w --> size_t <code>numit</code> Upon successful convergence, this contains the number of iterations performed.

</li><li><!-- /@w --> gsl_vector * <code>weights</code> This contains the final weight vector of length <var>n</var>.

</li><li><!-- /@w --> gsl_vector * <code>r</code> This contains the final residual vector of length <var>n</var>, <em>r = y - X c</em>.
</li></ul>
</dd></dl>




</body>
</html>