1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186
|
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<!-- Copyright (C) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016 The GSL Team.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with the
Invariant Sections being "GNU General Public License" and "Free Software
Needs Free Documentation", the Front-Cover text being "A GNU Manual",
and with the Back-Cover Text being (a) (see below). A copy of the
license is included in the section entitled "GNU Free Documentation
License".
(a) The Back-Cover Text is: "You have the freedom to copy and modify this
GNU Manual." -->
<!-- Created by GNU Texinfo 5.1, http://www.gnu.org/software/texinfo/ -->
<head>
<title>GNU Scientific Library – Reference Manual: Nonlinear Least-Squares Overview</title>
<meta name="description" content="GNU Scientific Library – Reference Manual: Nonlinear Least-Squares Overview">
<meta name="keywords" content="GNU Scientific Library – Reference Manual: Nonlinear Least-Squares Overview">
<meta name="resource-type" content="document">
<meta name="distribution" content="global">
<meta name="Generator" content="makeinfo">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<link href="index.html#Top" rel="start" title="Top">
<link href="Function-Index.html#Function-Index" rel="index" title="Function Index">
<link href="Nonlinear-Least_002dSquares-Fitting.html#Nonlinear-Least_002dSquares-Fitting" rel="up" title="Nonlinear Least-Squares Fitting">
<link href="Nonlinear-Least_002dSquares-TRS-Overview.html#Nonlinear-Least_002dSquares-TRS-Overview" rel="next" title="Nonlinear Least-Squares TRS Overview">
<link href="Nonlinear-Least_002dSquares-Fitting.html#Nonlinear-Least_002dSquares-Fitting" rel="previous" title="Nonlinear Least-Squares Fitting">
<style type="text/css">
<!--
a.summary-letter {text-decoration: none}
blockquote.smallquotation {font-size: smaller}
div.display {margin-left: 3.2em}
div.example {margin-left: 3.2em}
div.indentedblock {margin-left: 3.2em}
div.lisp {margin-left: 3.2em}
div.smalldisplay {margin-left: 3.2em}
div.smallexample {margin-left: 3.2em}
div.smallindentedblock {margin-left: 3.2em; font-size: smaller}
div.smalllisp {margin-left: 3.2em}
kbd {font-style:oblique}
pre.display {font-family: inherit}
pre.format {font-family: inherit}
pre.menu-comment {font-family: serif}
pre.menu-preformatted {font-family: serif}
pre.smalldisplay {font-family: inherit; font-size: smaller}
pre.smallexample {font-size: smaller}
pre.smallformat {font-family: inherit; font-size: smaller}
pre.smalllisp {font-size: smaller}
span.nocodebreak {white-space:nowrap}
span.nolinebreak {white-space:nowrap}
span.roman {font-family:serif; font-weight:normal}
span.sansserif {font-family:sans-serif; font-weight:normal}
ul.no-bullet {list-style: none}
-->
</style>
</head>
<body lang="en" bgcolor="#FFFFFF" text="#000000" link="#0000FF" vlink="#800080" alink="#FF0000">
<a name="Nonlinear-Least_002dSquares-Overview"></a>
<div class="header">
<p>
Next: <a href="Nonlinear-Least_002dSquares-TRS-Overview.html#Nonlinear-Least_002dSquares-TRS-Overview" accesskey="n" rel="next">Nonlinear Least-Squares TRS Overview</a>, Up: <a href="Nonlinear-Least_002dSquares-Fitting.html#Nonlinear-Least_002dSquares-Fitting" accesskey="u" rel="up">Nonlinear Least-Squares Fitting</a> [<a href="Function-Index.html#Function-Index" title="Index" rel="index">Index</a>]</p>
</div>
<hr>
<a name="Overview-5"></a>
<h3 class="section">39.1 Overview</h3>
<a name="index-nonlinear-least-squares_002c-overview"></a>
<p>The problem of multidimensional nonlinear least-squares fitting requires
the minimization of the squared residuals of <em>n</em> functions,
<em>f_i</em>, in <em>p</em> parameters, <em>x_i</em>,
</p>
<div class="example">
<pre class="example">\Phi(x) = (1/2) || f(x) ||^2
= (1/2) \sum_{i=1}^{n} f_i(x_1, ..., x_p)^2
</pre></div>
<p>In trust region methods, the objective (or cost) function <em>\Phi(x)</em> is approximated
by a model function <em>m_k(\delta)</em> in the vicinity of some point <em>x_k</em>. The
model function is often simply a second order Taylor series expansion around the
point <em>x_k</em>, ie:
</p>
<div class="example">
<pre class="example">\Phi(x_k + \delta) ~=~ m_k(\delta) = \Phi(x_k) + g_k^T \delta + 1/2 \delta^T B_k \delta
</pre></div>
<p>where <em>g_k = \nabla \Phi(x_k) = J^T f</em> is the gradient vector at the point <em>x_k</em>,
<em>B_k = \nabla^2 \Phi(x_k)</em> is the Hessian matrix at <em>x_k</em>, or some
approximation to it, and <em>J</em> is the <em>n</em>-by-<em>p</em> Jacobian matrix
<em>J_{ij} = d f_i / d x_j</em>.
In order to find the next step <em>\delta</em>, we minimize the model function
<em>m_k(\delta)</em>, but search for solutions only within a region where
we trust that <em>m_k(\delta)</em> is a good approximation to the objective
function <em>\Phi(x_k + \delta)</em>. In other words,
we seek a solution of the trust region subproblem (TRS)
</p>
<div class="example">
<pre class="example">\min_(\delta \in R^p) m_k(\delta), s.t. || D_k \delta || <= \Delta_k
</pre></div>
<p>where <em>\Delta_k > 0</em> is the trust region radius and <em>D_k</em> is
a scaling matrix. If <em>D_k = I</em>, then the trust region is a ball
of radius <em>\Delta_k</em> centered at <em>x_k</em>. In some applications,
the parameter vector <em>x</em> may have widely different scales. For
example, one parameter might be a temperature on the order of
<em>10^3</em> K, while another might be a length on the order of
<em>10^{-6}</em> m. In such cases, a spherical trust region may not
be the best choice, since if <em>\Phi</em> changes rapidly along
directions with one scale, and more slowly along directions with
a different scale, the model function <em>m_k</em> may be a poor
approximation to <em>\Phi</em> along the rapidly changing directions.
In such problems, it may be best to use an elliptical trust region,
by setting <em>D_k</em> to a diagonal matrix whose entries are designed
so that the scaled step <em>D_k \delta</em> has entries of approximately the same
order of magnitude.
</p>
<p>The trust region subproblem above normally amounts to solving a
linear least squares system (or multiple systems) for the step
<em>\delta</em>. Once <em>\delta</em> is computed, it is checked whether
or not it reduces the objective function <em>\Phi(x)</em>. A useful
statistic for this is to look at the ratio
</p>
<div class="example">
<pre class="example">\rho_k = ( \Phi(x_k) - \Phi(x_k + \delta_k) / ( m_k(0) - m_k(\delta_k) )
</pre></div>
<p>where the numerator is the actual reduction of the objective function
due to the step <em>\delta_k</em>, and the denominator is the predicted
reduction due to the model <em>m_k</em>. If <em>\rho_k</em> is negative,
it means that the step <em>\delta_k</em> increased the objective function
and so it is rejected. If <em>\rho_k</em> is positive,
then we have found a step which reduced the objective function and
it is accepted. Furthermore, if <em>\rho_k</em> is close to 1,
then this indicates that the model function is a good approximation
to the objective function in the trust region, and so on the next
iteration the trust region is enlarged in order to take more ambitious
steps. When a step is rejected, the trust region is made smaller and
the TRS is solved again. An outline for the general trust region method
used by GSL can now be given.
</p>
<p><b>Trust Region Algorithm</b>
</p><ol>
<li> Initialize: given <em>x_0</em>, construct <em>m_0(\delta)</em>, <em>D_0</em> and <em>\Delta_0 > 0</em>
</li><li> For k = 0, 1, 2, ...
<ol>
<li> If converged, then stop
</li><li> Solve TRS for trial step <em>\delta_k</em>
</li><li> Evaluate trial step by computing <em>\rho_k</em>
<ol>
<li> if step is accepted, set <em>x_{k+1} = x_k + \delta_k</em> and increase radius, <em>\Delta_{k+1} = \alpha \Delta_k</em>
</li><li> if step is rejected, set <em>x_{k+1} = x_k</em> and decrease radius, <em>\Delta_{k+1} = {\Delta_k \over \beta}</em>; goto 2(b)
</li></ol>
</li><li> Construct <em>m_{k+1}(\delta)</em> and <em>D_{k+1}</em>
</li></ol>
</li></ol>
<p>GSL offers the user a number of different algorithms for solving the trust
region subproblem in 2(b), as well as different choices of scaling matrices
<em>D_k</em> and different methods of updating the trust region radius
<em>\Delta_k</em>. Therefore, while reasonable default methods are provided,
the user has a lot of control to fine-tune the various steps of the
algorithm for their specific problem.
</p>
<hr>
<div class="header">
<p>
Next: <a href="Nonlinear-Least_002dSquares-TRS-Overview.html#Nonlinear-Least_002dSquares-TRS-Overview" accesskey="n" rel="next">Nonlinear Least-Squares TRS Overview</a>, Up: <a href="Nonlinear-Least_002dSquares-Fitting.html#Nonlinear-Least_002dSquares-Fitting" accesskey="u" rel="up">Nonlinear Least-Squares Fitting</a> [<a href="Function-Index.html#Function-Index" title="Index" rel="index">Index</a>]</p>
</div>
</body>
</html>
|