=head1 NAME
libdogleg - A general purpose optimizer to solve data fitting problems
=head1 NOTICE
If you're considering using this library for new projects, please look at the
C<ceres> solver instead:
L<http://ceres-solver.org/>
C<libdogleg> is not deprecated, and I will fix bugs as they're reported. However
C<ceres> is more feature-rich and I<much> more widely used. I consider C<ceres>
to be a superset of C<libdogleg>, and had C<ceres> been available when I wrote
C<libdogleg>, I would not have written it.
=head1 DESCRIPTION
This is a library for solving large-scale nonlinear optimization problems. By employing sparse
linear algebra, it is tailored for problems that have weak coupling between the optimization
variables. For appropriately sparse problems this results in I<massive> performance gains.
For smaller problems with dense Jacobians, a dense mode is also available. This
mode uses the same optimization loop as the sparse code, but dense linear
algebra.
The main task of this library is to find the vector B<p> that minimizes
  norm2( B<x> )
where B<x> = I<f>(B<p>) is a vector that has higher dimensionality than B<p>. The user passes in a
callback function (of type C<dogleg_callback_t>) that takes in the vector B<p> and returns the
vector B<x> and a matrix of derivatives B<J> = dB<f>/dB<p>. B<J> is a matrix with a row for each
element of I<f> and a column for each element of B<p>. If B<J> is a sparse matrix, then this library
can take advantage of that, which results in substantial increases in computational efficiency if
most entries of B<J> are 0. B<J> is stored row-first in the callback routine. libdogleg uses a
column-first data representation so it references the transpose of B<J> (called B<Jt>). B<J> stored
row-first is identical to B<Jt> stored column-first; this is purely a naming choice.
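For example, with C<Nstate = 2> and C<Nmeas = 3> we have

        [ dx0/dp0  dx0/dp1 ]
    J = [ dx1/dp0  dx1/dp1 ]      (one row per measurement)
        [ dx2/dp0  dx2/dp1 ]

Stored row-first, the memory order is

    dx0/dp0, dx0/dp1, dx1/dp0, dx1/dp1, dx2/dp0, dx2/dp1

which is exactly B<Jt> (2 rows, 3 columns) stored column-first.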
This library implements Powell's dog-leg algorithm to solve the problem. Like the more-widely-known
Levenberg-Marquardt algorithm, Powell's dog-leg algorithm solves a nonlinear optimization problem by
interpolating between a Gauss-Newton step and a gradient descent step. Improvements over LM are
=over
=item * a more natural representation of the linearity of the operating point (trust region size vs
a vague lambda term).
=item * significant efficiency gains, since a matrix inversion isn't needed to retry a rejected step
=back
The algorithm is described in many places, originally in
M. Powell. A Hybrid Method for Nonlinear Equations. In P. Rabinowitz, editor, Numerical Methods for
Nonlinear Algebraic Equations, pages 87-144. Gordon and Breach Science, London, 1970.
Various enhancements to Powell's original method are described in the literature; at this time only
the original algorithm is implemented here.
The sparse matrix algebra is handled by the CHOLMOD library, written by Tim Davis. Parts of CHOLMOD
are licensed under the GPL and parts under the LGPL. Only the LGPL pieces are used here, allowing
libdogleg to be licensed under the LGPL as well. Due to this I lose some convenience (all the simple
sparse matrix arithmetic in CHOLMOD is GPL-ed) and some performance (the fancier computational
methods, such as supernodal analysis, are GPL-ed). For my current applications the performance losses
are minor.
=head1 FUNCTIONS AND TYPES
=head2 Main API
=head3 dogleg_optimize2
This is the main call to the library for I<sparse> Jacobians. It's declared as
  double dogleg_optimize2(double* p, unsigned int Nstate,
                          unsigned int Nmeas, unsigned int NJnnz,
                          dogleg_callback_t* f, void* cookie,
                          const dogleg_parameters2_t* parameters,
                          dogleg_solverContext_t** returnContext);
=over
=item * B<p> is the initial estimate of the state vector (and holds the final result)
=item * C<Nstate> specifies the number of optimization variables (elements of B<p>)
=item * C<Nmeas> specifies the number of measurements (elements of B<x>). C<Nmeas E<gt>= Nstate> is a
requirement
=item * C<NJnnz> specifies the number of non-zero elements of the Jacobian matrix dB<f>/dB<p>. In a
dense matrix C<NJnnz = Nstate*Nmeas>. We are dealing with sparse Jacobians, so C<NJnnz> should be far
less. If not, libdogleg is not an appropriate routine to solve this problem.
=item * C<f> specifies the callback function that the optimization routine calls to sample the problem
being solved
=item * C<cookie> is an arbitrary data pointer passed untouched to C<f>
=item * C<parameters> is a pointer to the set of parameters to use. Set to C<NULL>
to use the global parameters, or call C<dogleg_optimize> instead. See the
L<Parameters> section below for more details
=item * If not C<NULL>, C<returnContext> can be used to retrieve the full
context structure from the solver. This can be useful since this structure
contains the latest operating point values. It also has an active
C<cholmod_common> structure, which can be reused if more CHOLMOD routines need
to be called externally. You usually want C<returnContext-E<gt>beforeStep>. I<If
this data is requested, the user is required to free it with
C<dogleg_freeContext> when done>.
=back
C<dogleg_optimize2> returns norm2( B<x> ) at the minimum, or a negative value if an error occurred.
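A minimal driver might look like the following sketch. It assumes the
hypothetical C<callback> function from the C<dogleg_callback_t> example below,
which solves the toy problem x[i] = p[i] - 5 with a diagonal Jacobian:

  #include <stdio.h>
  #include <dogleg.h>

  // sketched under dogleg_callback_t below
  static void callback(const double* p, double* x,
                       cholmod_sparse* Jt, void* cookie);

  int main(void)
  {
      enum { Nstate = 100, Nmeas = 100, NJnnz = 100 };
      double p[Nstate] = {0}; // initial estimate: all zeros

      double norm2_x =
          dogleg_optimize2(p, Nstate, Nmeas, NJnnz,
                           &callback, NULL,
                           NULL,  // NULL: use the global parameters
                           NULL); // NULL: don't keep the solver context
      if(norm2_x < 0.0)
      {
          fprintf(stderr, "optimization failed\n");
          return 1;
      }
      printf("norm2(x) at the optimum: %g\n", norm2_x);
      return 0;
  }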
=head3 dogleg_optimize
This is a flavor of C<dogleg_optimize2> that implicitly uses the global
parameters. It's declared as
  double dogleg_optimize(double* p, unsigned int Nstate,
                         unsigned int Nmeas, unsigned int NJnnz,
                         dogleg_callback_t* f, void* cookie,
                         dogleg_solverContext_t** returnContext);
=head3 dogleg_optimize_dense2
This is the main call to the library for I<dense> Jacobians. It's declared as
  double dogleg_optimize_dense2(double* p, unsigned int Nstate,
                                unsigned int Nmeas,
                                dogleg_callback_dense_t* f, void* cookie,
                                const dogleg_parameters2_t* parameters,
                                dogleg_solverContext_t** returnContext);
The arguments are almost identical to those in the C<dogleg_optimize2> call.
=over
=item * B<p> is the initial estimate of the state vector (and holds the final result)
=item * C<Nstate> specifies the number of optimization variables (elements of B<p>)
=item * C<Nmeas> specifies the number of measurements (elements of B<x>). C<Nmeas E<gt>= Nstate> is a
requirement
=item * C<f> specifies the callback function that the optimization routine calls to sample the problem
being solved. Note that this callback has a different type from that in C<dogleg_optimize>
=item * C<cookie> is an arbitrary data pointer passed untouched to C<f>
=item * C<parameters> is a pointer to the set of parameters to use. Set to C<NULL>
to use the global parameters, or call C<dogleg_optimize_dense> instead. See the
L<Parameters> section below for more details
=item * If not C<NULL>, C<returnContext> can be used to retrieve the full
context structure from the solver. This can be useful since this structure
contains the latest operating point values. You usually want
C<returnContext-E<gt>beforeStep>. I<If this data is requested, the user is
required to free it with C<dogleg_freeContext> when done>.
=back
C<dogleg_optimize_dense2> returns norm2( B<x> ) at the minimum, or a negative value if an error occurred.
=head3 dogleg_optimize_dense
This is a flavor of C<dogleg_optimize_dense2> that implicitly uses the global
parameters. It's declared as
  double dogleg_optimize_dense(double* p, unsigned int Nstate,
                               unsigned int Nmeas,
                               dogleg_callback_dense_t* f, void* cookie,
                               dogleg_solverContext_t** returnContext);
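As a sketch, assuming the hypothetical C<callback_dense> and C<data_t> from the
C<dogleg_callback_dense_t> example below, and arrays C<t>, C<y> of C<Nmeas>
data points:

  data_t d = { .N = Nmeas, .t = t, .y = y };
  double p[2] = {0.0, 0.0}; // initial estimate: slope, intercept

  double norm2_x =
      dogleg_optimize_dense(p, 2, Nmeas,
                            &callback_dense, &d,
                            NULL); // NULL: don't keep the solver context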
=head3 dogleg_freeContext
Used to deallocate memory used for an optimization cycle. Defined as:
  void dogleg_freeContext(dogleg_solverContext_t** ctx);
If a pointer to a context is not requested (by passing C<returnContext = NULL>
to C<dogleg_optimize>), libdogleg calls this routine automatically. If the user
I<did> retrieve this pointer, though, it must be freed with
C<dogleg_freeContext> when the user is finished.
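For example (a sketch; C<p>, C<Nstate>, C<Nmeas>, C<NJnnz> and the hypothetical
C<callback> are assumed to be set up as in the examples elsewhere in this
document):

  dogleg_solverContext_t* ctx = NULL;
  double norm2_x = dogleg_optimize(p, Nstate, Nmeas, NJnnz,
                                   &callback, NULL, &ctx);

  // ctx->beforeStep is the operating point at the optimum:
  // ctx->beforeStep->p, ctx->beforeStep->x, ctx->beforeStep->norm2_x
  dogleg_freeContext(&ctx); // required, since we asked for the context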
=head3 dogleg_computeJtJfactorization
Computes the Cholesky factorization of B<JtJ>. This function is exposed only in
case you need to touch libdogleg internals via C<returnContext>. Sometimes after
an optimization completes you want to do something with the factorization of
B<JtJ>, and this call ensures that the factorization is available. Most people
don't need this function. If the comment wasn't clear: you don't need this function.
This is declared as
  void dogleg_computeJtJfactorization(dogleg_operatingPoint_t* point,
                                      dogleg_solverContext_t* ctx);
The arguments are
=over
=item * C<point> is the operating point of the system. Generally this will be
C<returnContext-E<gt>beforeStep> where C<returnContext> is from one of the
C<dogleg_optimize_...> functions.
=item * C<ctx> is the dogleg context. Generally this will be C<returnContext>
from one of the C<dogleg_optimize_...> functions
=back
=head3 dogleg_testGradient
libdogleg requires the user to compute the Jacobian matrix B<J>. This is a performance optimization,
since B<J> could instead be estimated from differences of B<x>. This optimization is often worth the
extra effort, but it creates the possibility of a mistake in B<J>, such that B<J> = dB<f>/dB<p> would
no longer hold. To find these types of issues, the user can call
  void dogleg_testGradient(unsigned int var, const double* p0,
                           unsigned int Nstate, unsigned int Nmeas, unsigned int NJnnz,
                           dogleg_callback_t* f, void* cookie);
This function computes the Jacobian with central differences and compares the result with the
Jacobian computed by the callback function. It is recommended to do this for every variable while
developing the program that uses libdogleg.
=over
=item * C<var> is the index of the variable being tested
=item * C<p0> is the state vector B<p> where we're evaluating the jacobian
=item * C<Nstate>, C<Nmeas>, C<NJnnz> are the number of state variables, measurements and non-zero Jacobian elements, as before
=item * C<f> is the callback function, as before
=item * C<cookie> is the user data, as before
=back
This function returns nothing, but prints out the test results.
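For example, to check every variable at the initial estimate C<p0> before
optimizing (a sketch, reusing the hypothetical C<callback> from the
C<dogleg_callback_t> example below):

  for(unsigned int i = 0; i < Nstate; i++)
      dogleg_testGradient(i, p0, Nstate, Nmeas, NJnnz, &callback, NULL);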
=head3 dogleg_testGradient_dense
Very similar to C<dogleg_testGradient>, but for dense Jacobians. It's declared as
  void dogleg_testGradient_dense(unsigned int var, const double* p0,
                                 unsigned int Nstate, unsigned int Nmeas,
                                 dogleg_callback_dense_t* f, void* cookie);
This function computes the Jacobian with central differences and compares the result with the
Jacobian computed by the callback function. It is recommended to do this for every variable while
developing the program that uses libdogleg.
=over
=item * C<var> is the index of the variable being tested
=item * C<p0> is the state vector B<p> where we're evaluating the jacobian
=item * C<Nstate>, C<Nmeas> are the number of state variables and measurements, as before
=item * C<f> is the callback function, as before
=item * C<cookie> is the user data, as before
=back
This function returns nothing, but prints out the test results.
=head3 dogleg_callback_t
The main user callback that specifies the sparse optimization problem has type
  typedef void (dogleg_callback_t)(const double* p,
                                   double* x,
                                   cholmod_sparse* Jt,
                                   void* cookie);
=over
=item * B<p> is the current state vector
=item * B<x> is the resulting I<f>(B<p>)
=item * B<Jt> is the transpose of dB<f>/dB<p> at B<p>. As mentioned previously, B<Jt> is stored
column-first by CHOLMOD, which can be interpreted as storing B<J> row-first by the user-defined
callback routine
=item * The C<cookie> is the user-defined arbitrary data passed into C<dogleg_optimize>.
=back
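As an illustration, here is a hypothetical callback for the toy problem
x[i] = p[i] - 5 with C<Nstate = Nmeas>, so the Jacobian is diagonal and
C<NJnnz = Nstate>. This is a sketch, not canonical usage: it assumes libdogleg
hands the callback a pre-allocated B<Jt> with C<int> indices (CHOLMOD's
C<CHOLMOD_INT>); verify against C<dogleg.h> and your CHOLMOD configuration.

  static void callback(const double* p, double* x,
                       cholmod_sparse* Jt, void* cookie)
  {
      int*    Jp = (int*)   Jt->p; // column starts; Nmeas+1 entries
      int*    Ji = (int*)   Jt->i; // row indices (state-variable indices)
      double* Jx = (double*)Jt->x; // the non-zero values

      int Nmeas = (int)Jt->ncol;   // Jt has Nstate rows, Nmeas columns
      int inz   = 0;
      for(int j = 0; j < Nmeas; j++)
      {
          x[j] = p[j] - 5.0;       // the measurement: f(p)

          Jp[j]   = inz;           // column j of Jt == row j of J
          Ji[inz] = j;             // d x[j] / d p[j] ...
          Jx[inz] = 1.0;           // ... = 1
          inz++;
      }
      Jp[Nmeas] = inz;             // close the last column
  }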
=head3 dogleg_callback_dense_t
The main user callback that specifies the dense optimization problem has type
  typedef void (dogleg_callback_dense_t)(const double* p,
                                         double* x,
                                         double* J,
                                         void* cookie);
=over
=item * B<p> is the current state vector
=item * B<x> is the resulting I<f>(B<p>)
=item * B<J> is dB<f>/dB<p> at B<p>. B<J> is stored row-first, with all the derivatives for the
first measurement, then all the derivatives for the second measurement and so on.
=item * The C<cookie> is the user-defined arbitrary data passed into C<dogleg_optimize>.
=back
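As an illustration, here is a hypothetical dense callback that fits a line
y = p[0]*t + p[1] to data points passed in through the C<cookie>. The
C<data_t> type is made up for this example:

  typedef struct
  {
      int           N; // number of data points
      const double* t; // abscissas
      const double* y; // ordinates
  } data_t;

  static void callback_dense(const double* p, double* x,
                             double* J, void* cookie)
  {
      const data_t* d = (const data_t*)cookie;
      for(int i = 0; i < d->N; i++)
      {
          x[i] = p[0]*d->t[i] + p[1] - d->y[i]; // the residual
          J[2*i + 0] = d->t[i];                 // d x[i] / d p[0]
          J[2*i + 1] = 1.0;                     // d x[i] / d p[1]
      }
  }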
=head3 dogleg_solverContext_t
This is the solver context that can be retrieved through the C<returnContext>
parameter of the C<dogleg_optimize> call. This structure contains I<all> the
internal state used by the solver. If requested, the user is responsible for
calling C<dogleg_freeContext> when done. This structure is defined as:
  typedef struct
  {
    cholmod_common common;

    union
    {
      dogleg_callback_t*       f;
      dogleg_callback_dense_t* f_dense;
    };
    void* cookie;

    // Between steps, beforeStep contains the operating point of the last step.
    // afterStep is ONLY used while making the step. Externally, use beforeStep
    // unless you really know what you're doing.
    dogleg_operatingPoint_t* beforeStep;
    dogleg_operatingPoint_t* afterStep;

    // The result of the last JtJ factorization performed. Note that JtJ is not
    // necessarily factorized at every step, so this is NOT guaranteed to contain
    // the factorization of the most recent JtJ.
    union
    {
      cholmod_factor* factorization;

      // This is a factorization of JtJ, stored as a packed symmetric matrix
      // returned by dpptrf('L', ...)
      double*         factorization_dense;
    };

    // Have I ever seen a singular JtJ? If so, I add this constant to the diagonal
    // from that point on. This is a simple and fast way to deal with
    // singularities. This constant starts at 0, and is increased every time a
    // singular JtJ is encountered. This is suboptimal but works for me for now.
    double lambda;

    // Are we using sparse math (CHOLMOD)?
    int is_sparse;

    int Nstate, Nmeasurements;
  } dogleg_solverContext_t;
Some of the members are copies of the data passed into C<dogleg_optimize>; some
others are internal state. Of potential interest are
=over
=item * C<common> is a cholmod_common structure used by all CHOLMOD calls. This
can be used for any extra CHOLMOD work the user may want to do
=item * C<beforeStep> contains the operating point of the optimum solution. The
user can analyze this data without the need to re-call the callback routine.
=back
=head3 dogleg_operatingPoint_t
An operating point of the solver. This is a part of C<dogleg_solverContext_t>.
Various variables describing the operating point such as B<p>, B<J>, B<x>,
B<norm2(x)> and B<Jt x> are available. All of the just-mentioned variables are
computed at every step and are thus always valid.
  // an operating point of the solver
  typedef struct
  {
    double* p;
    double* x;
    double  norm2_x;

    union
    {
      cholmod_sparse* Jt;
      double*         J_dense; // row-first: grad0, grad1, grad2, ...
    };
    double* Jt_x;

    // The cached update vectors. It's useful to cache these so that when a step
    // is rejected, we can reuse them when we retry.
    double* updateCauchy;
    union
    {
      cholmod_dense* updateGN_cholmoddense;
      double*        updateGN_dense;
    };
    double updateCauchy_lensq, updateGN_lensq; // update vector lengths

    // Whether the current update vectors are correct or not
    int updateCauchy_valid, updateGN_valid;
    int didStepToEdgeOfTrustRegion;
  } dogleg_operatingPoint_t;
=head2 Parameters
The optimization is controlled by several parameters. These can be set globally
for I<all> users of C<libdogleg> in a process with the C<dogleg_set...()>
functions; those global values are used by C<dogleg_optimize> and
C<dogleg_optimize_dense>. Alternatively, the parameters can be specified
independently for each invocation by passing a C<parameters> argument to
C<dogleg_optimize2> or C<dogleg_optimize_dense2>. The latter is recommended,
since it keeps multiple users of libdogleg in a single application from
conflicting.
It is not required to set any of these, but it's highly recommended to set the
initial trust-region size and the termination thresholds to match the problem
being solved. Furthermore, it's highly recommended for the problem being solved
to be scaled so that every state variable affects the objective norm2( B<x> )
equally. This makes this method's concept of "trust region" much more
well-defined and makes the termination criteria work correctly.
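For example, to use per-invocation parameters (a sketch: the
C<dogleg_getDefaultParameters> call and the exact field names are assumptions
to verify against your C<dogleg.h>):

  dogleg_parameters2_t parameters;
  dogleg_getDefaultParameters(&parameters); // start from the defaults

  parameters.trustregion0   = 1e3; // initial trust-region size
  parameters.max_iterations = 50;

  double norm2_x = dogleg_optimize2(p, Nstate, Nmeas, NJnnz,
                                    &callback, NULL,
                                    &parameters, NULL);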
=head3 dogleg_setMaxIterations
To set the maximum number of solver iterations, call
  void dogleg_setMaxIterations(int n);
=head3 dogleg_setDebug
To turn on diagnostic output, call
  void dogleg_setDebug(int debug);
with a non-zero value for C<debug>. Two separate diagnostic streams are
available: a verbose human-oriented stream, and a
L<vnlog|http://github.com/dkogan/vnlog>.
By default, diagnostic output is disabled. The C<debug> argument is a bit-mapped
integer:
  if(debug == 0):                   no diagnostic output
  if(debug &  DOGLEG_DEBUG_VNLOG):  output vnlog diagnostics to stdout
  if(debug & ~DOGLEG_DEBUG_VNLOG):  output human-oriented diagnostics to stderr
C<DOGLEG_DEBUG_VNLOG> has a very high value, so if human diagnostics are
desired, the recommended call is:
  dogleg_setDebug(1);
=head3 dogleg_setInitialTrustregion
The optimization method keeps track of a trust region size. Here, the trust region is a ball in
R^Nstate. When the method takes a step B<p> -> B<p + delta_p>, it makes sure that
S<sqrt( norm2( B<delta_p> ) ) < trust region size>.
The initial value of the trust region size can be set with
  void dogleg_setInitialTrustregion(double t);
The dogleg algorithm is efficient when recomputing a rejected step for a smaller trust region, so
set the initial trust-region size to a value larger than a reasonable estimate; the method will
quickly shrink the trust region to the correct size.
=head3 dogleg_setThresholds
The routine exits when the maximum number of iterations is exceeded, or a termination threshold is
hit, whichever happens first. The termination thresholds are all designed to trigger when very slow
progress is being made. If all went well, this slow progress is due to us finding the optimum. There
are 3 termination thresholds:
=over
=item * The function being minimized is E = norm2( B<x> ) where B<x> = I<f>(B<p>).
dE/dB<p> = 2*B<Jt>*B<x> where B<Jt> is transpose(dB<x>/dB<p>).
  if( for every i fabs(Jt_x[i]) < JT_X_THRESHOLD )
  { we are done; }
=item * The method takes discrete steps: B<p> -> B<p + delta_p>
  if( for every i fabs(delta_p[i]) < UPDATE_THRESHOLD)
  { we are done; }
=item * The method dynamically controls the trust region.
  if(trustregion < TRUSTREGION_THRESHOLD)
  { we are done; }
=back
To set these thresholds, call
  void dogleg_setThresholds(double Jt_x, double update, double trustregion);
To leave a particular threshold alone, specify a negative value.
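For example, to tighten only the gradient threshold:

  dogleg_setThresholds(1e-10, // Jt_x
                       -1.0,  // update:      leave unchanged
                       -1.0); // trustregion: leave unchanged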
=head3 dogleg_setTrustregionUpdateParameters
This function sets the parameters that control when and how the trust region is updated. The default
values should work well in most cases, and shouldn't need to be tweaked.
Declaration looks like
  void dogleg_setTrustregionUpdateParameters(double downFactor, double downThreshold,
                                             double upFactor,   double upThreshold);
To see what the parameters do, look at C<evaluateStep_adjustTrustRegion> in the source. Again, these
should just work as is.
=head1 BUGS
The current implementation doesn't handle a singular B<JtJ> gracefully (B<JtJ> =
B<Jt> * B<J>). Analytically, B<JtJ> is at worst positive semi-definite (has 0 eigenvalues). If a
singular B<JtJ> is ever encountered, from that point on, B<JtJ> + lambda*B<I> is inverted instead
for some positive constant lambda. This makes the matrix strictly positive definite, but is
sloppy. At least I should vary lambda. In my current applications, a singular B<JtJ> only occurs if
at a particular operating point the vector B<x> has no dependence at all on some elements of
B<p>. In the general case other causes could exist, though.
There's an inefficiency in that the callback always returns B<x> and B<J>. When I evaluate and
reject a step, I do not end up using B<J> at all. Depending on the callback function, it may be
better to ask for B<x> and then, if the step is accepted, to ask for B<J>.
=head1 AUTHOR
Dima Kogan, C<< <dima@secretsauce.net> >>
=head1 LICENSE AND COPYRIGHT
Copyright 2011 Oblong Industries
2017 Dima Kogan <dima@secretsauce.net>
This program is free software: you can redistribute it and/or modify it under the terms of the GNU
Lesser General Public License as published by the Free Software Foundation, either version 3 of the
License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without
even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.
The full text of the license is available at L<http://www.gnu.org/licenses>