File: gradient_solver.rst

.. default-domain:: cpp

.. highlight:: c++

.. cpp:namespace:: ceres

.. _chapter-gradient_problem_solver:

==================================
General Unconstrained Minimization
==================================

Modeling
========

:class:`FirstOrderFunction`
---------------------------

.. class:: FirstOrderFunction

  Instances of :class:`FirstOrderFunction` implement the evaluation of
  a function and its gradient.

  .. code-block:: c++

   class FirstOrderFunction {
     public:
      virtual ~FirstOrderFunction() {}
      virtual bool Evaluate(const double* const parameters,
                            double* cost,
                            double* gradient) const = 0;
      virtual int NumParameters() const = 0;
   };

.. function:: bool FirstOrderFunction::Evaluate(const double* const parameters, double* cost, double* gradient) const

   Evaluate the cost/value of the function. If ``gradient`` is not
   ``nullptr`` then evaluate the gradient too. If evaluation is
   successful return ``true``, else return ``false``.

   ``cost`` is guaranteed to never be ``nullptr``; ``gradient`` can be
   ``nullptr``.

.. function:: int FirstOrderFunction::NumParameters() const

   Number of parameters in the domain of the function.
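
As an illustration, here is a sketch of a :class:`FirstOrderFunction`
for the two dimensional Rosenbrock function
:math:`f(x, y) = (1 - x)^2 + 100 (y - x^2)^2`. The class name
``Rosenbrock`` is illustrative and not part of the library.

.. code-block:: c++

  class Rosenbrock final : public ceres::FirstOrderFunction {
   public:
    bool Evaluate(const double* const parameters,
                  double* cost,
                  double* gradient) const override {
      const double x = parameters[0];
      const double y = parameters[1];

      cost[0] = (1.0 - x) * (1.0 - x) + 100.0 * (y - x * x) * (y - x * x);
      if (gradient != nullptr) {
        // Analytic partial derivatives of f with respect to x and y.
        gradient[0] = -2.0 * (1.0 - x) - 400.0 * x * (y - x * x);
        gradient[1] = 200.0 * (y - x * x);
      }
      return true;
    }

    int NumParameters() const override { return 2; }
  };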


:class:`GradientProblem`
------------------------

.. NOTE::

   The :class:`LocalParameterization` interface and associated classes
   are deprecated. They will be removed in version 2.2.0. Please use the
   :class:`Manifold` based constructor instead.

.. class:: GradientProblem

.. code-block:: c++

  class GradientProblem {
   public:
    explicit GradientProblem(FirstOrderFunction* function);
    GradientProblem(FirstOrderFunction* function,
                    LocalParameterization* parameterization);
    GradientProblem(FirstOrderFunction* function,
                    Manifold* manifold);
    int NumParameters() const;
    int NumLocalParameters() const { return NumTangentParameters(); }
    int NumTangentParameters() const;
    bool Evaluate(const double* parameters, double* cost, double* gradient) const;
    bool Plus(const double* x, const double* delta, double* x_plus_delta) const;
  };

Instances of :class:`GradientProblem` represent general non-linear
optimization problems that must be solved using just the value of the
objective function and its gradient. Unlike the :class:`Problem`
class, which can only be used to model non-linear least squares
problems, instances of :class:`GradientProblem` are not restricted in
the form of the objective function.

Structurally :class:`GradientProblem` is a composition of a
:class:`FirstOrderFunction` and optionally a
:class:`LocalParameterization` or a :class:`Manifold`.

The :class:`FirstOrderFunction` is responsible for evaluating the cost
and gradient of the objective function.

The :class:`LocalParameterization`/:class:`Manifold` is responsible
for going back and forth between the ambient space and the local
tangent space. When a :class:`LocalParameterization` or a
:class:`Manifold` is not provided, then the tangent space is assumed
to coincide with the ambient Euclidean space that the gradient vector
lives in.

The constructor takes ownership of the :class:`FirstOrderFunction` and
:class:`LocalParameterization` or :class:`Manifold` objects passed to
it.
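
For example, a problem whose parameter vector is constrained to the
unit sphere in :math:`\mathbb{R}^3` might be constructed as follows.
This is a sketch; ``MyFunctionOnSphere`` is an illustrative user
defined :class:`FirstOrderFunction` over 3-vectors, and the manifold
used is :class:`SphereManifold`.

.. code-block:: c++

  // Both objects are owned by the GradientProblem. Steps are taken in the
  // 2-dimensional tangent space of the unit sphere in R^3.
  ceres::GradientProblem problem(new MyFunctionOnSphere,
                                 new ceres::SphereManifold<3>);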


.. function:: void Solve(const GradientProblemSolver::Options& options, const GradientProblem& problem, double* parameters, GradientProblemSolver::Summary* summary)

   Solve the given :class:`GradientProblem` using the values in
   ``parameters`` as the initial guess of the solution.
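
   For example, a minimal call sequence might look as follows, using
   the illustrative ``Rosenbrock`` function sketched earlier:

   .. code-block:: c++

      double parameters[2] = {-1.2, 1.0};  // Initial guess.

      // The GradientProblem takes ownership of the FirstOrderFunction.
      ceres::GradientProblem problem(new Rosenbrock);

      ceres::GradientProblemSolver::Options options;
      options.minimizer_progress_to_stdout = true;

      ceres::GradientProblemSolver::Summary summary;
      ceres::Solve(options, problem, parameters, &summary);

      std::cout << summary.FullReport() << "\n";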


Solving
=======

:class:`GradientProblemSolver::Options`
---------------------------------------

.. class:: GradientProblemSolver::Options

   :class:`GradientProblemSolver::Options` controls the overall
   behavior of the solver. We list the various settings and their
   default values below.

.. function:: bool GradientProblemSolver::Options::IsValid(string* error) const

   Validate the values in the options struct and returns true on
   success. If there is a problem, the method returns false with
   ``error`` containing a textual description of the cause.
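
   A brief sketch of validating a configuration:

   .. code-block:: c++

      ceres::GradientProblemSolver::Options options;
      options.max_num_iterations = 200;
      options.line_search_direction_type = ceres::LBFGS;

      std::string error;
      if (!options.IsValid(&error)) {
        // The configuration is inconsistent; `error` describes why.
      }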

.. member:: LineSearchDirectionType GradientProblemSolver::Options::line_search_direction_type

   Default: ``LBFGS``

   Choices are ``STEEPEST_DESCENT``, ``NONLINEAR_CONJUGATE_GRADIENT``,
   ``BFGS`` and ``LBFGS``.

.. member:: LineSearchType GradientProblemSolver::Options::line_search_type

   Default: ``WOLFE``

   Choices are ``ARMIJO`` and ``WOLFE`` (strong Wolfe conditions).
   Note that in order for the assumptions underlying the ``BFGS`` and
   ``LBFGS`` line search direction algorithms to be guaranteed to be
   satisfied, the ``WOLFE`` line search should be used.

.. member:: NonlinearConjugateGradientType GradientProblemSolver::Options::nonlinear_conjugate_gradient_type

   Default: ``FLETCHER_REEVES``

   Choices are ``FLETCHER_REEVES``, ``POLAK_RIBIERE`` and
   ``HESTENES_STIEFEL``.

.. member:: int GradientProblemSolver::Options::max_lbfgs_rank

   Default: 20

   The L-BFGS Hessian approximation is a low rank approximation to the
   inverse of the Hessian matrix. The rank of the approximation
   determines (linearly) the space and time complexity of using the
   approximation. The higher the rank, the better the quality of the
   approximation. The increase in quality is, however, bounded for a
   number of reasons.

     1. The method only uses secant information and not actual
        derivatives.

     2. The Hessian approximation is constrained to be positive
        definite.

   So increasing this rank to a large number will increase the time and
   space complexity without a corresponding increase in solution
   quality. There are no hard and fast rules for choosing the maximum
   rank. The best choice usually requires some problem specific
   experimentation.

.. member:: bool GradientProblemSolver::Options::use_approximate_eigenvalue_bfgs_scaling

   Default: ``false``

   As part of the ``BFGS`` update step / ``LBFGS`` right-multiply
   step, the initial inverse Hessian approximation is taken to be the
   Identity.  However, [Oren]_ showed that using instead :math:`I *
   \gamma`, where :math:`\gamma` is a scalar chosen to approximate an
   eigenvalue of the true inverse Hessian can result in improved
   convergence in a wide variety of cases.  Setting
   ``use_approximate_eigenvalue_bfgs_scaling`` to true enables this
   scaling in ``BFGS`` (before first iteration) and ``LBFGS`` (at each
   iteration).

   Precisely, approximate eigenvalue scaling equates to

   .. math:: \gamma = \frac{y_k' s_k}{y_k' y_k}

   With:

   .. math:: y_k = \nabla f_{k+1} - \nabla f_k
   .. math:: s_k = x_{k+1} - x_k

   Where :math:`f()` is the line search objective and :math:`x` the
   vector of parameter values [NocedalWright]_.

   It is important to note that approximate eigenvalue scaling does
   **not** *always* improve convergence, and that it can in fact
   *significantly* degrade performance for certain classes of problem,
   which is why it is disabled by default.  In particular it can
   degrade performance when the sensitivity of the problem to different
   parameters varies significantly, as in this case a single scalar
   factor fails to capture this variation and detrimentally downscales
   parts of the Jacobian approximation which correspond to
   low-sensitivity parameters. It can also reduce the robustness of the
   solution to errors in the Jacobians.

.. member:: LineSearchInterpolationType GradientProblemSolver::Options::line_search_interpolation_type

   Default: ``CUBIC``

   Degree of the polynomial used to approximate the objective
   function. Valid values are ``BISECTION``, ``QUADRATIC`` and
   ``CUBIC``.

.. member:: double GradientProblemSolver::Options::min_line_search_step_size

   The line search terminates if:

   .. math:: \|\Delta x_k\|_\infty < \text{min_line_search_step_size}

   where :math:`\|\cdot\|_\infty` refers to the max norm, and
   :math:`\Delta x_k` is the step change in the parameter values at
   the :math:`k`-th iteration.

.. member:: double GradientProblemSolver::Options::line_search_sufficient_function_decrease

   Default: ``1e-4``

   Solving the line search problem exactly is computationally
   prohibitive. Fortunately, line search based optimization algorithms
   can still guarantee convergence if instead of an exact solution,
   the line search algorithm returns a solution which decreases the
   value of the objective function sufficiently. More precisely, we
   are looking for a step size s.t.

   .. math:: f(\text{step_size}) \le f(0) + \text{sufficient_decrease} * [f'(0) * \text{step_size}]

   This condition is known as the Armijo condition.

.. member:: double GradientProblemSolver::Options::max_line_search_step_contraction

   Default: ``1e-3``

   In each iteration of the line search,

   .. math:: \text{new_step_size} \geq \text{max_line_search_step_contraction} * \text{step_size}

   Note that by definition, for contraction:

   .. math:: 0 < \text{max_step_contraction} < \text{min_step_contraction} < 1

.. member:: double GradientProblemSolver::Options::min_line_search_step_contraction

   Default: ``0.6``

   In each iteration of the line search,

   .. math:: \text{new_step_size} \leq \text{min_line_search_step_contraction} * \text{step_size}

   Note that by definition, for contraction:

   .. math:: 0 < \text{max_step_contraction} < \text{min_step_contraction} < 1

.. member:: int GradientProblemSolver::Options::max_num_line_search_step_size_iterations

   Default: ``20``

   Maximum number of trial step size iterations during each line
   search. If a step size satisfying the search conditions cannot be
   found within this number of trials, the line search will stop.

   As this is an 'artificial' constraint (one imposed by the user, not
   the underlying math), if ``WOLFE`` line search is being used, *and*
   points satisfying the Armijo sufficient (function) decrease
   condition have been found during the current search (in :math:`\leq`
   ``max_num_line_search_step_size_iterations``), then the step size
   with the lowest function value which satisfies the Armijo condition
   will be returned as the new valid step, even though it does *not*
   satisfy the strong Wolfe conditions.  This behaviour protects
   against early termination of the optimizer at a sub-optimal point.

.. member:: int GradientProblemSolver::Options::max_num_line_search_direction_restarts

   Default: ``5``

   Maximum number of restarts of the line search direction algorithm
   before terminating the optimization. Restarts of the line search
   direction algorithm occur when the current algorithm fails to
   produce a new descent direction. This typically indicates a
   numerical failure, or a breakdown in the validity of the
   approximations used.

.. member:: double GradientProblemSolver::Options::line_search_sufficient_curvature_decrease

   Default: ``0.9``

   The strong Wolfe conditions consist of the Armijo sufficient
   decrease condition, and an additional requirement that the
   step size be chosen s.t. the *magnitude* ('strong' Wolfe
   conditions) of the gradient along the search direction
   decreases sufficiently. Precisely, this second condition
   is that we seek a step size s.t.

   .. math:: \|f'(\text{step_size})\| \leq \text{sufficient_curvature_decrease} * \|f'(0)\|

   Where :math:`f()` is the line search objective and :math:`f'()` is the derivative
   of :math:`f` with respect to the step size: :math:`\frac{d f}{d~\text{step size}}`.

.. member:: double GradientProblemSolver::Options::max_line_search_step_expansion

   Default: ``10.0``

   During the bracketing phase of a Wolfe line search, the step size
   is increased until either a point satisfying the Wolfe conditions
   is found, or an upper bound for a bracket containing a point
   satisfying the conditions is found.  Precisely, at each iteration
   of the expansion:

   .. math:: \text{new_step_size} \leq \text{max_step_expansion} * \text{step_size}

   By definition for expansion

   .. math:: \text{max_step_expansion} > 1.0

.. member:: int GradientProblemSolver::Options::max_num_iterations

   Default: ``50``

   Maximum number of iterations for which the solver should run.

.. member:: double GradientProblemSolver::Options::max_solver_time_in_seconds

   Default: ``1e6``

   Maximum amount of time for which the solver should run.

.. member:: double GradientProblemSolver::Options::function_tolerance

   Default: ``1e-6``

   Solver terminates if

   .. math:: \frac{|\Delta \text{cost}|}{\text{cost}} \leq \text{function_tolerance}

   where, :math:`\Delta \text{cost}` is the change in objective
   function value (up or down) in the current iteration of the line search.

.. member:: double GradientProblemSolver::Options::gradient_tolerance

   Default: ``1e-10``

   Solver terminates if

   .. math:: \|x - \Pi \boxplus(x, -g(x))\|_\infty \leq \text{gradient_tolerance}

   where :math:`\|\cdot\|_\infty` refers to the max norm, :math:`\Pi`
   is projection onto the bounds constraints and :math:`\boxplus` is
   Plus operation for the manifold associated with the parameter
   vector.

.. member:: double GradientProblemSolver::Options::parameter_tolerance

   Default: ``1e-8``

   Solver terminates if

   .. math:: \|\Delta x\| \leq (\|x\| + \text{parameter_tolerance}) * \text{parameter_tolerance}

   where :math:`\Delta x` is the step computed by the linear solver in
   the current iteration of the line search.

.. member:: LoggingType GradientProblemSolver::Options::logging_type

   Default: ``PER_MINIMIZER_ITERATION``

.. member:: bool GradientProblemSolver::Options::minimizer_progress_to_stdout

   Default: ``false``

   By default the :class:`Minimizer` progress is logged to ``STDERR``
   depending on the ``vlog`` level. If this flag is set to true, and
   :member:`GradientProblemSolver::Options::logging_type` is not
   ``SILENT``, the logging output is sent to ``STDOUT``.

   The progress display looks like

   .. code-block:: bash

      0: f: 2.317806e+05 d: 0.00e+00 g: 3.19e-01 h: 0.00e+00 s: 0.00e+00 e:  0 it: 2.98e-02 tt: 8.50e-02
      1: f: 2.312019e+05 d: 5.79e+02 g: 3.18e-01 h: 2.41e+01 s: 1.00e+00 e:  1 it: 4.54e-02 tt: 1.31e-01
      2: f: 2.300462e+05 d: 1.16e+03 g: 3.17e-01 h: 4.90e+01 s: 2.54e-03 e:  1 it: 4.96e-02 tt: 1.81e-01

   Here

   #. ``f`` is the value of the objective function.
   #. ``d`` is the change in the value of the objective function if
      the step computed in this iteration is accepted.
   #. ``g`` is the max norm of the gradient.
   #. ``h`` is the change in the parameter vector.
   #. ``s`` is the optimal step length computed by the line search.
   #. ``it`` is the time taken by the current iteration.
   #. ``tt`` is the total time taken by the minimizer.

.. member:: vector<IterationCallback*> GradientProblemSolver::Options::callbacks

   Callbacks that are executed at the end of each iteration of the
   :class:`Minimizer`. They are executed in the order that they are
   specified in this vector. By default, parameter blocks are updated
   only at the end of the optimization, i.e., when the
   :class:`Minimizer` terminates. This behavior is controlled by
   :member:`GradientProblemSolver::Options::update_state_every_iteration`. If
   the user wishes to have access to the updated parameter blocks when
   their callbacks are executed, then set
   :member:`GradientProblemSolver::Options::update_state_every_iteration`
   to true.

   The solver does NOT take ownership of these pointers.
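
   For example, a hypothetical callback (``LoggingCallback`` is
   illustrative, not part of Ceres) that prints the cost at the end of
   every iteration might look like:

   .. code-block:: c++

      class LoggingCallback : public ceres::IterationCallback {
       public:
        ceres::CallbackReturnType operator()(
            const ceres::IterationSummary& summary) override {
          std::cout << "iteration: " << summary.iteration
                    << " cost: " << summary.cost << "\n";
          return ceres::SOLVER_CONTINUE;
        }
      };

      LoggingCallback callback;
      ceres::GradientProblemSolver::Options options;
      options.callbacks.push_back(&callback);  // The solver does not own it.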


.. member:: bool GradientProblemSolver::Options::update_state_every_iteration

   Default: ``false``

   Normally the parameter vector is only updated when the solver
   terminates. Setting this to true updates it every iteration. This
   setting is useful when building an interactive application using
   Ceres and using an :class:`IterationCallback`.

:class:`GradientProblemSolver::Summary`
---------------------------------------

.. class:: GradientProblemSolver::Summary

   Summary of the various stages of the solver after termination.

.. function:: string GradientProblemSolver::Summary::BriefReport() const

   A brief one line description of the state of the solver after
   termination.

.. function:: string GradientProblemSolver::Summary::FullReport() const

   A full multiline description of the state of the solver after
   termination.

.. function:: bool GradientProblemSolver::Summary::IsSolutionUsable() const

   Whether the solution returned by the optimization algorithm can be
   relied on to be numerically sane. This will be the case if
   `GradientProblemSolver::Summary::termination_type` is set to
   `CONVERGENCE`, `USER_SUCCESS` or `NO_CONVERGENCE`, i.e., either the
   solver converged by meeting one of the convergence tolerances, the
   user indicated that it had converged, or it ran to the maximum
   number of iterations or time.
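
   A brief usage sketch, assuming ``options``, ``problem`` and
   ``parameters`` are set up as in the earlier example:

   .. code-block:: c++

      ceres::GradientProblemSolver::Summary summary;
      ceres::Solve(options, problem, parameters, &summary);
      if (!summary.IsSolutionUsable()) {
        // The values in `parameters` may not be numerically sane.
      }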

.. member:: TerminationType GradientProblemSolver::Summary::termination_type

   The cause of the minimizer terminating.

.. member:: string GradientProblemSolver::Summary::message

   Reason why the solver terminated.

.. member:: double GradientProblemSolver::Summary::initial_cost

   Cost of the problem (value of the objective function) before the
   optimization.

.. member:: double GradientProblemSolver::Summary::final_cost

   Cost of the problem (value of the objective function) after the
   optimization.

.. member:: vector<IterationSummary> GradientProblemSolver::Summary::iterations

   :class:`IterationSummary` for each minimizer iteration in order.

.. member:: int GradientProblemSolver::Summary::num_cost_evaluations

   Number of times the cost (and not the gradient) was evaluated.

.. member:: int GradientProblemSolver::Summary::num_gradient_evaluations

   Number of times the gradient (and the cost) were evaluated.

.. member:: double GradientProblemSolver::Summary::total_time_in_seconds

   Time (in seconds) spent in the solver.

.. member:: double GradientProblemSolver::Summary::cost_evaluation_time_in_seconds

   Time (in seconds) spent evaluating the cost.

.. member:: double GradientProblemSolver::Summary::gradient_evaluation_time_in_seconds

   Time (in seconds) spent evaluating the gradient vector.

.. member:: int GradientProblemSolver::Summary::num_parameters

   Number of parameters in the problem.

.. member:: int GradientProblemSolver::Summary::num_local_parameters

   Dimension of the tangent space of the problem. This is different
   from :member:`GradientProblemSolver::Summary::num_parameters` if a
   :class:`LocalParameterization`/:class:`Manifold` object is used.

   .. NOTE::

      ``num_local_parameters`` is deprecated and will be removed in
      Ceres Solver version 2.2.0. Please use ``num_tangent_parameters``
      instead.

.. member:: int GradientProblemSolver::Summary::num_tangent_parameters

   Dimension of the tangent space of the problem. This is different
   from :member:`GradientProblemSolver::Summary::num_parameters` if a
   :class:`LocalParameterization`/:class:`Manifold` object is used.

.. member:: LineSearchDirectionType GradientProblemSolver::Summary::line_search_direction_type

   Type of line search direction used.

.. member:: LineSearchType GradientProblemSolver::Summary::line_search_type

   Type of the line search algorithm used.

.. member:: LineSearchInterpolationType GradientProblemSolver::Summary::line_search_interpolation_type

   When performing line search, the degree of the polynomial used to
   approximate the objective function.

.. member:: NonlinearConjugateGradientType GradientProblemSolver::Summary::nonlinear_conjugate_gradient_type

   If the line search direction is `NONLINEAR_CONJUGATE_GRADIENT`,
   then this indicates the particular variant of non-linear conjugate
   gradient used.

.. member:: int GradientProblemSolver::Summary::max_lbfgs_rank

   If the type of the line search direction is `LBFGS`, then this
   indicates the rank of the Hessian approximation.