.. default-domain:: cpp
.. highlight:: c++
.. cpp:namespace:: ceres
.. _chapter-gradient_problem_solver:
==================================
General Unconstrained Minimization
==================================
Modeling
========
:class:`FirstOrderFunction`
---------------------------
.. class:: FirstOrderFunction
Instances of :class:`FirstOrderFunction` implement the evaluation of
a function and its gradient.
.. code-block:: c++

  class FirstOrderFunction {
   public:
    virtual ~FirstOrderFunction() {}

    virtual bool Evaluate(const double* const parameters,
                          double* cost,
                          double* gradient) const = 0;

    virtual int NumParameters() const = 0;
  };
.. function:: bool FirstOrderFunction::Evaluate(const double* const parameters, double* cost, double* gradient) const
Evaluate the cost/value of the function. If ``gradient`` is not
``nullptr`` then evaluate the gradient too. If the evaluation is
successful return ``true``, else return ``false``.
``cost`` is guaranteed to never be ``nullptr``; ``gradient`` may be ``nullptr``.
.. function:: int FirstOrderFunction::NumParameters() const
Number of parameters in the domain of the function.
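For instance, here is a minimal sketch of a :class:`FirstOrderFunction`
for the two dimensional Rosenbrock function
:math:`f(x, y) = (1 - x)^2 + 100 (y - x^2)^2`; the class name is
illustrative.

.. code-block:: c++

  // Sketch of a FirstOrderFunction for the Rosenbrock function
  //   f(x, y) = (1 - x)^2 + 100 (y - x^2)^2.
  class Rosenbrock final : public ceres::FirstOrderFunction {
   public:
    bool Evaluate(const double* const parameters,
                  double* cost,
                  double* gradient) const override {
      const double x = parameters[0];
      const double y = parameters[1];
      cost[0] = (1.0 - x) * (1.0 - x) + 100.0 * (y - x * x) * (y - x * x);
      // The gradient is only requested when the pointer is non-null.
      if (gradient != nullptr) {
        gradient[0] = -2.0 * (1.0 - x) - 400.0 * x * (y - x * x);
        gradient[1] = 200.0 * (y - x * x);
      }
      return true;
    }

    int NumParameters() const override { return 2; }
  };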
:class:`GradientProblem`
------------------------
.. NOTE::
The :class:`LocalParameterization` interface and associated classes
are deprecated. They will be removed in version 2.2.0. Please use
the :class:`Manifold` based constructor instead.
.. class:: GradientProblem
.. code-block:: c++

  class GradientProblem {
   public:
    explicit GradientProblem(FirstOrderFunction* function);
    GradientProblem(FirstOrderFunction* function,
                    LocalParameterization* parameterization);
    GradientProblem(FirstOrderFunction* function, Manifold* manifold);

    int NumParameters() const;
    int NumLocalParameters() const { return NumTangentParameters(); }
    int NumTangentParameters() const;

    bool Evaluate(const double* parameters,
                  double* cost,
                  double* gradient) const;
    bool Plus(const double* x,
              const double* delta,
              double* x_plus_delta) const;
  };
Instances of :class:`GradientProblem` represent general non-linear
optimization problems that must be solved using just the value of the
objective function and its gradient. Unlike the :class:`Problem`
class, which can only be used to model non-linear least squares
problems, instances of :class:`GradientProblem` are not restricted in the
form of the objective function.
Structurally :class:`GradientProblem` is a composition of a
:class:`FirstOrderFunction` and optionally a
:class:`LocalParameterization` or a :class:`Manifold`.
The :class:`FirstOrderFunction` is responsible for evaluating the cost
and gradient of the objective function.
The :class:`LocalParameterization`/:class:`Manifold` is responsible
for going back and forth between the ambient space and the local
tangent space. When a :class:`LocalParameterization` or a
:class:`Manifold` is not provided, then the tangent space is assumed
to coincide with the ambient Euclidean space that the gradient vector
lives in.
The constructor takes ownership of the :class:`FirstOrderFunction` and
:class:`LocalParameterization` or :class:`Manifold` objects passed to
it.
.. function:: void Solve(const GradientProblemSolver::Options& options, const GradientProblem& problem, double* parameters, GradientProblemSolver::Summary* summary)
Solve the given :class:`GradientProblem` using the values in
``parameters`` as the initial guess of the solution.
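Putting these pieces together, a typical use looks roughly like the
following sketch, where ``Rosenbrock`` is the illustrative
:class:`FirstOrderFunction` shown earlier.

.. code-block:: c++

  // Minimize the Rosenbrock function starting from (-1.2, 1.0).
  double parameters[2] = {-1.2, 1.0};

  // The problem takes ownership of the FirstOrderFunction.
  ceres::GradientProblem problem(new Rosenbrock());

  ceres::GradientProblemSolver::Options options;
  options.minimizer_progress_to_stdout = true;

  ceres::GradientProblemSolver::Summary summary;
  ceres::Solve(options, problem, parameters, &summary);

  std::cout << summary.FullReport() << "\n";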
Solving
=======
:class:`GradientProblemSolver::Options`
---------------------------------------
.. class:: GradientProblemSolver::Options
:class:`GradientProblemSolver::Options` controls the overall
behavior of the solver. We list the various settings and their
default values below.
.. function:: bool GradientProblemSolver::Options::IsValid(string* error) const
Validate the values in the options struct and returns true on
success. If there is a problem, the method returns false with
``error`` containing a textual description of the cause.
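For example, a short sketch of configuring and validating the options
before solving; the particular settings chosen here are arbitrary.

.. code-block:: c++

  ceres::GradientProblemSolver::Options options;
  options.line_search_direction_type = ceres::LBFGS;
  options.max_num_iterations = 100;

  // Check the options before handing them to Solve.
  std::string error;
  if (!options.IsValid(&error)) {
    std::cerr << "Invalid solver options: " << error << "\n";
  }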
.. member:: LineSearchDirectionType GradientProblemSolver::Options::line_search_direction_type
Default: ``LBFGS``
Choices are ``STEEPEST_DESCENT``, ``NONLINEAR_CONJUGATE_GRADIENT``,
``BFGS`` and ``LBFGS``.
.. member:: LineSearchType GradientProblemSolver::Options::line_search_type
Default: ``WOLFE``
Choices are ``ARMIJO`` and ``WOLFE`` (strong Wolfe conditions).
Note that in order for the assumptions underlying the ``BFGS`` and
``LBFGS`` line search direction algorithms to be guaranteed to be
satisfied, the ``WOLFE`` line search should be used.
.. member:: NonlinearConjugateGradientType GradientProblemSolver::Options::nonlinear_conjugate_gradient_type
Default: ``FLETCHER_REEVES``
Choices are ``FLETCHER_REEVES``, ``POLAK_RIBIERE`` and
``HESTENES_STIEFEL``.
.. member:: int GradientProblemSolver::Options::max_lbfgs_rank
Default: 20
The L-BFGS Hessian approximation is a low rank approximation to the
inverse of the Hessian matrix. The rank of the approximation
determines (linearly) the space and time complexity of using the
approximation. The higher the rank, the better the quality of the
approximation. The increase in quality is however bounded for a
number of reasons.
1. The method only uses secant information and not actual
derivatives.
2. The Hessian approximation is constrained to be positive
definite.
So increasing this rank to a large number increases the time and space
complexity without a corresponding increase in solution
quality. There are no hard and fast rules for choosing the maximum
rank. The best choice usually requires some problem specific
experimentation.
.. member:: bool GradientProblemSolver::Options::use_approximate_eigenvalue_bfgs_scaling
Default: ``false``
As part of the ``BFGS`` update step / ``LBFGS`` right-multiply
step, the initial inverse Hessian approximation is taken to be the
Identity. However, [Oren]_ showed that using instead :math:`I *
\gamma`, where :math:`\gamma` is a scalar chosen to approximate an
eigenvalue of the true inverse Hessian can result in improved
convergence in a wide variety of cases. Setting
``use_approximate_eigenvalue_bfgs_scaling`` to true enables this
scaling in ``BFGS`` (before first iteration) and ``LBFGS`` (at each
iteration).
Precisely, approximate eigenvalue scaling equates to
.. math:: \gamma = \frac{y_k' s_k}{y_k' y_k}
With:
.. math:: y_k = \nabla f_{k+1} - \nabla f_k
.. math:: s_k = x_{k+1} - x_k
Where :math:`f()` is the line search objective and :math:`x` the
vector of parameter values [NocedalWright]_.
It is important to note that approximate eigenvalue scaling does
**not** *always* improve convergence, and that it can in fact
*significantly* degrade performance for certain classes of problem,
which is why it is disabled by default. In particular it can
degrade performance when the sensitivity of the problem to different
parameters varies significantly, as in this case a single scalar
factor fails to capture this variation and detrimentally downscales
parts of the Jacobian approximation which correspond to
low-sensitivity parameters. It can also reduce the robustness of the
solution to errors in the Jacobians.
.. member:: LineSearchInterpolationType GradientProblemSolver::Options::line_search_interpolation_type
Default: ``CUBIC``
Degree of the polynomial used to approximate the objective
function. Valid values are ``BISECTION``, ``QUADRATIC`` and
``CUBIC``.
.. member:: double GradientProblemSolver::Options::min_line_search_step_size
The line search terminates if:
.. math:: \|\Delta x_k\|_\infty < \text{min_line_search_step_size}
where :math:`\|\cdot\|_\infty` refers to the max norm, and
:math:`\Delta x_k` is the step change in the parameter values at
the :math:`k`-th iteration.
.. member:: double GradientProblemSolver::Options::line_search_sufficient_function_decrease
Default: ``1e-4``
Solving the line search problem exactly is computationally
prohibitive. Fortunately, line search based optimization algorithms
can still guarantee convergence if instead of an exact solution,
the line search algorithm returns a solution which decreases the
value of the objective function sufficiently. More precisely, we
are looking for a step size s.t.
.. math:: f(\text{step_size}) \le f(0) + \text{sufficient_decrease} * [f'(0) * \text{step_size}]
This condition is known as the Armijo condition.
.. member:: double GradientProblemSolver::Options::max_line_search_step_contraction
Default: ``1e-3``
In each iteration of the line search,
.. math:: \text{new_step_size} \geq \text{max_line_search_step_contraction} * \text{step_size}
Note that by definition, for contraction:
.. math:: 0 < \text{max_step_contraction} < \text{min_step_contraction} < 1
.. member:: double GradientProblemSolver::Options::min_line_search_step_contraction
Default: ``0.6``
In each iteration of the line search,
.. math:: \text{new_step_size} \leq \text{min_line_search_step_contraction} * \text{step_size}
Note that by definition, for contraction:
.. math:: 0 < \text{max_step_contraction} < \text{min_step_contraction} < 1
.. member:: int GradientProblemSolver::Options::max_num_line_search_step_size_iterations
Default: ``20``
Maximum number of trial step size iterations during each line
search, if a step size satisfying the search conditions cannot be
found within this number of trials, the line search will stop.
As this is an 'artificial' constraint (one imposed by the user, not
the underlying math), if ``WOLFE`` line search is being used, *and*
points satisfying the Armijo sufficient (function) decrease
condition have been found during the current search (in :math:`\leq`
``max_num_line_search_step_size_iterations``), then the step size
with the lowest function value which satisfies the Armijo condition
will be returned as the new valid step, even though it does *not*
satisfy the strong Wolfe conditions. This behaviour protects
against early termination of the optimizer at a sub-optimal point.
.. member:: int GradientProblemSolver::Options::max_num_line_search_direction_restarts
Default: ``5``
Maximum number of restarts of the line search direction algorithm
before terminating the optimization. Restarts of the line search
direction algorithm occur when the current algorithm fails to
produce a new descent direction. This typically indicates a
numerical failure, or a breakdown in the validity of the
approximations used.
.. member:: double GradientProblemSolver::Options::line_search_sufficient_curvature_decrease
Default: ``0.9``
The strong Wolfe conditions consist of the Armijo sufficient
decrease condition, and an additional requirement that the
step size be chosen s.t. the *magnitude* ('strong' Wolfe
conditions) of the gradient along the search direction
decreases sufficiently. Precisely, this second condition
is that we seek a step size s.t.
.. math:: \|f'(\text{step_size})\| \leq \text{sufficient_curvature_decrease} * \|f'(0)\|
Where :math:`f()` is the line search objective and :math:`f'()` is the derivative
of :math:`f` with respect to the step size: :math:`\frac{d f}{d~\text{step size}}`.
.. member:: double GradientProblemSolver::Options::max_line_search_step_expansion
Default: ``10.0``
During the bracketing phase of a Wolfe line search, the step size
is increased until either a point satisfying the Wolfe conditions
is found, or an upper bound for a bracket containing a point
satisfying the conditions is found. Precisely, at each iteration
of the expansion:
.. math:: \text{new_step_size} \leq \text{max_step_expansion} * \text{step_size}
By definition for expansion
.. math:: \text{max_step_expansion} > 1.0
.. member:: int GradientProblemSolver::Options::max_num_iterations
Default: ``50``
Maximum number of iterations for which the solver should run.
.. member:: double GradientProblemSolver::Options::max_solver_time_in_seconds
Default: ``1e6``
Maximum amount of time for which the solver should run.
.. member:: double GradientProblemSolver::Options::function_tolerance
Default: ``1e-6``
Solver terminates if
.. math:: \frac{|\Delta \text{cost}|}{\text{cost}} \leq \text{function_tolerance}
where :math:`\Delta \text{cost}` is the change in objective
function value (up or down) in the current iteration of the line search.
.. member:: double GradientProblemSolver::Options::gradient_tolerance
Default: ``1e-10``
Solver terminates if
.. math:: \|x - \Pi \boxplus(x, -g(x))\|_\infty \leq \text{gradient_tolerance}
where :math:`\|\cdot\|_\infty` refers to the max norm, :math:`\Pi`
is projection onto the bounds constraints and :math:`\boxplus` is
Plus operation for the manifold associated with the parameter
vector.
.. member:: double GradientProblemSolver::Options::parameter_tolerance
Default: ``1e-8``
Solver terminates if
.. math:: \|\Delta x\| \leq (\|x\| + \text{parameter_tolerance}) * \text{parameter_tolerance}
where :math:`\Delta x` is the step computed in the current iteration
of the line search.
.. member:: LoggingType GradientProblemSolver::Options::logging_type
Default: ``PER_MINIMIZER_ITERATION``
.. member:: bool GradientProblemSolver::Options::minimizer_progress_to_stdout
Default: ``false``
By default the :class:`Minimizer` progress is logged to ``STDERR``
depending on the ``vlog`` level. If this flag is set to true, and
:member:`GradientProblemSolver::Options::logging_type` is not
``SILENT``, the logging output is sent to ``STDOUT``.
The progress display looks like
.. code-block:: bash

  0: f: 2.317806e+05 d: 0.00e+00 g: 3.19e-01 h: 0.00e+00 s: 0.00e+00 e: 0 it: 2.98e-02 tt: 8.50e-02
  1: f: 2.312019e+05 d: 5.79e+02 g: 3.18e-01 h: 2.41e+01 s: 1.00e+00 e: 1 it: 4.54e-02 tt: 1.31e-01
  2: f: 2.300462e+05 d: 1.16e+03 g: 3.17e-01 h: 4.90e+01 s: 2.54e-03 e: 1 it: 4.96e-02 tt: 1.81e-01
Here
#. ``f`` is the value of the objective function.
#. ``d`` is the change in the value of the objective function if
the step computed in this iteration is accepted.
#. ``g`` is the max norm of the gradient.
#. ``h`` is the change in the parameter vector.
#. ``s`` is the optimal step length computed by the line search.
#. ``it`` is the time taken by the current iteration.
#. ``tt`` is the total time taken by the minimizer.
.. member:: vector<IterationCallback*> GradientProblemSolver::Options::callbacks
Callbacks that are executed at the end of each iteration of the
:class:`Minimizer`. They are executed in the order that they are
specified in this vector. By default, parameter blocks are updated
only at the end of the optimization, i.e., when the
:class:`Minimizer` terminates. This behavior is controlled by
:member:`GradientProblemSolver::Options::update_state_every_iteration`. If
the user wishes to have access to the updated parameter blocks when
their callbacks are executed, then set
:member:`GradientProblemSolver::Options::update_state_every_iteration`
to true.
The solver does NOT take ownership of these pointers.
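As a sketch, a simple callback that logs the cost after every iteration
might look like this (the class and variable names are illustrative).
Because the solver does not own the callback, it must outlive the call
to ``Solve``.

.. code-block:: c++

  // Illustrative callback that prints the cost after every iteration.
  class LoggingCallback : public ceres::IterationCallback {
   public:
    ceres::CallbackReturnType operator()(
        const ceres::IterationSummary& summary) override {
      std::cout << "iteration: " << summary.iteration
                << " cost: " << summary.cost << "\n";
      return ceres::SOLVER_CONTINUE;
    }
  };

  ceres::GradientProblemSolver::Options options;
  LoggingCallback callback;  // Must outlive the call to Solve.
  options.callbacks.push_back(&callback);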
.. member:: bool GradientProblemSolver::Options::update_state_every_iteration
Default: ``false``
Normally the parameter vector is only updated when the solver
terminates. Setting this to true updates it every iteration. This
setting is useful when building an interactive application using
Ceres and using an :class:`IterationCallback`.
:class:`GradientProblemSolver::Summary`
---------------------------------------
.. class:: GradientProblemSolver::Summary
Summary of the various stages of the solver after termination.
.. function:: string GradientProblemSolver::Summary::BriefReport() const
A brief one line description of the state of the solver after
termination.
.. function:: string GradientProblemSolver::Summary::FullReport() const
A full multiline description of the state of the solver after
termination.
.. function:: bool GradientProblemSolver::Summary::IsSolutionUsable() const
Whether the solution returned by the optimization algorithm can be
relied on to be numerically sane. This will be the case if
`GradientProblemSolver::Summary::termination_type` is set to `CONVERGENCE`,
`USER_SUCCESS` or `NO_CONVERGENCE`, i.e., either the solver
converged by meeting one of the convergence tolerances or because
the user indicated that it had converged or it ran to the maximum
number of iterations or time.
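For example, continuing the earlier sketch, the outcome of a solve
might be checked along these lines.

.. code-block:: c++

  ceres::GradientProblemSolver::Summary summary;
  ceres::Solve(options, problem, parameters, &summary);

  if (summary.IsSolutionUsable()) {
    std::cout << summary.BriefReport() << "\n";
  } else {
    std::cerr << "Solve failed: " << summary.message << "\n";
  }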
.. member:: TerminationType GradientProblemSolver::Summary::termination_type
The cause of the minimizer terminating.
.. member:: string GradientProblemSolver::Summary::message
Reason why the solver terminated.
.. member:: double GradientProblemSolver::Summary::initial_cost
Cost of the problem (value of the objective function) before the
optimization.
.. member:: double GradientProblemSolver::Summary::final_cost
Cost of the problem (value of the objective function) after the
optimization.
.. member:: vector<IterationSummary> GradientProblemSolver::Summary::iterations
:class:`IterationSummary` for each minimizer iteration in order.
.. member:: int GradientProblemSolver::Summary::num_cost_evaluations
Number of times the cost (and not the gradient) was evaluated.
.. member:: int GradientProblemSolver::Summary::num_gradient_evaluations
Number of times the gradient (and the cost) were evaluated.
.. member:: double GradientProblemSolver::Summary::total_time_in_seconds
Time (in seconds) spent in the solver.
.. member:: double GradientProblemSolver::Summary::cost_evaluation_time_in_seconds
Time (in seconds) spent evaluating the cost.
.. member:: double GradientProblemSolver::Summary::gradient_evaluation_time_in_seconds
Time (in seconds) spent evaluating the gradient vector.
.. member:: int GradientProblemSolver::Summary::num_parameters
Number of parameters in the problem.
.. member:: int GradientProblemSolver::Summary::num_local_parameters
Dimension of the tangent space of the problem. This is different
from :member:`GradientProblemSolver::Summary::num_parameters` if a
:class:`LocalParameterization`/:class:`Manifold` object is used.
.. NOTE::
``num_local_parameters`` is deprecated and will be removed in
Ceres Solver version 2.2.0. Please use ``num_tangent_parameters``
instead.
.. member:: int GradientProblemSolver::Summary::num_tangent_parameters
Dimension of the tangent space of the problem. This is different
from :member:`GradientProblemSolver::Summary::num_parameters` if a
:class:`LocalParameterization`/:class:`Manifold` object is used.
.. member:: LineSearchDirectionType GradientProblemSolver::Summary::line_search_direction_type
Type of line search direction used.
.. member:: LineSearchType GradientProblemSolver::Summary::line_search_type
Type of the line search algorithm used.
.. member:: LineSearchInterpolationType GradientProblemSolver::Summary::line_search_interpolation_type
When performing line search, the degree of the polynomial used to
approximate the objective function.
.. member:: NonlinearConjugateGradientType GradientProblemSolver::Summary::nonlinear_conjugate_gradient_type
If the line search direction is `NONLINEAR_CONJUGATE_GRADIENT`,
then this indicates the particular variant of non-linear conjugate
gradient used.
.. member:: int GradientProblemSolver::Summary::max_lbfgs_rank
If the type of the line search direction is `LBFGS`, then this
indicates the rank of the Hessian approximation.