File: float_next.qbk

package info (click to toggle)
scipy 1.16.0-1exp7
links: PTS, VCS
area: main
in suites: experimental
size: 234,820 kB
sloc: cpp: 503,145; python: 344,611; ansic: 195,638; javascript: 89,566; fortran: 56,210; cs: 3,081; f90: 1,150; sh: 848; makefile: 785; pascal: 284; csh: 135; lisp: 134; xml: 56; perl: 51
file content (292 lines) | stat: -rw-r--r-- 11,329 bytes
parent folder | download | duplicates (5)
[section:next_float Floating-Point Representation Distance (ULP),
   and Finding Adjacent Floating-Point Values]

[@http://en.wikipedia.org/wiki/Unit_in_the_last_place Unit of Least Precision or Unit in the Last Place]
is the gap between two different, but as close as possible, floating-point numbers.

Most decimal values, for example 0.1, cannot be exactly represented as floating-point values,
but will be stored as the
[@http://en.wikipedia.org/wiki/Floating_point#Representable_numbers.2C_conversion_and_rounding
closest representable floating-point].

Functions are provided for finding adjacent greater and lesser floating-point values,
and estimating the number of gaps between any two floating-point values.

The floating-point type (FPT) must have has a fixed number of bits in the representation.
The number of bits may set at runtime, but must be the same for all numbers.
For example, __NTL_quad_float type (fixed 128-bit representation),
__NTL_RR type (arbitrary but fixed decimal digits, default 150) or
__multiprecision __cpp_dec_float and__cpp_bin_float  are fixed at runtime,
but [*not] a type that extends the representation to provide an exact representation
for any number, for example [@http://keithbriggs.info/xrc.html XRC eXact Real in C].

The accuracy of mathematical functions can be assessed and displayed in terms of __ULP,
often as a ulps plot or by binning the differences as a histogram.
Samples are evaluated using the implementation under test and compared with 'known good'
representation obtained using a more accurate method.  Other implementations, often using
arbitrary precision arithmetic, for example __WolframAlpha are one source of references
values.  The other method, used widely in Boost.Math special functions, it to carry out
the same algorithm, but using a higher precision type, typically using Boost.Multiprecision
types like `cpp_bin_float_quad` for 128-bit (about 35 decimal digit precision), or
`cpp_bin_float_50` (for 50 decimal digit precision).

When converted to a particular machine representation, say `double`, say using a `static_cast`,
the value is the nearest representation possible for the `double` type.  This value
cannot be 'wrong' by more than half a __ulp, and can be obtained using the Boost.Math function `ulp`.
(Unless the algorithm is fundamentally flawed, something that should be revealed by 'sanity'
checks using some independent sources).

See some discussion and example plots by Cleve Moler of Mathworks
[@https://blogs.mathworks.com/cleve/2017/01/23/ulps-plots-reveal-math-function-accurary/
ulps plots reveal math-function accuracy].

[section:nextafter Finding the Next Representable Value in a Specific Direction (nextafter)]

[h4 Synopsis]

``
#include <boost/math/special_functions/next.hpp>
``

  namespace boost{ namespace math{

  template <class FPT>
  FPT nextafter(FPT val, FPT direction);

  }} // namespaces

[h4 Description - nextafter]

This is an implementation of the `nextafter` function included in the C99 standard.
(It is also effectively an implementation of the C99 `nexttoward` legacy function
which differs only having a `long double` direction,
and can generally serve in its place if required).

[note The C99 functions must use suffixes f and l to distinguish `float` and `long double` versions.
C++ uses the template mechanism instead.]

Returns the next representable value after /x/ in the direction of /y/.  If
`x == y` then returns /x/.  If /x/ is non-finite then returns the result of
a __domain_error.  If there is no such value in the direction of /y/ then
returns an __overflow_error.

[warning The template parameter FTP must be a floating-point type.
An integer type, for example, will produce an unhelpful error message.]

[tip Nearly always, you just want the next or prior representable value,
so instead use `float_next` or `float_prior` below.]

[h4 Examples - nextafter]

The two representations using a 32-bit float either side of unity are:
``
The nearest (exact) representation of 1.F is      1.00000000
nextafter(1.F, 999) is                            1.00000012
nextafter(1/f, -999) is                           0.99999994

The nearest (not exact) representation of 0.1F is 0.100000001
nextafter(0.1F, 10) is                            0.100000009
nextafter(0.1F, 10) is                            0.099999994
``

[endsect] [/section:nextafter Finding the Next Representable Value in a Specific Direction (nextafter)]

[section:float_next Finding the Next Greater Representable Value (float_next)]

[h4 Synopsis]

``
#include <boost/math/special_functions/next.hpp>
``

   namespace boost{ namespace math{

   template <class FPT>
   FPT float_next(FPT val);

   }} // namespaces

[h4 Description - float_next]

Returns the next representable value which is greater than /x/.
If /x/ is non-finite then returns the result of
a __domain_error.  If there is no such value greater than /x/ then
returns an __overflow_error.

Has the same effect as

  nextafter(val, (std::numeric_limits<FPT>::max)());

[endsect] [/section:float_next Finding the Next Greater Representable Value (float_prior)]

[section:float_prior Finding the Next Smaller Representable Value (float_prior)]

[h4 Synopsis]

``
#include <boost/math/special_functions/next.hpp>
``

   namespace boost{ namespace math{

   template <class FPT>
   FPT float_prior(FPT val);

   }} // namespaces


[h4 Description - float_prior]

Returns the next representable value which is less than /x/.
If /x/ is non-finite then returns the result of
a __domain_error.  If there is no such value less than /x/ then
returns an __overflow_error.

Has the same effect as

  nextafter(val, -(std::numeric_limits<FPT>::max)());  // Note most negative value -max.

[endsect] [/section:float_prior Finding the Next Smaller Representable Value (float_prior)]

[section:float_distance Calculating the Representation Distance
   Between Two floating-point Values (ULP) float_distance]

Function float_distance finds the number of gaps/bits/ULP between any two floating-point values.
If the significands of floating-point numbers are viewed as integers,
then their difference is the number of ULP/gaps/bits different.

[h4 Synopsis]

``
#include <boost/math/special_functions/next.hpp>
``

   namespace boost{ namespace math{

   template <class FPT>
   FPT float_distance(FPT a, FPT b);

   }} // namespaces

[h4 Description - float_distance]

Returns the distance between /a/ and /b/: the result is always
a signed integer value (stored in floating-point type FPT)
representing the number of distinct representations between /a/ and /b/.

Note that

* `float_distance(a, a)` always returns 0.
* `float_distance(float_next(a), a)` always returns -1.
* `float_distance(float_prior(a), a)` always returns 1.

The function `float_distance` is equivalent to calculating the number
of ULP (Units in the Last Place) between /a/ and /b/ except that it
returns a signed value indicating whether `a > b` or not.

If the distance is too great then it may not be able
to be represented as an exact integer by type FPT,
but in practice this is unlikely to be a issue.

[endsect] [/section:float_distance Calculating the Representation Distance
   Between Two floating-point Values (ULP) float_distance]

[section:float_advance Advancing a floating-point Value by a Specific
Representation Distance (ULP) float_advance]

Function `float_advance` advances a floating-point number by a specified number
of ULP.

[h4 Synopsis]

``
#include <boost/math/special_functions/next.hpp>
``

   namespace boost{ namespace math{

   template <class FPT>
   FPT float_advance(FPT val, int distance);

   }} // namespaces

[h4 Description - float_advance]

Returns a floating-point number /r/ such that `float_distance(val, r) == distance`.

[endsect] [/section:float_advance]

[section:ulp Obtaining the Size of a Unit In the Last Place - ULP]

Function `ulp` gives the size of a unit-in-the-last-place for a specified floating-point value.

[h4 Synopsis]

``
#include <boost/math/special_functions/ulp.hpp>
``

   namespace boost{ namespace math{

   template <class FPT>
   FPT ulp(const FPT& x);

   template <class FPT, class Policy>
   FPT ulp(const FPT& x, const Policy&);

   }} // namespaces

[h4 Description - ulp]

Returns one [@http://en.wikipedia.org/wiki/Unit_in_the_last_place unit in the last place] of ['x].

Corner cases are handled as follows:

* If the argument is a NaN, then raises a __domain_error.
* If the argument is an infinity, then raises an __overflow_error.
* If the argument is zero then returns the smallest representable value: for example for type
`double` this would be either `std::numeric_limits<double>::min()` or `std::numeric_limits<double>::denorm_min()`
depending whether denormals are supported (which have the values `2.2250738585072014e-308` and `4.9406564584124654e-324` respectively).
* If the result is too small to represent, then returns the smallest representable value.
* Always returns a positive value such that `ulp(x) == ulp(-x)`.

[*Important:]  The behavior of this function is aligned to that of [@http://docs.oracle.com/javase/7/docs/api/java/lang/Math.html#ulp%28double%29
Java's ulp function], please note
however that this function should only ever be used for rough and ready calculations as there are enough
corner cases to trap even careful programmers.  In particular:

* The function is asymmetrical, which is to say, given `u = ulp(x)` if `x > 0` then `x + u` is the
next floating-point value, but `x - u` is not necessarily the previous value.  Similarly, if
`x < 0` then `x - u` is the previous floating-point value, but `x + u` is not necessarily the next
value.  The corner cases occur at power of 2 boundaries.
* When the argument becomes very small, it may be that there is no floating-point value that
represents one ULP.  Whether this is the case or not depends not only on whether the hardware
may ['sometimes] support denormals (as signalled by `boost::math::detail::has_denorm_now<FPT>()`), but also whether these are
currently enabled at runtime (for example on SSE hardware, the DAZ or FTZ flags will disable denormal support).
In this situation, the `ulp` function may return a value that is many orders of magnitude too large.

In light of the issues above, we recommend that:

* To move between adjacent floating-point values always use __float_next, __float_prior or __nextafter (`std::nextafter`
is another candidate, but our experience is that this also often breaks depending which optimizations and
hardware flags are in effect).
* To move several floating-point values away use __float_advance.
* To calculate the edit distance between two floats use __float_distance.

There is none the less, one important use case for this function:

If it is known that the true result of some function is x[sub t] and the calculated result
is x[sub c], then the error measured in ulp is simply [^fabs(x[sub t] - x[sub c]) / ulp(x[sub t])].

[endsect] [/section ulp]

[endsect] [/ section:next_float Floating-Point Representation Distance (ULP),
   and Finding Adjacent Floating-Point Values]

[/
  Copyright 2008 John Maddock and Paul A. Bristow.
  Distributed under the Boost Software License, Version 1.0.
  (See accompanying file LICENSE_1_0.txt or copy at
  http://www.boost.org/LICENSE_1_0.txt).
]