<Chapter Label="Floats">
<Heading>Floats</Heading>
Starting with version 4.5, &GAP; has built-in support for
floating-point numbers in machine format, and allows packages to
implement arbitrary-precision floating-point arithmetic in a uniform
manner. For now, one such package exists, <Package>Float</Package>;
it is based on the arbitrary-precision routines
in <Package>mpfr</Package>.
<P/> A word of caution: &GAP; deals primarily with algebraic objects,
which can be represented exactly in a computer. Numerical imprecision
means that floating-point numbers do not form a ring in the strict
&GAP; sense, because addition is in general not associative
(<C>(1.0e-100+1.0)-1.0</C> is not the same
as <C>1.0e-100+(1.0-1.0)</C>, in the default precision setting).
<P/> Most algorithms in &GAP; which require ring elements are
therefore not applicable to floating-point elements. In some cases,
such a notion would not even make sense (what is the greatest
common divisor of two floating-point numbers?).
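<P/> The associativity failure mentioned above can be checked directly. The
following is a minimal sketch, assuming the default 53-bit floating-point
handler: <C>1.0e-100</C> is absorbed when added to <C>1.0</C>, so the
left-hand side evaluates to zero while the right-hand side
keeps <C>1.0e-100</C>.
<P/>
<Example><![CDATA[
gap> (1.0e-100+1.0)-1.0 = 1.0e-100+(1.0-1.0);
false
]]></Example>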
<Section><Heading>A sample run</Heading>
Floating-point numbers can be input into &GAP; in the standard
floating-point notation:
<P/>
<Example><![CDATA[
gap> 3.14;
3.14
gap> last^2/6;
1.64327
gap> h := 6.62606896e-34;
6.62607e-34
gap> pi := 4*Atan(1.0);
3.14159
gap> hbar := h/(2*pi);
1.05457e-34
]]></Example>
<P/> Floating-point numbers can also be created using <C>Float</C>,
from strings or rational numbers, and can be converted back
using <C>String</C>, <C>Rat</C> and <C>Int</C>.
<P/> &GAP; allows rational and floating-point numbers to be mixed in
the elementary operations <C>+,-,*,/</C>. However, floating-point
numbers and rational numbers may not be compared. Conversions are
performed using the creator <C>Float</C>:
<P/>
<Example><![CDATA[
gap> Float("3.1416");
3.1416
gap> Float(355/113);
3.14159
gap> Rat(last);
355/113
gap> Rat(0.33333);
1/3
gap> Int(1.e10);
10000000000
gap> Int(1.e20);
100000000000000000000
gap> Int(1.e30);
1000000000000000019884624838656
]]></Example>
</Section>
<Section><Heading>Methods</Heading>
Floating-point numbers may be input directly, as in any usual
mathematical software or language, with the single requirement that every
floating-point number contain at least one decimal
digit. Thus <C>.1</C>, <C>.1e1</C>, <C>-.999</C> etc. are all valid
&GAP; inputs.
<P/>
Floating-point numbers so entered in &GAP; are stored as strings. They
are converted to floating-point when they are first used. This means that,
if the floating-point precision is increased, the constants are
reevaluated to fit the new format.
<P/>
Floating-point numbers may be followed by an underscore, as
in <C>1._</C>. This means that they are to be immediately converted to
the current floating-point format. The underscore may be followed by a
single letter, which specifies which format/precision to use. By
default, &GAP; has a single floating-point handler, with fixed (53
bits) precision, and its format specifier is <C>'l'</C> as
in <C>1._l</C>. Higher-precision floating-point computation is
available via external packages, for example <Package>float</Package>.
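<P/>
The three input forms described so far can be put side by side. This is
only a sketch, assuming the default handler with format
specifier <C>'l'</C>; the variable names are illustrative, and no output
is shown because it depends on the current float settings.
<P/>
<Listing><![CDATA[
x := 1.;     # stored as a string, converted to a float on first use
y := 1._;    # converted immediately, in the current floating-point format
z := 1._l;   # converted immediately by the handler whose specifier is 'l'
]]></Listing>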
<P/>
A record, <Ref Var="FLOAT" Label="constants"/>,
contains all relevant constants for the
current floating-point format; see its documentation for details.
Typical fields are <C>FLOAT.MANT_DIG=53</C>, the
constant <C>FLOAT.VIEW_DIG=6</C> specifying the number of digits to
view, and <C>FLOAT.PI</C> for the constant <M>\pi</M>. The constants
have the same name as their C counterparts, except for the missing
initial <C>DBL_</C> or <C>M_</C>.
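<P/>
As an illustration, the fields named above can be read like those of any
other record. The values shown are the ones quoted in the text, and
assume that the default floating-point handler is active:
<P/>
<Example><![CDATA[
gap> FLOAT.MANT_DIG;
53
gap> FLOAT.VIEW_DIG;
6
gap> FLOAT.PI;
3.14159
]]></Example>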
<P/>
Floating-point numbers may be created using the single function
<Ref Func="Float"/>, which accepts as arguments rational, string,
or floating-point numbers. Floating-point numbers may
also be created, in any floating-point representation, using
<Ref Constr="NewFloat"/> as in <C>NewFloat(IsIEEE754FloatRep,355/113)</C>,
by supplying the category filter of the desired new floating-point number;
or using <Ref Oper="MakeFloat"/> as in <C>MakeFloat(1.0,355/113)</C>,
by supplying a sample floating-point number.
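<P/>
The three ways of creating a float can be sketched as follows. The calls
are those named in the paragraph above; the variable names are
illustrative, and output is omitted since its form depends on the current
view settings.
<P/>
<Listing><![CDATA[
a := Float(355/113);                      # from a rational
b := Float("3.1416");                     # from a string
c := NewFloat(IsIEEE754FloatRep, 355/113);  # choose the representation by filter
d := MakeFloat(1.0, 355/113);             # choose the representation by sample
]]></Listing>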
<P/>
Floating-point numbers may also be converted to other &GAP; formats
using the usual commands <Ref Attr="Int"/>, <Ref Attr="Rat"/>,
<Ref Attr="String"/>.
<P/>
Exact conversion to and from floating-point format may be done using
external representations. The "external representation" of a
floating-point number <C>x</C> is a pair <C>[m,e]</C> of integers,
such that <C>x=m*2^(-1+e-LogInt(AbsInt(m),2))</C>. Conversion to and from
external representation is performed as usual using
<Ref Oper="ExtRepOfObj"/> and <Ref Oper="ObjByExtRep"/>:
<Example><![CDATA[
gap> ExtRepOfObj(3.14);
[ 7070651414971679, 2 ]
gap> ObjByExtRep(IEEE754FloatsFamily,last);
3.14
]]></Example>
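<P/>
The formula for the external representation can be checked by hand on
this example. For <C>m=7070651414971679</C> one
has <C>LogInt(AbsInt(m),2)=52</C>, so with <C>e=2</C> the exponent
is <C>-1+2-52=-51</C> and <C>x=m*2^(-51)</C>. The following sketch,
assuming the default handler, performs the same computation with exact
rationals; the variable names are illustrative.
<P/>
<Listing><![CDATA[
ext := ExtRepOfObj(3.14);    # the pair [ m, e ]
m := ext[1];
e := ext[2];
# exact rational value represented by the pair, following the formula
r := m * 2^(-1 + e - LogInt(AbsInt(m), 2));
# compare the reconstruction with the original float
Float(r) = 3.14;
]]></Listing>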
<P/>
Computations with floating-point numbers never raise any
error. Division by zero is allowed, and produces a signed
infinity. Illegal operations, such as <C>0./0.</C>,
produce a <K>NaN</K> (not-a-number); this is the only floating-point
number <C>x</C> such that <C>not EqFloat(x+0.0,x)</C>.
<P/>
The IEEE754 standard requires <K>NaN</K> to be non-equal to itself. On
the other hand, &GAP; requires every object to be equal to itself. To
respect the IEEE754 standard, the function <Ref Oper="EqFloat"/>
should be used instead of <C>=</C>.
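<P/>
The difference between the two notions of equality can be sketched as
follows. The comments restate the behaviour described above rather than
recorded output, and the variable name is illustrative.
<P/>
<Listing><![CDATA[
nan := 0./0.;         # an illegal operation, producing a NaN
nan = nan;            # GAP equality: every object is equal to itself
EqFloat(nan, nan);    # IEEE754 equality: NaN is not equal to itself
EqFloat(1.0, 1.0);    # for ordinary floats the two notions agree
]]></Listing>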
<P/>
The category a floating-point number belongs to can be checked using the
filters <Ref Prop="IsFinite"/>, <Ref Prop="IsPInfinity"/>,
<Ref Prop="IsNInfinity"/>, <Ref Prop="IsXInfinity"/>,
<Ref Prop="IsNaN"/>.
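<P/>
A short sketch of these filters applied to the special values produced
by division by zero. It assumes that <C>IsXInfinity</C> accepts an
infinity of either sign; see the documentation of each filter for the
precise definition.
<P/>
<Listing><![CDATA[
IsPInfinity(1.0/0.0);     # positive infinity
IsNInfinity(-1.0/0.0);    # negative infinity
IsXInfinity(-1.0/0.0);    # an infinity of either sign
IsNaN(0.0/0.0);           # not-a-number
IsFinite(1.0);            # an ordinary finite float
]]></Listing>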
<P/>
Comparisons between floating-point numbers and rationals are
explicitly forbidden. The rationale is that objects belonging to
different families should in general not be comparable in
&GAP;. Floating-point numbers are also approximations of real numbers,
and don't follow the same rules; consider for example, using the
default &GAP; implementation of floating-point numbers,
<Example><![CDATA[
gap> 1.0/3.0 = Float(1/3);
true
gap> (1.0/3.0)^5 = Float((1/3)^5);
false
]]></Example>
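<P/>
When a float and a rational do need to be compared, one of them can
first be converted explicitly, so that both operands lie in the same
family. A minimal sketch, using only <C>Float</C> and <C>Rat</C> as
described earlier in this chapter; the variable names are illustrative.
<P/>
<Listing><![CDATA[
x := 0.5;
q := 1/3;
Float(q) < x;   # compare as floating-point numbers (q is rounded first)
q < Rat(x);     # compare as rationals (Rat(x) is a rational near x)
]]></Listing>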
<P/>
<#Include Label="Float">
<ManSection>
<Var Name="FLOAT" Label="constants"/>
<Description>
This record contains useful floating-point constants: <List>
<Mark>DECIMAL_DIG</Mark> <Item>Maximal number of useful digits;</Item>
<Mark>DIG</Mark> <Item>Number of significant digits;</Item>
<Mark>VIEW_DIG</Mark> <Item>Number of digits to print in short view;</Item>
<Mark>EPSILON</Mark> <Item>Smallest number such that <M>1\neq1+\epsilon</M>;</Item>
<Mark>MANT_DIG</Mark> <Item>Number of bits in the mantissa;</Item>
<Mark>MAX</Mark> <Item>Maximal representable number;</Item>
<Mark>MAX_10_EXP</Mark> <Item>Maximal decimal exponent;</Item>
<Mark>MAX_EXP</Mark> <Item>Maximal binary exponent;</Item>
<Mark>MIN</Mark> <Item>Minimal positive representable number;</Item>
<Mark>MIN_10_EXP</Mark> <Item>Minimal decimal exponent;</Item>
<Mark>MIN_EXP</Mark> <Item>Minimal exponent;</Item>
<Mark>INFINITY</Mark> <Item>Positive infinity;</Item>
<Mark>NINFINITY</Mark> <Item>Negative infinity;</Item>
<Mark>NAN</Mark> <Item>Not-a-number,</Item>
</List>
as well as mathematical constants <C>E</C>, <C>LOG2E</C>, <C>LOG10E</C>,
<C>LN2</C>, <C>LN10</C>, <C>PI</C>, <C>PI_2</C>, <C>PI_4</C>,
<C>1_PI</C>, <C>2_PI</C>, <C>2_SQRTPI</C>, <C>SQRT2</C>, <C>SQRT1_2</C>.
</Description>
</ManSection>
<#Include Label="Float-Extra">
<#Include Label="Float-Infinities">
<#Include Label="Float-Math-Commands">
</Section>
<Section><Heading>High-precision-specific methods</Heading>
&GAP; provides a mechanism for packages to implement new
floating-point numerical interfaces. The following describes that
mechanism; actual examples of packages are documented separately.
<P/>
A package must create a record with fields (all optional) <List>
<Mark>creator</Mark> <Item>a function converting strings to floating-point;</Item>
<Mark>eager</Mark> <Item>a character allowing immediate conversion
to floating-point;</Item>
<Mark>objbyextrep</Mark> <Item>a function creating a floating-point
number out of a list <C>[mantissa,exponent]</C>;</Item>
<Mark>filter</Mark> <Item>a filter for the new floating-point objects;</Item>
<Mark>constants</Mark> <Item>a record containing numerical constants,
such as <C>MANT_DIG</C>, <C>MAX</C>, <C>MIN</C>, <C>NAN</C>.</Item>
</List>
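<P/>
The shape of such a record can be sketched as follows. This is
illustrative only: it reuses the built-in IEEE754 representation to fill
each field, whereas a real package would supply its own filter and
functions. The name <C>myHandler</C> and the eager
character <C>'x'</C> are hypothetical.
<P/>
<Listing><![CDATA[
myHandler := rec(
  # converts a string to a floating-point number
  creator     := s -> NewFloat(IsIEEE754FloatRep, s),
  # character that may follow the underscore for immediate conversion
  eager       := 'x',
  # builds a float from a list [mantissa,exponent]
  objbyextrep := l -> ObjByExtRep(IEEE754FloatsFamily, l),
  # filter of the floating-point objects handled by this record
  filter      := IsIEEE754FloatRep,
  # numerical constants; only one is given in this sketch
  constants   := rec(MANT_DIG := 53)
);
]]></Listing>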
<P/>
The package must install methods <C>Int</C>, <C>Rat</C>, <C>String</C>
for its objects, and
creators <C>NewFloat(filter,IsRat)</C>, <C>NewFloat(IsString)</C>.
<P/>
It must then install methods for all arithmetic and numerical
operations: <C>SUM</C>, <C>Exp</C>, ...
<P/>
The user chooses that implementation by calling
<Ref Func="SetFloats"/> with the record as argument, and with an
optional second argument requesting a precision in binary digits.
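<P/>
As a sketch of that last step, and assuming that
the <Package>float</Package> package is installed and provides a handler
record named <C>MPFR</C> (a name taken from that package, not defined in
this chapter), a higher precision could be selected as follows:
<P/>
<Listing><![CDATA[
LoadPackage("float");
SetFloats(MPFR, 1000);   # request 1000 binary digits of precision
]]></Listing>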
</Section>
<Section><Heading>Complex arithmetic</Heading>
Complex arithmetic may be implemented in packages, and is present in
<Package>float</Package>. Complex numbers are treated as usual
numbers; they may be input with an extra "i" as in
<C>-0.5+0.866i</C>. They may also be created using <Ref
Constr="NewFloat"/> with three arguments: the float filter, the real
part, and the imaginary part.
<P/>
Methods should then be implemented for <C>Norm</C>, <C>RealPart</C>,
<C>ImaginaryPart</C>, <C>ComplexConjugate</C>, ...
<#Include Label="Float-Complex">
</Section>
<Section><Heading>Interval-specific methods</Heading>
Interval arithmetic may also be implemented in packages. Intervals
are in fact efficient implementations of sets of real numbers. The
only non-trivial issue is how they should be compared. The standard
<C>EQ</C> tests if the intervals are equal; however, it is usually
more useful to know if intervals overlap, or are disjoint, or are
contained in each other.
<P/>
Note the usual convention that intervals are compared as
in <M>[a,b]\leq[c,d]</M> if and only if <M>a\leq c</M> and <M>b\leq
d</M>.
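<P/>
This convention gives only a partial order. The following sketch models
intervals as two-element lists of rationals, purely for illustration;
the helper <C>IntervalLeq</C> is hypothetical and is not part of any
interval package. The last two inputs show a pair of intervals that are
not comparable in either direction.
<P/>
<Example><![CDATA[
gap> IntervalLeq := function(i, j)
>   return i[1] <= j[1] and i[2] <= j[2];
> end;;
gap> IntervalLeq([0,1], [1/2,2]);
true
gap> IntervalLeq([0,3], [1,2]);
false
gap> IntervalLeq([1,2], [0,3]);
false
]]></Example>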
<#Include Label="Float-Intervals">
</Section>
</Chapter>