File: floats.xml

package info (click to toggle)
gap 4.15.1-1
  • links: PTS
  • area: main
  • in suites: forky, sid
  • size: 110,212 kB
  • sloc: ansic: 97,261; xml: 48,343; cpp: 13,946; sh: 4,900; perl: 1,650; javascript: 255; makefile: 252; ruby: 9
file content (267 lines) | stat: -rw-r--r-- 9,933 bytes parent folder | download | duplicates (3)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
<Chapter Label="Floats">
<Heading>Floats</Heading>

Starting with version 4.5, &GAP; has built-in support for
floating-point numbers in machine format, and allows package to
implement arbitrary-precision floating-point arithmetic in a uniform
manner. For now, one such package, <Package>Float</Package> exists,
and is based on the arbitrary-precision routines
in <Package>mpfr</Package>.

<P/> A word of caution: &GAP; deals primarily with algebraic objects,
which can be represented exactly in a computer. Numerical imprecision
means that floating-point numbers do not form a ring in the strict
&GAP; sense, because addition is in general not associative
(<C>(1.0e-100+1.0)-1.0</C> is not the same
as <C>1.0e-100+(1.0-1.0)</C>, in the default precision setting).

<P/> Most algorithms in &GAP; which require ring elements will
therefore not be applicable to floating-point elements. In some cases,
such a notion would not even make any sense (what is the greatest
common divisor of two floating-point numbers?)

<Section><Heading>A sample run</Heading>

Floating-point numbers can be input into &GAP; in the standard
floating-point notation:
<P/>
<Example><![CDATA[
gap> 3.14;
3.14
gap> last^2/6;
1.64327
gap> h := 6.62606896e-34;
6.62607e-34
gap> pi := 4*Atan(1.0);
3.14159
gap> hbar := h/(2*pi);
1.05457e-34
]]></Example>

<P/> Floating-point numbers can also be created using <C>Float</C>,
from strings or rational numbers; and can be converted back
using <C>String,Rat,Int</C>.

<P/> &GAP; allows rational and floating-point numbers to be mixed in
the elementary operations <C>+,-,*,/</C>. However, floating-point
numbers and rational numbers may not be compared. Conversions are
performed using the creator <C>Float</C>:
<P/>
<Example><![CDATA[
gap> Float("3.1416");
3.1416
gap> Float(355/113);
3.14159
gap> Rat(last);
355/113
gap> Rat(0.33333);
1/3
gap> Int(1.e10);
10000000000
gap> Int(1.e20);
100000000000000000000
gap> Int(1.e30);
1000000000000000019884624838656
]]></Example>
</Section>

<Section><Heading>Methods</Heading>

Floating-point numbers may be directly input, as in any usual
mathematical software or language; with the exception that every
floating-point number must contain a decimal
digit. Therefore <C>.1</C>, <C>.1e1</C>, <C>-.999</C> etc. are all valid
&GAP; inputs.
<P/>

Floating-point numbers so entered in &GAP; are stored as strings. They
are converted to floating-point when they are first used. This means that,
if the floating-point precision is increased, the constants are
reevaluated to fit the new format.
<P/>

Floating-point numbers may be followed by an underscore, as
in <C>1._</C>. This means that they are to be immediately converted to
the current floating-point format. The underscore may be followed by a
single letter, which specifies which format/precision to use. By
default, &GAP; has a single floating-point handler, with fixed (53
bits) precision, and its format specifier is <C>'l'</C> as
in <C>1._l</C>. Higher-precision floating-point computations is
available via external packages; <Package>float</Package> for example.
<P/>

A record, <Ref Var="FLOAT" Label="constants"/>,
contains all relevant constants for the
current floating-point format; see its documentation for details.
Typical fields are <C>FLOAT.MANT_DIG=53</C>, the
constant <C>FLOAT.VIEW_DIG=6</C> specifying the number of digits to
view, and <C>FLOAT.PI</C> for the constant <M>\pi</M>. The constants
have the same name as their C counterparts, except for the missing
initial <C>DBL_</C> or <C>M_</C>.
<P/>

Floating-point numbers may be created using the single function
<Ref Func="Float"/>, which accepts as arguments rational, string,
or floating-point numbers. Floating-point numbers may
also be created, in any floating-point representation, using
<Ref Constr="NewFloat"/> as in <C>NewFloat(IsIEEE754FloatRep,355/113)</C>,
by supplying the category filter of the desired new floating-point number;
or using <Ref Oper="MakeFloat"/> as in <C>MakeFloat(1.0,355/113)</C>,
by supplying a sample floating-point number.
<P/>

Floating-point numbers may also be converted to other &GAP; formats
using the usual commands <Ref Attr="Int"/>, <Ref Attr="Rat"/>,
<Ref Attr="String"/>.
<P/>

Exact conversion to and from floating-point format may be done using
external representations. The "external representation" of a
floating-point number <C>x</C> is a pair <C>[m,e]</C> of integers,
such that <C>x=m*2^(-1+e-LogInt(AbsInt(m),2))</C>. Conversion to and from
external representation is performed as usual using
<Ref Oper="ExtRepOfObj"/> and <Ref Oper="ObjByExtRep"/>:
<Example><![CDATA[
gap> ExtRepOfObj(3.14);
[ 7070651414971679, 2 ]
gap> ObjByExtRep(IEEE754FloatsFamily,last);
3.14
]]></Example>
<P/>

Computations with floating-point numbers never raise any
error. Division by zero is allowed, and produces a signed
infinity. Illegal operations, such as <C>0./0.</C>,
produce <K>NaN</K>'s (not-a-number); this is the only floating-point
number <C>x</C> such that <C>not EqFloat(x+0.0,x)</C>.
<P/>

The IEEE754 standard requires <K>NaN</K> to be non-equal to itself. On
the other hand, &GAP; requires every object to be equal to itself. To
respect the IEEE754 standard, the function <Ref Oper="EqFloat"/>
should be used instead of <C>=</C>.
<P/>

The category a floating-point belongs to can be checked using the
filters <Ref Prop="IsFinite"/>, <Ref Prop="IsPInfinity"/>,
<Ref Prop="IsNInfinity"/>, <Ref Prop="IsXInfinity"/>,
<Ref Prop="IsNaN"/>.
<P/>

Comparisons between floating-point numbers and rationals are
explicitly forbidden. The rationale is that objects belonging to
different families should in general not be comparable in
&GAP;. Floating-point numbers are also approximations of real numbers,
and don't follow the same rules; consider for example, using the
default &GAP; implementation of floating-point numbers,
<Example><![CDATA[
gap> 1.0/3.0 = Float(1/3);
true
gap> (1.0/3.0)^5 = Float((1/3)^5);
false
]]></Example>
<P/>

<#Include Label="Float">

<ManSection>
  <Var Name="FLOAT" Label="constants"/>
  <Description>
    This record contains useful floating-point constants: <List>
    <Mark>DECIMAL_DIG</Mark> <Item>Maximal number of useful digits;</Item>
    <Mark>DIG</Mark> <Item>Number of significant digits;</Item>
    <Mark>VIEW_DIG</Mark> <Item>Number of digits to print in short view;</Item>
    <Mark>EPSILON</Mark> <Item>Smallest number such that <M>1\neq1+\epsilon</M>;</Item>
    <Mark>MANT_DIG</Mark> <Item>Number of bits in the mantissa;</Item>
    <Mark>MAX</Mark> <Item>Maximal representable number;</Item>
    <Mark>MAX_10_EXP</Mark> <Item>Maximal decimal exponent;</Item>
    <Mark>MAX_EXP</Mark> <Item>Maximal binary exponent;</Item>
    <Mark>MIN</Mark> <Item>Minimal positive representable number;</Item>
    <Mark>MIN_10_EXP</Mark> <Item>Minimal decimal exponent;</Item>
    <Mark>MIN_EXP</Mark> <Item>Minimal exponent;</Item>
    <Mark>INFINITY</Mark> <Item>Positive infinity;</Item>
    <Mark>NINFINITY</Mark> <Item>Negative infinity;</Item>
    <Mark>NAN</Mark> <Item>Not-a-number,</Item>
    </List>
    as well as mathematical constants <C>E</C>, <C>LOG2E</C>, <C>LOG10E</C>,
    <C>LN2</C>, <C>LN10</C>, <C>PI</C>, <C>PI_2</C>, <C>PI_4</C>,
    <C>1_PI</C>, <C>2_PI</C>, <C>2_SQRTPI</C>, <C>SQRT2</C>, <C>SQRT1_2</C>.
  </Description>
</ManSection>

<#Include Label="Float-Extra">
<#Include Label="Float-Infinities">
<#Include Label="Float-Math-Commands">

</Section>

<Section><Heading>High-precision-specific methods</Heading>

&GAP; provides a mechanism for packages to implement new
floating-point numerical interfaces. The following describes that
mechanism, actual examples of packages are documented separately.
<P/>

A package must create a record with fields (all optional) <List>
<Mark>creator</Mark> <Item>a function converting strings to floating-point;</Item>
<Mark>eager</Mark> <Item>a character allowing immediate conversion
  to floating-point;</Item>
<Mark>objbyextrep</Mark> <Item>a function creating a floating-point
  number out of a list <C>[mantissa,exponent]</C>;</Item>
<Mark>filter</Mark> <Item>a filter for the new floating-point objects;</Item>
<Mark>constants</Mark> <Item>a record containing numerical constants,
  such as <C>MANT_DIG</C>, <C>MAX</C>, <C>MIN</C>, <C>NAN</C>.</Item>
</List>
<P/>

The package must install methods <C>Int</C>, <C>Rat</C>, <C>String</C>
for its objects, and
creators <C>NewFloat(filter,IsRat)</C>, <C>NewFloat(IsString)</C>.
<P/>

It must then install methods for all arithmetic and numerical
operations: <C>SUM</C>, <C>Exp</C>, ...
<P/>

The user chooses that implementation by calling
<Ref Func="SetFloats"/> with the record as argument, and with an
optional second argument requesting a precision in binary digits.

</Section>

<Section><Heading>Complex arithmetic</Heading>

Complex arithmetic may be implemented in packages, and is present in
<Package>float</Package>. Complex numbers are treated as usual
numbers; they may be input with an extra "i" as in
<C>-0.5+0.866i</C>. They may also be created using <Ref
Constr="NewFloat"/> with three arguments: the float filter, the real
part, and the imaginary part.
<P/>

Methods should then be implemented for <C>Norm</C>, <C>RealPart</C>,
<C>ImaginaryPart</C>, <C>ComplexConjugate</C>, ...

<#Include Label="Float-Complex">

</Section>

<Section><Heading>Interval-specific methods</Heading>

Interval arithmetic may also be implemented in packages. Intervals
are in fact efficient implementations of sets of real numbers. The
only non-trivial issue is how they should be compared. The standard
<C>EQ</C> tests if the intervals are equal; however, it is usually
more useful to know if intervals overlap, or are disjoint, or are
contained in each other.
<P/>

Note the usual convention that intervals are compared as
in <M>[a,b]\leq[c,d]</M> if and only if <M>a\leq c</M> and <M>b\leq
d</M>.

<#Include Label="Float-Intervals">

</Section>

</Chapter>