File: node133.html

package info (click to toggle)
scalapack-doc 1.5-11
links: PTS
area: main
in suites: bullseye, buster, stretch
size: 10,336 kB
ctags: 4,931
sloc: makefile: 47; sh: 18
file content (249 lines) | stat: -rw-r--r-- 15,966 bytes
parent folder | download | duplicates (4)
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<!--Converted with LaTeX2HTML 96.1-h (September 30, 1996) by Nikos Drakos (nikos@cbl.leeds.ac.uk), CBLU, University of Leeds -->
<HTML>
<HEAD>
<TITLE>Sources of Error in Numerical Calculations</TITLE>
<META NAME="description" CONTENT="Sources of Error in Numerical Calculations">
<META NAME="keywords" CONTENT="slug">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">
<LINK REL=STYLESHEET HREF="slug.css">
</HEAD>
<BODY LANG="EN" >
 <A NAME="tex2html3878" HREF="node134.html"><IMG WIDTH=37 HEIGHT=24 ALIGN=BOTTOM ALT="next" SRC="http://www.netlib.org/utk/icons/next_motif.gif"></A> <A NAME="tex2html3876" HREF="node132.html"><IMG WIDTH=26 HEIGHT=24 ALIGN=BOTTOM ALT="up" SRC="http://www.netlib.org/utk/icons/up_motif.gif"></A> <A NAME="tex2html3870" HREF="node132.html"><IMG WIDTH=63 HEIGHT=24 ALIGN=BOTTOM ALT="previous" SRC="http://www.netlib.org/utk/icons/previous_motif.gif"></A> <A NAME="tex2html3880" HREF="node1.html"><IMG WIDTH=65 HEIGHT=24 ALIGN=BOTTOM ALT="contents" SRC="http://www.netlib.org/utk/icons/contents_motif.gif"></A> <A NAME="tex2html3881" HREF="node190.html"><IMG WIDTH=43 HEIGHT=24 ALIGN=BOTTOM ALT="index" SRC="http://www.netlib.org/utk/icons/index_motif.gif"></A> <BR>
<B> Next:</B> <A NAME="tex2html3879" HREF="node134.html">New Sources of Error </A>
<B>Up:</B> <A NAME="tex2html3877" HREF="node132.html">Accuracy and Stability</A>
<B> Previous:</B> <A NAME="tex2html3871" HREF="node132.html">Accuracy and Stability</A>
<BR> <P>
<H1><A NAME="SECTION04610000000000000000">Sources of Error in Numerical Calculations</A></H1>
<A NAME="secroundoff">&#160;</A>
<P>
<A NAME="4374">&#160;</A>
<A NAME="4375">&#160;</A>
<A NAME="4376">&#160;</A>
<A NAME="4377">&#160;</A>
<A NAME="4378">&#160;</A>
The effects of two sources of error can be measured by the bounds in
this chapter: <EM>roundoff error</EM> and <EM>input error</EM>. Roundoff error
arises from rounding results of floating-point operations during
the algorithm.
Input error is error in the input to the algorithm from prior calculations or 
measurements.  We describe roundoff error first, and then input error.
<P>
Almost all the error bounds ScaLAPACK provides are multiples of 
<EM>relative machine precision</EM>,<A NAME="4382">&#160;</A><A NAME="4383">&#160;</A><A NAME="4384">&#160;</A><A NAME="4385">&#160;</A>
which we abbreviate 
by <IMG WIDTH=6 HEIGHT=8 ALIGN=BOTTOM ALT="tex2html_wrap_inline17202" SRC="img429.gif">. Relative machine precision (epsilon) bounds the roundoff in
individual floating-point operations. 
It may be loosely defined as the largest relative error 
<A NAME="4386">&#160;</A><A NAME="4387">&#160;</A>
in any floating-point operation that neither overflows nor underflows.
(Overflow means the result is too
large to represent accurately, and underflow means the result is too
small to represent accurately.) Relative machine precision (epsilon) is
available either by 
the function call
<A NAME="4388">&#160;</A><A NAME="4389">&#160;</A>PSLAMCH(ICTXT, 'Epsilon') (or simply
PSLAMCH(ICTXT, 'E'))<A NAME="tex2html1079" HREF="footnode.html#8007"><IMG  ALIGN=BOTTOM ALT="gif" SRC="http://www.netlib.org/utk/icons/foot_motif.gif"></A> in single
precision, or
by the function call PDLAMCH(ICTXT, 'Epsilon') (or PDLAMCH(ICTXT, 'E'))
in double precision. 
See section&nbsp;<A HREF="node133.html#secbackgroundfloatingpoint">6.1</A> and
Table&nbsp;<A HREF="node133.html#tabIEEEvalues">6.1</A> for a discussion of common values of machine 
epsilon.
<A NAME="4393">&#160;</A><A NAME="4394">&#160;</A>
<A NAME="4395">&#160;</A><A NAME="4396">&#160;</A>
<P>
PDLAMCH(ICTXT,`E') returns a single value for the selected machine parameter
`E' on all processes within the context ICTXT. If these processes are
running on a network of heterogeneous 
processors<A NAME="4397">&#160;</A>,
with different floating-point arithmetics, then a ``safe'' common value
is returned, the maximum value of machine epsilon for all
the processors.
<P>
In case of overflow, 
<A NAME="4398">&#160;</A>
there are two common system responses:
stopping with an error message, or returning <IMG WIDTH=29 HEIGHT=21 ALIGN=MIDDLE ALT="tex2html_wrap_inline17191" SRC="img428.gif"> and
continuing to compute. The latter is the default response of IEEE
standard floating-point arithmetic [<A HREF="node189.html#ieee754">7</A>, <A HREF="node189.html#ieee854">8</A>]
<A NAME="4400">&#160;</A>, 
the most commonly used arithmetic. It is possible to 
change this default to abort with an error message, which is
often useful for debugging.
<P>
In contrast to LAPACK, ScaLAPACK can take advantage of arithmetic
with <IMG WIDTH=29 HEIGHT=21 ALIGN=MIDDLE ALT="tex2html_wrap_inline17191" SRC="img428.gif"> 
<A NAME="4401">&#160;</A>
to accelerate the routines that compute eigenvalues
of symmetric matrices using 
PxLAIECT
(the drivers PxSYEVX and PxSYGVX, and their complex counterparts).
<A NAME="4402">&#160;</A><A NAME="4403">&#160;</A><A NAME="4404">&#160;</A><A NAME="4405">&#160;</A>
<A NAME="4406">&#160;</A><A NAME="4407">&#160;</A><A NAME="4408">&#160;</A><A NAME="4409">&#160;</A>
PxLAIECT comes in two different versions, one in which arithmetic with
<IMG WIDTH=29 HEIGHT=21 ALIGN=MIDDLE ALT="tex2html_wrap_inline17191" SRC="img428.gif"> is available (the default) and one in which it is not.
When <IMG WIDTH=29 HEIGHT=21 ALIGN=MIDDLE ALT="tex2html_wrap_inline17191" SRC="img428.gif"> is available, the inner loop of PxLAIECT is
accelerated by removing a branch to test for and avoid division by zero.
This speed advantage is realized only when arithmetic with <IMG WIDTH=29 HEIGHT=21 ALIGN=MIDDLE ALT="tex2html_wrap_inline17191" SRC="img428.gif">
is as fast as arithmetic with normalized floating-point numbers; this
is usually but not always the case [<A HREF="node189.html#demmelli93">42</A>].
The compile time flag NO_IEEE can be used during installation to run
without using <IMG WIDTH=29 HEIGHT=21 ALIGN=MIDDLE ALT="tex2html_wrap_inline17191" SRC="img428.gif">; see the ScaLAPACK Installation Guide 
for details [<A HREF="node189.html#lawn93">24</A>].
<P>
Since underflow is almost always less significant than roundoff, 
we will not consider it further in this section
(but see section&nbsp;<A HREF="node133.html#secbackgroundfloatingpoint">6.1</A>).
<P>
Bounds on <EM>input errors</EM>
may be easily incorporated
into most ScaLAPACK error bounds.
Suppose the input data is accurate to, say, five decimal digits 
(we discuss exactly what
this means in section&nbsp;<A HREF="node135.html#secnormnotation">6.3</A>). Then one simply replaces
<IMG WIDTH=6 HEIGHT=8 ALIGN=BOTTOM ALT="tex2html_wrap_inline17202" SRC="img429.gif"> by <IMG WIDTH=94 HEIGHT=30 ALIGN=MIDDLE ALT="tex2html_wrap_inline17220" SRC="img430.gif"> in the error bounds.
<P>
<B>Further Details:  Floating-Point Arithmetic</B><A NAME="secbackgroundfloatingpoint">&#160;</A>
<P>
<A NAME="4418">&#160;</A>
<A NAME="4419">&#160;</A>
Roundoff error is bounded in terms of the <EM>relative machine precision</EM>
<IMG WIDTH=6 HEIGHT=8 ALIGN=BOTTOM ALT="tex2html_wrap_inline17202" SRC="img429.gif">,<A NAME="4421">&#160;</A><A NAME="4422">&#160;</A>
which is the smallest value satisfying
<BR><IMG WIDTH=376 HEIGHT=18 ALIGN=BOTTOM ALT="displaymath17196" SRC="img431.gif"><BR>
where <I>a</I> and <I>b</I> are floating-point numbers<A NAME="4423">&#160;</A>,
<IMG WIDTH=12 HEIGHT=21 ALIGN=MIDDLE ALT="tex2html_wrap_inline17228" SRC="img432.gif"> is any one of the four operations +, -, <IMG WIDTH=9 HEIGHT=16 ALIGN=MIDDLE ALT="tex2html_wrap_inline12420" SRC="img42.gif">, and <IMG WIDTH=12 HEIGHT=20 ALIGN=MIDDLE ALT="tex2html_wrap_inline17236" SRC="img433.gif">, 
and <IMG WIDTH=65 HEIGHT=26 ALIGN=MIDDLE ALT="tex2html_wrap_inline17238" SRC="img434.gif"> is the floating-point result of <IMG WIDTH=36 HEIGHT=25 ALIGN=MIDDLE ALT="tex2html_wrap_inline17240" SRC="img435.gif">.
Relative machine precision, <IMG WIDTH=6 HEIGHT=8 ALIGN=BOTTOM ALT="tex2html_wrap_inline17202" SRC="img429.gif">, is the smallest value for which this inequality 
is true for all <IMG WIDTH=12 HEIGHT=21 ALIGN=MIDDLE ALT="tex2html_wrap_inline17228" SRC="img432.gif">, and for all <I>a</I> and <I>b</I> such that
<IMG WIDTH=36 HEIGHT=25 ALIGN=MIDDLE ALT="tex2html_wrap_inline17240" SRC="img435.gif"> is neither too large (magnitude exceeds the overflow 
threshold)<A NAME="4424">&#160;</A><A NAME="4425">&#160;</A>
nor too small
(is nonzero with magnitude less than the underflow threshold)<A NAME="4426">&#160;</A><A NAME="4427">&#160;</A>
to be represented accurately in the machine.
We also assume <IMG WIDTH=6 HEIGHT=8 ALIGN=BOTTOM ALT="tex2html_wrap_inline17202" SRC="img429.gif"> bounds the relative error in unary<A NAME="4428">&#160;</A><A NAME="4429">&#160;</A>
operations such as square root:
<BR><IMG WIDTH=352 HEIGHT=20 ALIGN=BOTTOM ALT="displaymath17197" SRC="img436.gif"><BR>
<P>
A precise characterization of <IMG WIDTH=6 HEIGHT=8 ALIGN=BOTTOM ALT="tex2html_wrap_inline17202" SRC="img429.gif"> depends on the details of the
machine arithmetic and sometimes even of the compiler.
For example, if addition and
subtraction are implemented without a guard digit,<A NAME="tex2html1107" HREF="footnode.html#4433"><IMG  ALIGN=BOTTOM ALT="gif" SRC="http://www.netlib.org/utk/icons/foot_motif.gif"></A>
we must redefine <IMG WIDTH=6 HEIGHT=8 ALIGN=BOTTOM ALT="tex2html_wrap_inline17202" SRC="img429.gif"> to be the smallest number
such that
<BR><IMG WIDTH=383 HEIGHT=18 ALIGN=BOTTOM ALT="displaymath17198" SRC="img437.gif"><BR>
<P>
In order to assure portability<A NAME="4434">&#160;</A>,
machine parameters such as relative machine precision (epsilon), the
overflow threshold and 
underflow threshold are computed at runtime by the auxiliary<A NAME="4435">&#160;</A><A NAME="4436">&#160;</A><A NAME="4437">&#160;</A>
routine PxLAMCH.
The alternative,
keeping a fixed table of machine parameter values, would degrade portability 
because the table would have to be changed when moving from one machine
or combination of machines, or even one compiler, to another.
<P>
Most machines (but not yet all) do have the same machine 
parameters because they implement IEEE Standard Floating Point Arithmetic
<A NAME="4438">&#160;</A>  
[<A HREF="node189.html#ieee754">7</A>, <A HREF="node189.html#ieee854">8</A>], which exactly specifies floating-point number
representations and operations. For
these machines, including all modern workstations and
PCs,<A NAME="tex2html1113" HREF="footnode.html#4440"><IMG  ALIGN=BOTTOM ALT="gif" SRC="http://www.netlib.org/utk/icons/foot_motif.gif"></A>
the values of these parameters are given in
Table&nbsp;<A HREF="node133.html#tabIEEEvalues">6.1</A>.
<P>
Unfortunately, machines claiming to implement IEEE arithmetic may still
compute different results from the same program and input.
Here are some examples.
Intel processors have 80-bit floating-point registers, and the fastest 
way to use them is to evaluate all results to 80-bit accuracy until they 
are stored back to memory in 32-bit or 64-bit format. 
The IBM RS/6000 has a fused multiply-add instruction that evaluates
<I>a</I>+<I>b</I>*<I>c</I> with one rounding error instead of two. The DEC Alpha's default (fast)
mode is to flush underflowed values to zero instead of returning 
subnormal numbers, which is the default demanded by the IEEE standard;
<A NAME="4442">&#160;</A>
<A NAME="4443">&#160;</A>
in this mode the DEC Alpha aborts if it encounters a subnormal number.
In all these cases machines may be made to operate absolutely identically,
for example, by rounding all intermediate results back to single or double on 
an Intel machine, or by doing subnormal arithmetic carefully and slowly
on a DEC Alpha.
These heterogeneities lead to errors encountered only in parallel computing;
see section&nbsp;<A HREF="node134.html#sec_Hetero">6.2</A> 
for further discussion.
<P>
<P><A NAME="4446">&#160;</A><A NAME="tabIEEEvalues">&#160;</A><IMG WIDTH=701 HEIGHT=155 ALIGN=BOTTOM ALT="table4445" SRC="img438.gif"><BR>
<STRONG>Table 6.1:</STRONG> Values of machine parameters in IEEE floating-point arithmetic<BR>
<P>
<P>
As stated above, we will ignore underflow in discussing error
bounds. Reference [<A HREF="node189.html#demmel84">37</A>] discusses extending error bounds 
to include underflow and shows that for many common computations, 
when underflow occurs, it 
is less significant than roundoff.
<P>
Overflow historically resulted in an error message and 
stopped execution,
in which case our error bounds would not apply.
But with IEEE floating-point arithmetic,
<A NAME="4470">&#160;</A> the default is that
overflow returns <IMG WIDTH=29 HEIGHT=21 ALIGN=MIDDLE ALT="tex2html_wrap_inline17191" SRC="img428.gif"> and execution continues.
Indeed, with IEEE arithmetic machines can continue to compute
past overflows, even division by zero, square roots of negative 
numbers, etc., by producing <IMG WIDTH=29 HEIGHT=21 ALIGN=MIDDLE ALT="tex2html_wrap_inline17191" SRC="img428.gif"> and NaN 
(``Not a Number'') symbols according to special rules of arithmetic.
The default on many systems is to continue computing with these 
symbols. Routine PxLAIECT exploits this arithmetic to accelerate
the computations of eigenvalues, as discussed above.
It is also possible to stop with an error message when overflow occurs,
a feature that is often useful for debugging.
The user should consult the system manual to see how to turn 
error messages on or off.
<A NAME="4471">&#160;</A>
<A NAME="4472">&#160;</A>
<A NAME="4473">&#160;</A>
<A NAME="4474">&#160;</A>
<A NAME="4475">&#160;</A>
<A NAME="4476">&#160;</A>
<A NAME="4477">&#160;</A>
<A NAME="4478">&#160;</A>
<A NAME="4479">&#160;</A>
<A NAME="4480">&#160;</A>
<A NAME="4481">&#160;</A>
<A NAME="4482">&#160;</A>
<A NAME="4483">&#160;</A>
<A NAME="4484">&#160;</A>
<P>
Most of our error bounds will simply be proportional 
to relative machine precision (epsilon). This means, for example, that if the
same problem in solved in double precision and single precision, 
the error bound in double precision will be smaller than the 
error bound in single precision by a factor
of <IMG WIDTH=95 HEIGHT=28 ALIGN=MIDDLE ALT="tex2html_wrap_inline17278" SRC="img439.gif">. 
In IEEE arithmetic, this 
ratio is <IMG WIDTH=132 HEIGHT=31 ALIGN=MIDDLE ALT="tex2html_wrap_inline17280" SRC="img440.gif">,
meaning that one expects the double-precision answer 
to have approximately nine more decimal digits correct 
than the single-precision answer.
<P>
Like their counterparts in LAPACK. 
ScaLAPACK routines are generally insensitive to the details of rounding, 
provided all processes perform arithmetic identically.
The one exception is PxLAIECT, as mentioned above.
The next section discusses what can happen when processes do not perform
arithmetic identically, that is, are <EM>heterogeneous</EM>.
<P>
<HR><A NAME="tex2html3878" HREF="node134.html"><IMG WIDTH=37 HEIGHT=24 ALIGN=BOTTOM ALT="next" SRC="http://www.netlib.org/utk/icons/next_motif.gif"></A> <A NAME="tex2html3876" HREF="node132.html"><IMG WIDTH=26 HEIGHT=24 ALIGN=BOTTOM ALT="up" SRC="http://www.netlib.org/utk/icons/up_motif.gif"></A> <A NAME="tex2html3870" HREF="node132.html"><IMG WIDTH=63 HEIGHT=24 ALIGN=BOTTOM ALT="previous" SRC="http://www.netlib.org/utk/icons/previous_motif.gif"></A> <A NAME="tex2html3880" HREF="node1.html"><IMG WIDTH=65 HEIGHT=24 ALIGN=BOTTOM ALT="contents" SRC="http://www.netlib.org/utk/icons/contents_motif.gif"></A> <A NAME="tex2html3881" HREF="node190.html"><IMG WIDTH=43 HEIGHT=24 ALIGN=BOTTOM ALT="index" SRC="http://www.netlib.org/utk/icons/index_motif.gif"></A> <BR>
<B> Next:</B> <A NAME="tex2html3879" HREF="node134.html">New Sources of Error </A>
<B>Up:</B> <A NAME="tex2html3877" HREF="node132.html">Accuracy and Stability</A>
<B> Previous:</B> <A NAME="tex2html3871" HREF="node132.html">Accuracy and Stability</A>
<P><ADDRESS>
<I>Susan Blackford <BR>
Tue May 13 09:21:01 EDT 1997</I>
</ADDRESS>
</BODY>
</HTML>