<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<!--Converted with LaTeX2HTML 96.1-h (September 30, 1996) by Nikos Drakos (nikos@cbl.leeds.ac.uk), CBLU, University of Leeds -->
<HTML>
<HEAD>
<TITLE>Estimate Execution Time</TITLE>
<META NAME="description" CONTENT="Estimate Execution Time">
<META NAME="keywords" CONTENT="slug">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">
<LINK REL=STYLESHEET HREF="slug.css">
</HEAD>
<BODY LANG="EN" >
 <A NAME="tex2html3749" HREF="node124.html"><IMG WIDTH=37 HEIGHT=24 ALIGN=BOTTOM ALT="next" SRC="http://www.netlib.org/utk/icons/next_motif.gif"></A> <A NAME="tex2html3747" HREF="node120.html"><IMG WIDTH=26 HEIGHT=24 ALIGN=BOTTOM ALT="up" SRC="http://www.netlib.org/utk/icons/up_motif.gif"></A> <A NAME="tex2html3741" HREF="node122.html"><IMG WIDTH=63 HEIGHT=24 ALIGN=BOTTOM ALT="previous" SRC="http://www.netlib.org/utk/icons/previous_motif.gif"></A> <A NAME="tex2html3751" HREF="node1.html"><IMG WIDTH=65 HEIGHT=24 ALIGN=BOTTOM ALT="contents" SRC="http://www.netlib.org/utk/icons/contents_motif.gif"></A> <A NAME="tex2html3752" HREF="node190.html"><IMG WIDTH=43 HEIGHT=24 ALIGN=BOTTOM ALT="index" SRC="http://www.netlib.org/utk/icons/index_motif.gif"></A> <BR>
<B> Next:</B> <A NAME="tex2html3750" HREF="node124.html">Determine Whether Reasonable Performance </A>
<B>Up:</B> <A NAME="tex2html3748" HREF="node120.html">Performance Evaluation</A>
<B> Previous:</B> <A NAME="tex2html3742" HREF="node122.html">Checking the BLAS and </A>
<BR> <P>
<H2><A NAME="SECTION04533000000000000000">Estimate Execution Time</A></H2>
           <A NAME="subsecestim">&#160;</A>
<P>
This section describes how to estimate the execution
time of a ScaLAPACK routine on a given platform,
using
Equation&nbsp;<A HREF="node112.html#eqntim">5.1</A> and the values provided
in Table&nbsp;<A HREF="node114.html#tabblacs">5.5</A> and Table&nbsp;<A HREF="node116.html#standardflopcount">5.8</A>.
By comparing
this estimate with experimental data, the user can determine whether
reasonable performance has been achieved and can possibly identify
performance bottlenecks, if any.
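<P>
As a minimal sketch of such an estimate, the Python fragment below evaluates a three-term model of the form
<I>T</I>(<I>N</I>,<I>P</I>) = <I>C_f</I> <I>N</I>^3/<I>P</I> <I>t_f</I> + <I>C_v</I> <I>N</I>^2/sqrt(<I>P</I>) <I>t_v</I> + <I>C_m</I> <I>N</I>/<I>NB</I> <I>t_m</I>.
This form is assumed here to match Equation&nbsp;5.1, which is not reproduced in this section.
The function names and all numeric values in the usage lines are hypothetical placeholders;
real coefficients and machine parameters should be taken from the tables cited above.

```python
import math


def estimate_time(n, p, nb, c_f, c_v, c_m, t_f, t_v, t_m):
    """Estimated execution time in seconds under an assumed three-term model.

    n, p, nb      : matrix order, number of processes, block size
    c_f, c_v, c_m : routine-dependent coefficients (flops, data volume, messages)
    t_f, t_v, t_m : machine parameters: time per flop, time per data item
                    communicated, and message latency, all in seconds
    """
    compute = c_f * n**3 / p * t_f             # floating-point work per process
    volume = c_v * n**2 / math.sqrt(p) * t_v   # data communicated
    latency = c_m * n / nb * t_m               # message start-up cost
    return compute + volume + latency


def estimate_mflops(n, c_f, seconds):
    """Convert a time estimate to an Mflop/s rate, taking c_f * n**3 as the flop count."""
    return c_f * n**3 / (seconds * 1.0e6)


# Hypothetical inputs chosen only to illustrate the calculation.
t_est = estimate_time(n=1000, p=4, nb=50,
                      c_f=2.0 / 3.0, c_v=3.5, c_m=400.0,
                      t_f=1.0e-8, t_v=1.0e-7, t_m=1.0e-4)
rate_est = estimate_mflops(1000, 2.0 / 3.0, t_est)
print(f"estimated time: {t_est:.3f} s, estimated rate: {rate_est:.1f} Mflop/s")
```

<P>
Keeping the three terms separate makes it easy to see which one dominates for a given
<I>N</I>, <I>P</I>, and <I>NB</I>, which is often the first hint about where a bottleneck lies.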
<P>
For linear system
solvers, the estimate is typically
accurate to within 50%
for moderate-sized problems (i.e., 160,000 or more
matrix elements per node).
For eigensolvers,
the estimate may be low by a factor of 2 
for moderate-sized problems and by more than that for smaller
problems.  The eigensolvers take longer because they involve
matrix-vector flops, as well as matrix-matrix flops,
and involve
substantial numbers of o(<IMG WIDTH=22 HEIGHT=15 ALIGN=BOTTOM ALT="tex2html_wrap_inline17067" SRC="img417.gif">) flops that are not
included in the approximation.
The accuracy of the performance estimates increases with the problem size.
Unfortunately,
because the ScaLAPACK eigensolvers
require more memory than the other ScaLAPACK drivers,
large problems cannot be solved; hence, execution times
are reported for small and medium-sized problems rather than
for medium-sized and large problems.
<P>
<P><A NAME="4185">&#160;</A><A NAME="tabestperf">&#160;</A><IMG WIDTH=714 HEIGHT=228 ALIGN=BOTTOM ALT="table4184" SRC="img418.gif"><BR>
<STRONG>Table 5.16:</STRONG> Estimated (Est) versus obtained (Obt) Mflop/s rates of PDGESV
          and PDPOSV on <I>P</I> nodes of the IBM SP2 computer for matrices of
          order <I>N</I> and a block size (<I>NB</I>) equal to 50<BR>
<P>
<P>
Table&nbsp;<A HREF="node123.html#tabestperf">5.16</A>
shows the estimated versus obtained Mflop/s rates for two ScaLAPACK
driver routines solving linear systems of equations on the IBM
Scalable POWERparallel&nbsp;2 computer. The results show that for these
drivers the estimated execution times are within approximately
35% of the experimental data on the SP2.
(The estimated times for the symmetric eigensolvers
and SVD codes would not be as accurate.)
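<P>
The comparison between estimated and obtained values can be sketched as a relative deviation check.
The helper names below are hypothetical, the 35% default tolerance simply mirrors the figure quoted above,
and the sample rates are made-up numbers, not entries from Table&nbsp;5.16.

```python
def relative_deviation(estimated, obtained):
    """Relative difference between an estimate and a measurement (same units)."""
    return abs(estimated - obtained) / obtained


def within_tolerance(estimated, obtained, tol=0.35):
    """True if the estimate deviates from the measurement by at most tol."""
    return tol >= relative_deviation(estimated, obtained)


# Hypothetical Mflop/s rates used only to show the calculation.
est_rate, obt_rate = 130.0, 100.0
print(f"deviation: {relative_deviation(est_rate, obt_rate):.0%}")
print("within 35%:", within_tolerance(est_rate, obt_rate))
```

<P>
The same check applies to execution times as well as to Mflop/s rates, provided both arguments use the same units.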
<P>
<BR> <HR>
<P><ADDRESS>
<I>Susan Blackford <BR>
Tue May 13 09:21:01 EDT 1997</I>
</ADDRESS>
</BODY>
</HTML>