1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250
|
<!DOCTYPE html>
<html>
<!-- Created by GNU Texinfo 7.1.1, https://www.gnu.org/software/texinfo/ -->
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Profiler Example (GNU Octave (version 10.3.0))</title>
<meta name="description" content="Profiler Example (GNU Octave (version 10.3.0))">
<meta name="keywords" content="Profiler Example (GNU Octave (version 10.3.0))">
<meta name="resource-type" content="document">
<meta name="distribution" content="global">
<meta name="Generator" content="makeinfo">
<meta name="viewport" content="width=device-width,initial-scale=1">
<link href="index.html" rel="start" title="Top">
<link href="Concept-Index.html" rel="index" title="Concept Index">
<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
<link href="Debugging.html" rel="up" title="Debugging">
<link href="Profiling.html" rel="prev" title="Profiling">
<style type="text/css">
<!--
a.copiable-link {visibility: hidden; text-decoration: none; line-height: 0em}
div.example {margin-left: 3.2em}
span:hover a.copiable-link {visibility: visible}
-->
</style>
<link rel="stylesheet" type="text/css" href="octave.css">
</head>
<body lang="en">
<div class="section-level-extent" id="Profiler-Example">
<div class="nav-panel">
<p>
Previous: <a href="Profiling.html" accesskey="p" rel="prev">Profiling</a>, Up: <a href="Debugging.html" accesskey="u" rel="up">Debugging</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Concept-Index.html" title="Index" rel="index">Index</a>]</p>
</div>
<hr>
<h3 class="section" id="Profiler-Example-1"><span>13.7 Profiler Example<a class="copiable-link" href="#Profiler-Example-1"> ¶</a></span></h3>
<p>Below, we will give a short example of a profiler session.
See <a class="xref" href="Profiling.html">Profiling</a>, for the documentation of the profiler functions in
detail. Consider the code:
</p>
<div class="example">
<pre class="example-preformatted">global N A;
N = 300;
A = rand (N, N);
function xt = timesteps (steps, x0, expM)
global N;
if (steps == 0)
xt = NA (N, 0);
else
xt = NA (N, steps);
x1 = expM * x0;
xt(:, 1) = x1;
xt(:, 2 : end) = timesteps (steps - 1, x1, expM);
endif
endfunction
function foo ()
global N A;
initial = @(x) sin (x);
x0 = (initial (linspace (0, 2 * pi, N)))';
expA = expm (A);
xt = timesteps (100, x0, expA);
endfunction
function fib = bar (N)
if (N <= 2)
fib = 1;
else
fib = bar (N - 1) + bar (N - 2);
endif
endfunction
</pre></div>
<p>If we execute the two main functions, we get:
</p>
<div class="example">
<div class="group"><pre class="example-preformatted">tic; foo; toc;
⇒ Elapsed time is 2.37338 seconds.
tic; bar (20); toc;
⇒ Elapsed time is 2.04952 seconds.
</pre></div></div>
<p>But this does not give much information about where this time is spent;
for instance, whether the single call to <code class="code">expm</code> is more expensive
or the recursive time-stepping itself. To get a more detailed picture,
we can use the profiler.
</p>
<div class="example">
<div class="group"><pre class="example-preformatted">profile on;
foo;
profile off;
data = profile ("info");
profshow (data, 10);
</pre></div></div>
<p>This prints a table like:
</p>
<div class="example">
<div class="group"><pre class="example-preformatted"> # Function Attr Time (s) Calls
---------------------------------------------
7 expm 1.034 1
3 binary * 0.823 117
41 binary \ 0.188 1
38 binary ^ 0.126 2
43 timesteps R 0.111 101
44 NA 0.029 101
39 binary + 0.024 8
34 norm 0.011 1
40 binary - 0.004 101
33 balance 0.003 1
</pre></div></div>
<p>The entries are the individual functions which have been executed (only
the 10 most important ones), together with some information for each of
them. The entries like ‘<samp class="samp">binary *</samp>’ denote operators, while other
entries are ordinary functions. They include both built-ins like
<code class="code">expm</code> and our own routines (for instance <code class="code">timesteps</code>). From
this profile, we can immediately deduce that <code class="code">expm</code> uses up the
largest proportion of the processing time, even though it is only called
once. The second expensive operation is the matrix-vector product in the
routine <code class="code">timesteps</code>. <a class="footnote" id="DOCF6" href="#FOOT6"><sup>6</sup></a>
</p>
<p>Timing, however, is not the only information available from the profile.
The attribute column shows us that <code class="code">timesteps</code> calls itself
recursively. This may not be that remarkable in this example (since it’s
clear anyway), but could be helpful in a more complex setting. As to the
question of why is there a ‘<samp class="samp">binary \</samp>’ in the output, we can easily
shed some light on that too. Note that <code class="code">data</code> is a structure array
(<a class="ref" href="Structure-Arrays.html">Structure Arrays</a>) which contains the field <code class="code">FunctionTable</code>.
This stores the raw data for the profile shown. The number in the first
column of the table gives the index under which the shown function can
be found there. Looking up <code class="code">data.FunctionTable(41)</code> gives:
</p>
<div class="example">
<div class="group"><pre class="example-preformatted"> scalar structure containing the fields:
FunctionName = binary \
TotalTime = 0.18765
NumCalls = 1
IsRecursive = 0
Parents = 7
Children = [](1x0)
</pre></div></div>
<p>Here we see the information from the table again, but have additional
fields <code class="code">Parents</code> and <code class="code">Children</code>. Those are both arrays, which
contain the indices of functions which have directly called the function
in question (which is entry 7, <code class="code">expm</code>, in this case) or been called
by it (no functions). Hence, the backslash operator has been used
internally by <code class="code">expm</code>.
</p>
<p>Now let’s take a look at <code class="code">bar</code>. For this, we start a fresh
profiling session (<code class="code">profile on</code> does this; the old data is removed
before the profiler is restarted):
</p>
<div class="example">
<div class="group"><pre class="example-preformatted">profile on;
bar (20);
profile off;
profshow (profile ("info"));
</pre></div></div>
<p>This gives:
</p>
<div class="example">
<div class="group"><pre class="example-preformatted"> # Function Attr Time (s) Calls
-------------------------------------------------------
1 bar R 2.091 13529
2 binary <= 0.062 13529
3 binary - 0.042 13528
4 binary + 0.023 6764
5 profile 0.000 1
8 false 0.000 1
6 nargin 0.000 1
7 binary != 0.000 1
9 __profiler_enable__ 0.000 1
</pre></div></div>
<p>Unsurprisingly, <code class="code">bar</code> is also recursive. It has been called 13,529
times in the course of recursively calculating the Fibonacci number in a
suboptimal way, and most of the time was spent in <code class="code">bar</code> itself.
</p>
<p>Finally, let’s say we want to profile the execution of both <code class="code">foo</code>
and <code class="code">bar</code> together. Since we already have the run-time data
collected for <code class="code">bar</code>, we can restart the profiler without clearing
the existing data and collect the missing statistics about <code class="code">foo</code>.
This is done by:
</p>
<div class="example">
<div class="group"><pre class="example-preformatted">profile resume;
foo;
profile off;
profshow (profile ("info"), 10);
</pre></div></div>
<p>As you can see in the table below, now we have both profiles mixed
together.
</p>
<div class="example">
<div class="group"><pre class="example-preformatted"> # Function Attr Time (s) Calls
---------------------------------------------
1 bar R 2.091 13529
16 expm 1.122 1
12 binary * 0.798 117
46 binary \ 0.185 1
45 binary ^ 0.124 2
48 timesteps R 0.115 101
2 binary <= 0.062 13529
3 binary - 0.045 13629
4 binary + 0.041 6772
49 NA 0.036 101
</pre></div></div>
</div>
<div class="footnotes-segment">
<hr>
<h4 class="footnotes-heading">Footnotes</h4>
<h5 class="footnote-body-heading"><a id="FOOT6" href="#DOCF6">(6)</a></h5>
<p>We only know it is the binary
multiplication operator, but fortunately this operator appears only at
one place in the code and thus we know which occurrence takes so much
time. If there were multiple places, we would have to use the
hierarchical profile to find out the exact place which uses up the time
which is not covered in this example.</p>
</div>
<hr>
<div class="nav-panel">
<p>
Previous: <a href="Profiling.html">Profiling</a>, Up: <a href="Debugging.html">Debugging</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Concept-Index.html" title="Index" rel="index">Index</a>]</p>
</div>
</body>
</html>
|