1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247
|
<!DOCTYPE html
PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<!-- saved from url=(0014)about:internet -->
<html xmlns:MSHelp="http://www.microsoft.com/MSHelp/" lang="en-us" xml:lang="en-us"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta name="DC.Type" content="topic">
<meta name="DC.Title" content="Simple Example: Fibonacci Numbers">
<meta name="DC.subject" content="Simple Example: Fibonacci Numbers">
<meta name="keywords" content="Simple Example: Fibonacci Numbers">
<meta name="DC.Relation" scheme="URI" content="../tbb_userguide/The_Task_Scheduler.htm">
<meta name="DC.Relation" scheme="URI" content="Scheduler_Bypass.htm#tutorial_Scheduler_Bypass">
<meta name="DC.Relation" scheme="URI" content="How_Task_Scheduling_Works.htm#tutorial_How_Task_Scheduling_Works">
<meta name="DC.Format" content="XHTML">
<meta name="DC.Identifier" content="tutorial_Simple_Example_Fibonacci_Numbers">
<link rel="stylesheet" type="text/css" href="../intel_css_styles.css">
<title>Simple Example: Fibonacci Numbers</title>
<xml>
<MSHelp:Attr Name="DocSet" Value="Intel"></MSHelp:Attr>
<MSHelp:Attr Name="Locale" Value="kbEnglish"></MSHelp:Attr>
<MSHelp:Attr Name="TopicType" Value="kbReference"></MSHelp:Attr>
</xml>
</head>
<body id="tutorial_Simple_Example_Fibonacci_Numbers">
<!-- ==============(Start:NavScript)================= -->
<script src="..\NavScript.js" language="JavaScript1.2" type="text/javascript"></script>
<script language="JavaScript1.2" type="text/javascript">WriteNavLink(1);</script>
<!-- ==============(End:NavScript)================= -->
<a name="tutorial_Simple_Example_Fibonacci_Numbers"><!-- --></a>
<h1 class="topictitle1">Simple Example: Fibonacci Numbers</h1>
<div>
<p>This section uses computation of the
<var>n</var>th Fibonacci number as an example. This example uses
an inefficient method to compute Fibonacci numbers, but it demonstrates the
basics of a task library using a simple recursive pattern. To get scalable
speedup out of task-based programming, you need to specify a lot of tasks. This
is typically done in Intel® TBB with a recursive task pattern.
</p>
<p>This is the serial code:
</p>
<pre>long SerialFib( long n ) {
if( n<2 )
return n;
else
return SerialFib(n-1)+SerialFib(n-2);
}</pre>
<p>The top-level code for the parallel task-based version is:
</p>
<pre>long ParallelFib( long n ) {
long sum;
FibTask& a = *new(task::allocate_root()) FibTask(n,&sum);
task::spawn_root_and_wait(a);
return sum;
}</pre>
<p>This code uses a task of type
<samp class="codeph">FibTask</samp> to do the real work. It involves the following
distinct steps:
</p>
<ol>
<li>
<p>Allocate space for the task. This is done by a special "overloaded
new" and method
<span class="option">task::allocate_root</span>. The
<samp class="codeph">_root</samp> suffix in the name denotes the fact that the task
created has no parent. It is the root of a task tree. Tasks must be allocated
by special methods so that the space can be efficiently recycled when the task
completes.
</p>
</li>
<li>
<p>Construct the task with the constructor
<samp class="codeph">FibTask(n,&sum)</samp> invoked by
<samp class="codeph">new</samp>. When the task is run in step 3, it computes the
nth Fibonacci number and stores it into
<samp class="codeph">*sum</samp>.
</p>
</li>
<li>
<p>Run the task to completion with
<samp class="codeph">task::spawn_root_and_wait</samp>.
</p>
</li>
</ol>
<p>The real work is inside struct
<samp class="codeph">FibTask</samp>. Its definition is shown below.
</p>
<pre>class FibTask: public task {
public:
const long n;
long* const sum;
FibTask( long n_, long* sum_ ) :
n(n_), sum(sum_)
{}
task* execute() { // Overrides virtual function task::execute
if( n<CutOff ) {
*sum = SerialFib(n);
} else {
long x, y;
FibTask& a = *new( allocate_child() ) FibTask(n-1,&x);
FibTask& b = *new( allocate_child() ) FibTask(n-2,&y);
// Set ref_count to 'two children plus one for the wait".
set_ref_count(3);
// Start b running.
spawn( b );
// Start a running and wait for all children (a and b).
spawn_and_wait_for_all(a);
// Do the sum
*sum = x+y;
}
return NULL;
}
};</pre>
<p>It is a relatively large piece of code, compared to
<samp class="codeph">SerialFib</samp>, because it expresses parallelism without the
help of any extensions to standard C++.
</p>
<p>Like all tasks scheduled by Intel® TBB,
<samp class="codeph">FibTask</samp> is derived from class
<samp class="codeph">task</samp>. Fields
<samp class="codeph">n</samp> and
<samp class="codeph">sum</samp> hold respectively the input value and pointer to the
output. These are copies of the arguments passed to the constructor for
<samp class="codeph">FibTask</samp>. Method
<samp class="codeph">execute</samp> does the actual computation. Every task must
provide a definition of
<samp class="codeph">execute</samp> that overrides the pure virtual method
<span class="option">task::execute</span>. The definition should do the work of the
task, and return either NULL, or a pointer to the next task to run. In this
simple example, it returns NULL. For more information on the non-NULL case see
<strong>Scheduler Bypass</strong>.
</p>
<p>Method
<span class="option">FibTask::execute()</span>does the following:
</p>
<ul type="disc">
<li>
<p>Checks if
<samp class="codeph">n</samp> is so small that serial execution would be faster.
Finding the right value of
<samp class="codeph">CutOff</samp> requires some experimentation. A value of at
least 16 works well in practice for getting most of the possible speedup out of
this example. Resorting to a sequential algorithm when the problem size becomes
small is characteristic of most divide-and-conquer patterns for parallelism.
Finding the point at which to switch requires experimentation, so be sure to
write your code in a way that allows you to experiment.
</p>
</li>
<li>
<p>If the
<samp class="codeph">else</samp> is taken, the code creates and runs two child
tasks that compute the (<samp class="codeph">n</samp>-1)th and (<samp class="codeph">n</samp>-2)th
Fibonacci numbers. Here, inherited method
<samp class="codeph">allocate_child()</samp> is used to allocate space for the
task. Remember that the top-level routine
<samp class="codeph">ParallelFib</samp> used
<samp class="codeph">allocate_root()</samp> to allocate space for a task. The
difference is that here the task is creating
<em>child</em> tasks. This relationship is indicated by the choice of
allocation method.
</p>
</li>
<li>
<p>Calls
<samp class="codeph">set_ref_count(3)</samp>. The number
<samp class="codeph">3</samp> represents the two children and an additional
implicit reference that is required by method
<samp class="codeph">spawn_and_wait_for_all</samp>. Make sure to call
<samp class="codeph">set_reference_count(3)</samp> before spawning any children.
Failure to do so results in undefined behavior. The debug version of the
library usually detects and reports this type of error.
</p>
</li>
<li>
<p>Spawns two child tasks. Spawning a task indicates to the scheduler
that it can run the task whenever it chooses, possibly in parallel with other
tasks. For more information on the execution policy see
<strong>How Task Scheduling Works</strong>. The first spawning, by method
<samp class="codeph">spawn</samp>, returns immediately without waiting for the
child task to start executing. The second spawning, by method
<samp class="codeph">spawn_and_wait_for_all</samp>, causes the parent to wait
until all currently allocated child tasks are finished.
</p>
</li>
<li>
<p>After the two child tasks complete, the parent computes
<samp class="codeph">x+y</samp> and stores it in
<samp class="codeph">*sum</samp>.
</p>
</li>
</ul>
<p>At first glance, the parallelism might appear to be limited, because the
task creates only two child tasks. The trick here is
<em>recursive parallelism.</em> The two child tasks each create two child
tasks, and so on, until
<samp class="codeph">n<Cutoff</samp>. This chain reaction creates a lot of
potential parallelism. The advantage of the task scheduler is that it turns
this potential parallelism into real parallelism in a very efficient way,
because it chooses tasks to run in a way that keeps physical threads busy with
relatively little context switching.
</p>
</div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="../tbb_userguide/The_Task_Scheduler.htm">The Task Scheduler</a></div>
</div>
<div class="See Also">
<h2>See Also</h2>
<div class="linklist">
<div><a href="Scheduler_Bypass.htm#tutorial_Scheduler_Bypass">Scheduler Bypass
</a></div>
<div><a href="How_Task_Scheduling_Works.htm#tutorial_How_Task_Scheduling_Works">How Task Scheduling Works
</a></div></div>
</div>
</body>
</html>
|