1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128
|
<html>
<head>
<title>Peach - Parallel Each</title>
<style>
pre {
background-color: #f1f1f3;
color: #112;
padding: 10px;
font-size: 1.1em;
overflow: auto;
margin: 4px 0px;
width: 95%;
}
/* Syntax highlighting */
pre .normal {}
pre .comment { color: #005; font-style: italic; }
pre .keyword { color: #A00; font-weight: bold; }
pre .method { color: #077; }
pre .class { color: #074; }
pre .module { color: #050; }
pre .punct { color: #447; font-weight: bold; }
pre .symbol { color: #099; }
pre .string { color: #944; background: #FFE; }
pre .char { color: #F07; }
pre .ident { color: #004; }
pre .constant { color: #07F; }
pre .regex { color: #B66; background: #FEF; }
pre .number { color: #F99; }
pre .attribute { color: #5bb; }
pre .global { color: #7FB; }
pre .expr { color: #227; }
pre .escape { color: #277; }
</style>
</head>
<body style="background-color: #ecad27;">
<center><div style="border: dotted #000 1px;
background-color: #fff;
width: 745px;
text-align: left;">
<img src="Peach.sketch.png" alt="Peach">
<div style ="padding: 0px 20px 20px 20px;">
<h1>Parallel Each <small><small>
(for ruby with threads)
</small></small></h1>
<p>
It is pretty common to have iterations over Arrays that can be safely
run in parallel. With multicore chips becoming pretty common,
single threaded processing is about as cool as Pog. Unfortunately,
standard Ruby hates real threads pretty hardcore at the present time;
however, for some ruby projects alternate VMs like
<a href="http://jruby.codehaus.org/" title="JRuby: Coolest Thing Ever">
JRuby</a> do give multicores some lovin'. <i>Peach</i> exists to
make this power simple to use with minimal code changes.
</p>
<p>Functions like <tt>map</tt>, <tt>each</tt>, and <tt>delete_if</tt>
are often used in a functional, side-effect free style. If the
operation in the block is computationally intense, performance can
often be gained by multithreading the process. That's where
<i>Peach</i> comes in. In the simplest case, you are one letter away
from harnessing the power of parallelism and unlocking the secret of
a guilt-free tan. At this stage, the goggles are purely optional.
</p>
<h2>Using Peach</h2>
<p>Suppose you are going about your day job hacking away at code for
the <a href="http://en.wikipedia.org/wiki/WOPR">WOPR</a> when you
stumble upon the code:
</p>
<pre><span class=ident>cities</span><span class=punct>.</span><span class=ident>each</span> <span class=punct>{|</span><span class=ident>city</span><span class=punct>|</span> <span class=ident>thermonuclear_war</span><span class=punct>(</span><span class=ident>city</span><span class=punct>)}</span>
</pre>
<p>Clearly, the only winning move is to declare war in parallel. With
<i>Peach</i>, the new code is:
<pre>
<span class="ident">require</span> <span class="punct">'</span><span class="string">peach</span><span class="punct">'</span>
<span class=ident>cities</span><span class=punct>.</span><span class=ident>peach</span> <span class=punct>{|</span><span class=ident>city</span><span class=punct>|</span> <span class=ident>thermonuclear_war</span><span class=punct>(</span><span class=ident>city</span><span class=punct>)}</span>
</pre>
<p>
Requiring peach.rb monkey patches Array into submission.
Currently <i>Peach</i> provides <tt>peach</tt>, <tt>pmap</tt>, and
<tt>pdelete_if</tt>. Each of these functions takes an optional
argument <i>n</i>, which represents the desired number of worker
threads with the default being one thread per Array element. For
cheaper operations on a large number of elements, you probably want
to set <i>n</i> to something reasonably low.
</p>
<pre><span class="punct">(</span><span class="number">0</span><span class="punct">...</span><span class="number">10000</span><span class="punct">).</span><span class="ident">to_a</span><span class="punct">.</span><span class="ident">pmap</span><span class="punct">(</span><span class="number">4</span><span class="punct">)</span> <span class="punct">{|</span><span class="ident">x</span><span class="punct">|</span> <span class="ident">process</span><span class="punct">(</span><span class="ident">x</span><span class="punct">)}</span>
</pre>
<p>
Constructing the threads and adding on a few layers of indirection does
add a bit of overhead to the iteration especially on MRI. Keep this in
mind and remember to benchmark when unsure.
<h3>Syntax (without all the words)</h3>
<pre><span class="ident">require</span> <span class="punct">'</span><span class="string">peach</span><span class="punct">'</span>
<span class="punct">[</span><span class="number">1</span><span class="punct">,</span><span class="number">2</span><span class="punct">,</span><span class="number">3</span><span class="punct">,</span><span class="number">4</span><span class="punct">].</span><span class="ident">peach</span><span class="punct">{|</span><span class="ident">x</span><span class="punct">|</span> <span class="ident">f</span><span class="punct">(</span><span class="ident">x</span><span class="punct">)}</span> <span class="comment">#Spawns 4 threads, => [1,2,3,4]</span>
<span class="punct">[</span><span class="number">1</span><span class="punct">,</span><span class="number">2</span><span class="punct">,</span><span class="number">3</span><span class="punct">,</span><span class="number">4</span><span class="punct">].</span><span class="ident">pmap</span><span class="punct">{|</span><span class="ident">x</span><span class="punct">|</span> <span class="ident">f</span><span class="punct">(</span><span class="ident">x</span><span class="punct">)}</span> <span class="comment">#Spawns 4 threads, => [f(1),f(2),f(3),f(4)]</span>
<span class="punct">[</span><span class="number">1</span><span class="punct">,</span><span class="number">2</span><span class="punct">,</span><span class="number">3</span><span class="punct">,</span><span class="number">4</span><span class="punct">].</span><span class="ident">pdelete_if</span><span class="punct">{|</span><span class="ident">x</span><span class="punct">|</span> <span class="ident">x</span> <span class="punct">></span> <span class="number">2</span><span class="punct">}</span> <span class="comment">#Spawns 4 threads, => [3,4]</span>
<span class="punct">[</span><span class="number">1</span><span class="punct">,</span><span class="number">2</span><span class="punct">,</span><span class="number">3</span><span class="punct">,</span><span class="number">4</span><span class="punct">].</span><span class="ident">peach</span><span class="punct">(</span><span class="number">2</span><span class="punct">){|</span><span class="ident">x</span><span class="punct">|</span> <span class="ident">f</span><span class="punct">(</span><span class="ident">x</span><span class="punct">)}</span> <span class="comment">#Spawns 2 threads, => [1,2,3,4]</span>
<span class="punct">[</span><span class="number">1</span><span class="punct">,</span><span class="number">2</span><span class="punct">,</span><span class="number">3</span><span class="punct">,</span><span class="number">4</span><span class="punct">].</span><span class="ident">pmap</span><span class="punct">(</span><span class="number">2</span><span class="punct">){|</span><span class="ident">x</span><span class="punct">|</span> <span class="ident">f</span><span class="punct">(</span><span class="ident">x</span><span class="punct">)}</span> <span class="comment">#Spawns 2 threads, => [f(1),f(2),f(3),f(4)]</span>
<span class="punct">[</span><span class="number">1</span><span class="punct">,</span><span class="number">2</span><span class="punct">,</span><span class="number">3</span><span class="punct">,</span><span class="number">4</span><span class="punct">].</span><span class="ident">pdelete_if</span><span class="punct">(</span><span class="number">2</span><span class="punct">){|</span><span class="ident">x</span><span class="punct">|</span> <span class="ident">x</span> <span class="punct">></span> <span class="number">2</span><span class="punct">}</span> <span class="comment">#Spawns 2 threads, => [3,4]</span>
</pre>
<h2>FAQ</h2>
<p><b>Q: I use normal ruby (MRI 1.8 or 1.9), will Peach confer superpowers and great performance upon my code?</b><br/>
A: No, on MRI your code will be slightly slower because of the increased overhead for Thread creation. MRI is singlethreaded so Peach will not make it magically parallel.</p>
<p><b>Q: Why should I switch to JRuby to get the benefits of Peach?</b><br/>
A: Switching to JRuby for code that needs better performance is a good idea even without Peach. JRuby is insanely fast and a good idea. The multithreading and ability to use this humble utility is just another feature.</p>
<p><b>Q: Benchmarks?</b><br/>
A: I am pretty bad at benchmarking code, but I do have a simple test comparing performance between <tt>map</tt> and <tt>pmap</tt> on MRI and JRuby. Headius helped in the preparation of these materials. <a href="http://pastie.caboo.se/177240">Check it out</a>. JRuby on Java 1.6 is <a href="http://pastie.caboo.se/177263">even faster</a>. If you come up with any benchmarks, do let me know.</p>
<h2>Eat a Peach</h2>
<p>
<i>Peach</i> is distributed as a gem from github, so:<br/>
<tt>gem install schleyfox-peach --source=http://gems.github.com</tt>.
<ul>
<li>Project Page: <a href="http://rubyforge.org/projects/peach/">
RubyForge</a>, <a href="http://github.com/schleyfox/peach">Github</a></li>
</ul>
</p>
</div>
</div></center>
</body>
</html>
|