% \VignetteIndexEntry{Getting Started with doParallel and foreach}
% \VignetteDepends{doParallel}
% \VignetteDepends{foreach}
% \VignettePackage{doParallel}
\documentclass[12pt]{article}
\usepackage{amsmath}
\usepackage[pdftex]{graphicx}
\usepackage{color}
\usepackage{xspace}
\usepackage{url}
\usepackage{fancyvrb}
\usepackage{fancyhdr}
\usepackage[
colorlinks=true,
linkcolor=blue,
citecolor=blue,
urlcolor=blue]
{hyperref}
\usepackage{lscape}
\usepackage{Sweave}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% define new colors for use
\definecolor{darkgreen}{rgb}{0,0.6,0}
\definecolor{darkred}{rgb}{0.6,0.0,0}
\definecolor{lightbrown}{rgb}{1,0.9,0.8}
\definecolor{brown}{rgb}{0.6,0.3,0.3}
\definecolor{darkblue}{rgb}{0,0,0.8}
\definecolor{darkmagenta}{rgb}{0.5,0,0.5}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\newcommand{\bld}[1]{\mbox{\boldmath $#1$}}
\newcommand{\shell}[1]{\mbox{$#1$}}
\renewcommand{\vec}[1]{\mbox{\bf {#1}}}
\newcommand{\ReallySmallSpacing}{\renewcommand{\baselinestretch}{.6}\Large\normalsize}
\newcommand{\SmallSpacing}{\renewcommand{\baselinestretch}{1.1}\Large\normalsize}
\newcommand{\halfs}{\frac{1}{2}}
\setlength{\oddsidemargin}{-.25 truein}
\setlength{\evensidemargin}{0truein}
\setlength{\topmargin}{-0.2truein}
\setlength{\textwidth}{7 truein}
\setlength{\textheight}{8.5 truein}
\setlength{\parindent}{0.20truein}
\setlength{\parskip}{0.10truein}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\pagestyle{fancy}
\lhead{}
\chead{Getting Started with doParallel and foreach}
\rhead{}
\lfoot{}
\cfoot{}
\rfoot{\thepage}
\renewcommand{\headrulewidth}{1pt}
\renewcommand{\footrulewidth}{1pt}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\title{Getting Started with doParallel and foreach}
\author{Steve Weston\footnote{Steve Weston wrote the original version of this vignette for the doMC package. Rich Calaway
adapted the vignette for doParallel.} and Rich Calaway \\ doc@revolutionanalytics.com}
\begin{document}
\maketitle
\thispagestyle{empty}
\section{Introduction}
The \texttt{doParallel} package is a ``parallel backend'' for the
\texttt{foreach} package. It provides the mechanism needed to execute
\texttt{foreach} loops in parallel. The \texttt{foreach} package must
be used in conjunction with a package such as \texttt{doParallel} in order to
execute code in parallel. The user must register a parallel backend to
use; otherwise, \texttt{foreach} will execute tasks sequentially, even
when the \texttt{\%dopar\%} operator is used.\footnote{\texttt{foreach} will
issue a warning that it is running sequentially if no parallel backend
has been registered. It will only issue this warning once, however.}
The \texttt{doParallel} package acts as an interface between \texttt{foreach}
and the \texttt{parallel} package of R 2.14.0 and later. The \texttt{parallel}
package is essentially a merger of the \texttt{multicore} package, which was
written by Simon Urbanek, and the \texttt{snow} package, which was written
by Luke Tierney and others. The \texttt{multicore} functionality supports
multiple workers only on those operating systems that
support the \texttt{fork} system call; this excludes Windows. By default,
\texttt{doParallel} uses \texttt{multicore} functionality on Unix-like
systems and \texttt{snow} functionality on Windows. Note that
the \texttt{multicore} functionality only runs tasks on a single
computer, not a cluster of computers. However, you can use the
\texttt{snow} functionality to execute on a cluster, using Unix-like
operating systems, Windows, or even a combination.
There is no benefit to using \texttt{doParallel} and \texttt{parallel}
on a machine with only one single-core processor. To get a speed
improvement, your code must run on a machine with multiple processors,
multiple cores, or both.
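
As a quick check before going further, the following sketch uses the
\texttt{detectCores} function from the \texttt{parallel} package to report
how many cores R can see. Note that the count may include logical
(hyperthreaded) cores, and that \texttt{detectCores} can return \texttt{NA}
when the number cannot be determined:
\begin{verbatim}
library(parallel)

# detectCores() may return NA if the count cannot be determined
ncores <- detectCores()
if (is.na(ncores) || ncores < 2) {
  message("Fewer than two cores detected; expect no speed improvement.")
} else {
  message("Detected ", ncores, " cores.")
}
\end{verbatim}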
\section{A word of caution}
Because the \texttt{parallel} package in \texttt{multicore} mode
starts its workers using
\texttt{fork} without doing a subsequent \texttt{exec}, it has some
limitations. Some operations cannot be performed properly by forked
processes. For example, connection objects very likely won't work.
In some cases, this could cause an object to become corrupted, and
the R session to crash.
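
One way to sidestep the connection problem is to have each task open and
close its own connection, rather than sharing one that was created in the
master process. The following is only a sketch of that idea; the temporary
files and the variable names are made up for illustration:
\begin{verbatim}
library(doParallel)
registerDoParallel(cores=2)

# Two small temporary files to read (illustrative data only)
paths <- c(tempfile(), tempfile())
for (p in paths) writeLines(c("a", "b", "c"), p)

# Each task opens, reads, and closes its own connection
nlines <- foreach(p=paths, .combine=c) %dopar% {
  con <- file(p, open="r")
  n <- length(readLines(con))
  close(con)
  n
}
nlines

unlink(paths)
\end{verbatim}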
\section{Registering the \texttt{doParallel} parallel backend}
To register \texttt{doParallel} to be used with \texttt{foreach}, you must
call the \texttt{registerDoParallel} function. If you call this with no
arguments, on Windows you will get three workers and on Unix-like
systems you will get a number of workers equal to approximately half the
number of cores on your system. You can also specify a cluster
(as created by the \texttt{makeCluster} function) or a number of cores.
The \texttt{cores} argument specifies the number of worker
processes that \texttt{doParallel} will use to execute tasks. You don't
need to specify a value for it, however. If you omit it,
\texttt{doParallel} will use the value of the ``cores'' option, as set with
the standard \texttt{options} function. If that option isn't set, then
\texttt{doParallel} will try to detect the number of cores, and use one-half
that many workers.
Remember: unless \texttt{registerDoParallel} is called, \texttt{foreach} will
{\em not} run in parallel. Simply loading the \texttt{doParallel} package is
not enough.
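
The three ways of registering described in this section can be sketched as
follows. The worker counts used here (two cores, a two-worker cluster) are
arbitrary choices for illustration, and the variable names are our own:
\begin{verbatim}
library(doParallel)

# 1. No arguments: doParallel chooses the number of workers
registerDoParallel()
n_default <- getDoParWorkers()

# 2. An explicit number of cores
registerDoParallel(cores=2)
n_cores <- getDoParWorkers()

# 3. A cluster created by makeCluster
cl <- makeCluster(2)
registerDoParallel(cl)
n_cluster <- getDoParWorkers()

# Clean up: stop the cluster and fall back to sequential execution
stopCluster(cl)
registerDoSEQ()

c(default=n_default, cores=n_cores, cluster=n_cluster)
\end{verbatim}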
\section{An example \texttt{doParallel} session}
Before we go any further, let's load \texttt{doParallel}, register it, and use
it with \texttt{foreach}. We will use \texttt{snow}-like functionality in this
vignette, so we start by loading the package and starting a cluster:
<<loadLibs>>=
library(doParallel)
cl <- makeCluster(2)
registerDoParallel(cl)
foreach(i=1:3) %dopar% sqrt(i)
@
<<echo=FALSE>>=
stopCluster(cl)
@
To use \texttt{multicore}-like functionality, we would specify the number
of cores to use instead (but note that on Windows, attempting to use more
than one core with \texttt{parallel} results in an error):
\begin{verbatim}
library(doParallel)
registerDoParallel(cores=2)
foreach(i=1:3) %dopar% sqrt(i)
\end{verbatim}
\begin{quote}
Note well that this is {\em not} a practical use of \texttt{doParallel}. This
is our ``Hello, world'' program for parallel computing. It tests that
everything is installed and set up properly, but don't expect it to run
faster than a sequential \texttt{for} loop, because it won't!
\texttt{sqrt} executes far too quickly to be worth executing in
parallel, even with a large number of iterations. With small tasks, the
overhead of scheduling the task and returning the result can be greater
than the time to execute the task itself, resulting in poor performance.
In addition, this example doesn't make use of the vector capabilities of
\texttt{sqrt}, which it must to get decent performance. This is just a
test and a pedagogical example, {\em not} a benchmark.
\end{quote}
But returning to the point of this example, you can see that it is very
simple to load \texttt{doParallel} with all of its dependencies
(\texttt{foreach}, \texttt{iterators}, \texttt{parallel}, etc.), and to
register it. For the rest of the R session, whenever you execute
\texttt{foreach} with \texttt{\%dopar\%}, the tasks will be executed
using \texttt{doParallel} and \texttt{parallel}. Note that you can register
a different parallel backend later, or deregister \texttt{doParallel} by
calling the \texttt{registerDoSEQ} function, which registers the
sequential backend.
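
For instance, switching back to sequential execution might look like the
following sketch. After \texttt{registerDoSEQ} is called, the backend should
report a single worker, and \texttt{\%dopar\%} loops run sequentially:
\begin{verbatim}
library(doParallel)

cl <- makeCluster(2)
registerDoParallel(cl)
par_name <- getDoParName()      # name of the parallel backend

# Done with parallel work: stop the cluster, then deregister
stopCluster(cl)
registerDoSEQ()
seq_name <- getDoParName()      # name of the sequential backend
seq_workers <- getDoParWorkers()

r <- foreach(i=1:3, .combine=c) %dopar% sqrt(i)  # now runs sequentially
\end{verbatim}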
\section{A more serious example}
Now that we've gotten our feet wet, let's do something a bit less
trivial. One good example is bootstrapping. Let's see how long it
takes to run 10,000 bootstrap iterations in parallel on
\Sexpr{getDoParWorkers()} cores:
<<echo=FALSE>>=
library(doParallel)
cl <- makeCluster(2)
registerDoParallel(cl)
@
<<bootpar>>=
x <- iris[which(iris[,5] != "setosa"), c(1,5)]
trials <- 10000
ptime <- system.time({
r <- foreach(icount(trials), .combine=cbind) %dopar% {
ind <- sample(100, 100, replace=TRUE)
result1 <- glm(x[ind,2]~x[ind,1], family=binomial(logit))
coefficients(result1)
}
})[3]
ptime
@
Using \texttt{doParallel} and \texttt{parallel} we were able to perform
10,000 bootstrap iterations in \Sexpr{ptime} seconds on
\Sexpr{getDoParWorkers()} cores. By changing the \texttt{\%dopar\%} to
\texttt{\%do\%}, we can run the same code sequentially to determine the
performance improvement:
<<bootseq>>=
stime <- system.time({
r <- foreach(icount(trials), .combine=cbind) %do% {
ind <- sample(100, 100, replace=TRUE)
result1 <- glm(x[ind,2]~x[ind,1], family=binomial(logit))
coefficients(result1)
}
})[3]
stime
@
The sequential version ran in \Sexpr{stime} seconds, which means the
speed up is about \Sexpr{round(stime / ptime, digits=1)} on
\Sexpr{getDoParWorkers()} workers.\footnote{If you build this vignette
yourself, you can see how well this problem runs on your hardware. None
of the times are hardcoded in this document. You can also run the same
example which is in the examples directory of the \texttt{doParallel}
distribution.} Ideally, the speed up would be \Sexpr{getDoParWorkers()},
but no multicore CPUs are ideal, and neither are the operating systems
and software that run on them.
At any rate, this is a more realistic example that is worth executing in
parallel. We do not explain what it's doing or how it works
here. We just want to give you something more substantial than the
\texttt{sqrt} example in case you want to run some benchmarks yourself.
You can also run this example on a cluster by simply reregistering
with a cluster object that specifies the nodes to use. (See the
\texttt{makeCluster} help file for more details.)
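
A minimal sketch of that reregistration is shown below. The host names are
placeholders: we use \texttt{"localhost"} twice so that the code runs on a
single machine, and you would substitute the names of your own nodes. The
number of trials is also reduced to keep the run short:
\begin{verbatim}
library(doParallel)

# One entry per worker. "localhost" keeps this sketch runnable on one
# machine; replace these with the names of your own nodes.
hosts <- c("localhost", "localhost")
cl <- makeCluster(hosts)
registerDoParallel(cl)

x <- iris[which(iris[,5] != "setosa"), c(1,5)]
trials <- 100
r <- foreach(icount(trials), .combine=cbind) %dopar% {
  ind <- sample(100, 100, replace=TRUE)
  result1 <- glm(x[ind,2]~x[ind,1], family=binomial(logit))
  coefficients(result1)
}
dim(r)

stopCluster(cl)
\end{verbatim}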
\section{Getting information about the parallel backend}
To find out how many workers \texttt{foreach} is going to use, you can
use the \texttt{getDoParWorkers} function:
<<getDoParWorkers>>=
getDoParWorkers()
@
This is a useful sanity check that you're actually running in parallel.
If you haven't registered a parallel backend, or if your machine only
has one core, \texttt{getDoParWorkers} will return one. In either case,
don't expect a speed improvement. \texttt{foreach} is clever, but it
isn't magic.
The \texttt{getDoParWorkers} function is also useful when you want the
number of tasks to be equal to the number of workers. You may want to
pass this value to an iterator constructor, for example.
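
As an illustration, the sketch below creates one task per worker, so that
each task can use the vector capabilities of \texttt{sqrt} on a whole block
of values. We use \texttt{splitIndices} from the \texttt{parallel} package
to build the blocks; that is our choice for this example, and other ways of
splitting the work would do just as well:
\begin{verbatim}
library(doParallel)
cl <- makeCluster(2)
registerDoParallel(cl)

n <- 1000
nw <- getDoParWorkers()

# Split 1:n into nw contiguous blocks, one block per worker
blocks <- splitIndices(n, nw)

# Each task applies the vectorized sqrt to its whole block
r <- foreach(idx=blocks, .combine=c) %dopar% sqrt(idx)
length(r)

stopCluster(cl)
\end{verbatim}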
You can also get the name and version of the currently registered
backend:
<<getDoParName>>=
getDoParName()
getDoParVersion()
@
<<echo=FALSE>>=
stopCluster(cl)
@
This is mostly useful for documentation purposes, or for checking that
you have the most recent version of \texttt{doParallel}.
\section{Specifying multicore options}
When using \texttt{multicore}-like functionality, the \texttt{doParallel} package allows
you to specify various options when
running \texttt{foreach} that are supported by the underlying
\texttt{mclapply} function: ``preschedule'', ``set.seed'', ``silent'',
and ``cores''. You can learn about these options from the
\texttt{mclapply} man page. They are set using the \texttt{foreach}
\texttt{.options.multicore} argument. Here's an example of how to do
that:
\begin{verbatim}
mcoptions <- list(preschedule=FALSE, set.seed=FALSE)
foreach(i=1:3, .options.multicore=mcoptions) %dopar% sqrt(i)
\end{verbatim}
The ``cores'' option allows you to temporarily override the number of
workers to use for a single \texttt{foreach} operation. This is more
convenient than having to re-register \texttt{doParallel}. If no
value of ``cores'' was specified when \texttt{doParallel} was registered, you
can also change this value dynamically using the \texttt{options}
function:
\begin{verbatim}
options(cores=2)
getDoParWorkers()
options(cores=3)
getDoParWorkers()
\end{verbatim}
If you did specify the number of cores when registering \texttt{doParallel},
the ``cores'' option is ignored:
\begin{verbatim}
registerDoParallel(4)
options(cores=2)
getDoParWorkers()
\end{verbatim}
As you can see, there are a number of options for controlling the number
of workers to use with \texttt{parallel}, but the default behaviour
usually does what you want.
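
To make the per-loop override concrete, here is a short sketch that
registers four workers but asks for only two in a single \texttt{foreach}
call. The numbers are arbitrary, and these options only apply when
\texttt{multicore}-like functionality is in use:
\begin{verbatim}
library(doParallel)
registerDoParallel(cores=4)

# Use only two workers for this one loop; the registration is unchanged
mcoptions <- list(cores=2)
r <- foreach(i=1:3, .options.multicore=mcoptions) %dopar% sqrt(i)

nworkers <- getDoParWorkers()   # still reports the registered value
\end{verbatim}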
\section{Stopping your cluster}
If you are using \texttt{snow}-like functionality, you will want to stop your
cluster when you are done using it. The \texttt{doParallel} package's
\texttt{.onUnload} function will do this automatically if the cluster was created
automatically by \texttt{registerDoParallel}, but if you created the cluster manually
you should stop it using the \texttt{stopCluster} function:
\begin{verbatim}
stopCluster(cl)
\end{verbatim}
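
If your \texttt{foreach} loop might fail with an error, you may want to
make sure the cluster is stopped anyway. One possible pattern, shown here
only as a sketch, is to wrap the loop in \texttt{tryCatch} and stop the
cluster in its \texttt{finally} clause:
\begin{verbatim}
library(doParallel)
cl <- makeCluster(2)
registerDoParallel(cl)

r <- tryCatch({
  foreach(i=1:3, .combine=c) %dopar% sqrt(i)
}, finally = {
  stopCluster(cl)   # runs whether or not the loop raised an error
  registerDoSEQ()   # avoid leaving a stopped cluster registered
})
r
\end{verbatim}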
\section{Conclusion}
The \texttt{doParallel} and \texttt{parallel} packages provide a nice,
efficient parallel programming platform for multiprocessor/multicore
computers running operating systems such as Linux and Mac OS X. They are
very easy to install, and very easy to use. In short order, an average
R programmer can start executing parallel programs, without any previous
experience in parallel computing.
\end{document}