File: parallelly-20-limit-workers.html

package info (click to toggle)
r-cran-parallelly 1.42.0-1
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid, trixie
  • size: 1,216 kB
  • sloc: ansic: 111; sh: 13; makefile: 2
file content (175 lines) | stat: -rw-r--r-- 6,117 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Parallel Workers with CPU and Memory Limited</title>
<style>
body {
  font-family: sans-serif;
  line-height: 1.6;
  padding-left: 3ex;
  padding-right: 3ex;
  background-color: white;
  color: black;
}

a {
  color: #4183C4;
  text-decoration: none;
}

h1, h2, h3 {
  margin: 2ex 0 1ex;
  padding: 0;
  font-weight: bold;
  -webkit-font-smoothing: antialiased;
  cursor: text;
  position: relative;
}

h2 {
  border-bottom: 1px solid #cccccc;
}

code {
  margin: 0 2px;
  padding: 0 5px;
  white-space: nowrap;
  border: 1px solid #eaeaea;
  background-color: #f8f8f8;
  border-radius: 3px;
}

pre code {
  margin: 0;
  padding: 0;
  white-space: pre;
  border: none;
  background: transparent;
}

pre {
  background-color: #f8f8f8;
  border: 1px solid #cccccc;
  line-height: 2.5x;
  overflow: auto;
  padding: 0.6ex 1ex;
  border-radius: 3px;
}

pre code {
  background-color: transparent;
  border: none;
}
</style>
</head>
<body>
<h1>Parallel Workers with CPU and Memory Limited</h1>
<!--
%\VignetteIndexEntry{Parallel Workers with CPU and Memory Limited}
%\VignetteAuthor{Henrik Bengtsson}
%\VignetteKeyword{R}
%\VignetteKeyword{package}
%\VignetteKeyword{vignette}
%\VignetteEngine{parallelly::selfonly}
-->
<h1>Introduction</h1>
<p>This vignette gives examples how to restrict CPU and memory usage of
parallel workers. This can useful for optimizing the performance of
the parallel workers, but also lower the risk that they overuse the
CPU and memory on the machines they are running on.</p>
<h1>Examples</h1>
<h2>Example: Linux parallel workers with a lower process priority (“nice”)</h2>
<p>On Unix, we can run any process with a lower CPU priority using the
<code>nice</code> command. This can be used when we want to lower the risk of
negatively affecting other users and processes that run on the same
machine from our R workers overusing the CPUs by mistake. To achieve
this, we can prepend <code>nice</code> to the <code>Rscript</code> call via the <code>rscript</code>
argument using. This works both on local and remote Linux machines,
e.g.</p>
<pre><code class="language-r">library(parallelly)
cl &lt;- makeClusterPSOCK(2, rscript = c(&quot;nice&quot;, &quot;*&quot;))
</code></pre>
<pre><code class="language-r">library(parallelly)
workers &lt;- rep(&quot;n1.remote.org&quot;, 2)
cl &lt;- makeClusterPSOCK(2, rscript = c(&quot;nice&quot;, &quot;*&quot;))
</code></pre>
<p>The special <code>*</code> value expands to the proper <code>Rscript</code> on the machine
where the parallel workers are launched.</p>
<h2>Example: Linux parallel workers CPU and memory limited by CGroups</h2>
<p>This example launches two parallel workers each limited to 100% CPU
quota and 50 MiB of memory using Linux CGroups management. The 100%
CPU quota limit constrain each worker to use at most one CPU worth of
processing preventing them from overusing the machine, e.g.  through
unintended nested parallelization. For more details, see <code>man systemd.resource-control</code>.</p>
<pre><code class="language-r">library(parallelly)
cl &lt;- makeClusterPSOCK(
  2L,
  rscript = c(
    &quot;systemd-run&quot;, &quot;--user&quot;, &quot;--scope&quot;,
    &quot;-p&quot;, &quot;CPUQuota=100%&quot;,
    &quot;-p&quot;, &quot;MemoryMax=50M&quot;, &quot;-p&quot;, &quot;MemorySwapMax=50M&quot;,
    &quot;*&quot;
  )
)
</code></pre>
<p>Note, depending on your CGroups configuration, a non-privileged user
may or may not be able to set the CPU quota. If not, the <code>-p CPUQuota=100%</code> will be silently ignored.</p>
<p>The 50 MiB memory limit is strict - if a worker use more than this,
the operating system will terminate the worker instantly. To
illustrate what happens, we first start by generating 1 million
numeric values each consuming 8 bytes, which in total consumes ~8 MB,
and then calculate the mean, the memory consumption is within 50-MiB
memory limit that each parallel worker has available;</p>
<pre><code class="language-r">library(parallel)
mu &lt;- clusterEvalQ(cl, { x &lt;- rnorm(n = 1e6); mean(x) })
mu &lt;- unlist(mu)
print(mu)
#&gt; [1]  0.0008072657 -0.0019693992
</code></pre>
<p>However, if we generate 10 times more values, the memory consumption
will grow to at least 80 MB, which is over then 50-MiB memory limit,
and we will get an error:</p>
<pre><code class="language-r">mu &lt;- clusterEvalQ(cl, { x &lt;- rnorm(n = 10e6); mean(x) })
#&gt; Error in unserialize(node$con) : error reading from connection
</code></pre>
<p>This is because the operating system terminated the two background R
processes, because they overused the memory. This is why the main R
process no longer can communicate with the parallel workers.  We can
see that both workers are down, by calling:</p>
<pre><code class="language-r">isNodeAlive(cl)
#&gt; [1] FALSE FALSE
</code></pre>
<p>We can use <code>cloneNode()</code> to relaunch workers that are no longer
alive, e.g.</p>
<pre><code class="language-r">is_down &lt;- !isNodeAlive(cl)
cl[is_down] &lt;- cloneNode(cl[is_down])
isNodeAlive(cl)
#&gt; [1] TRUE TRUE
</code></pre>
<h2>Example: MS Windows parallel workers with specific CPU affinities</h2>
<p>This example, works only on MS Windows machines. It launches four
local workers, where two are running on CPU Group #0 and two on CPU
Group #1.</p>
<pre><code class="language-r">library(parallelly)
rscript &lt;- I(c(
  Sys.getenv(&quot;COMSPEC&quot;), &quot;/c&quot;, 
  &quot;start&quot;, &quot;/B&quot;,
  &quot;/NODE&quot;, cpu_group=NA_integer_, 
  &quot;/AFFINITY&quot;, &quot;0xFFFFFFFFFFFFFFFE&quot;, 
  &quot;*&quot;)
)

rscript[&quot;cpu_group&quot;] &lt;- 0
cl_0 &lt;- makeClusterPSOCK(2, rscript = rscript)

rscript[&quot;cpu_group&quot;] &lt;- 1
cl_1 &lt;- makeClusterPSOCK(2, rscript = rscript)

cl &lt;- c(cl_0, cl_1)
</code></pre>
<p>The special <code>*</code> value expands to the proper <code>Rscript</code> on the machine
where the parallel workers are launched.</p>
<!-- See also: https://lovickconsulting.com/2021/11/18/running-r-clusters-on-an-amd-threadripper-3990x-in-windows-10-2/ -->
</body>
</html>