File: limits.html

<!DOCTYPE html>
<html class="writer-html5" lang="en">
<head>
  <meta charset="utf-8" />
  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
  <title>8.6. Overloading and Oversubscribing &mdash; PMIx Reference Run Time Environment 3.0.8 documentation</title>
      <link rel="stylesheet" type="text/css" href="../_static/pygments.css" />
      <link rel="stylesheet" type="text/css" href="../_static/css/theme.css" />

  
  <!--[if lt IE 9]>
    <script src="../_static/js/html5shiv.min.js"></script>
  <![endif]-->
  
        <script data-url_root="../" id="documentation_options" src="../_static/documentation_options.js"></script>
        <script src="../_static/jquery.js"></script>
        <script src="../_static/underscore.js"></script>
        <script src="../_static/_sphinx_javascript_frameworks_compat.js"></script>
        <script src="../_static/doctools.js"></script>
        <script src="../_static/sphinx_highlight.js"></script>
    <script src="../_static/js/theme.js"></script>
    <link rel="index" title="Index" href="../genindex.html" />
    <link rel="search" title="Search" href="../search.html" />
    <link rel="next" title="8.7. Diagnostics" href="diagnostics.html" />
    <link rel="prev" title="8.5. Fundamentals" href="fundamentals.html" /> 
</head>

<body class="wy-body-for-nav"> 
  <div class="wy-grid-for-nav">
    <nav data-toggle="wy-nav-shift" class="wy-nav-side">
      <div class="wy-side-scroll">
        <div class="wy-side-nav-search" >

          
          
          <a href="../index.html" class="icon icon-home">
            PMIx Reference Run Time Environment
          </a>
<div role="search">
  <form id="rtd-search-form" class="wy-form" action="../search.html" method="get">
    <input type="text" name="q" placeholder="Search docs" aria-label="Search docs" />
    <input type="hidden" name="check_keywords" value="yes" />
    <input type="hidden" name="area" value="default" />
  </form>
</div>
        </div><div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="Navigation menu">
              <ul class="current">
<li class="toctree-l1"><a class="reference internal" href="../quickstart.html">1. Quick start</a></li>
<li class="toctree-l1"><a class="reference internal" href="../release-notes.html">2. Release Notes</a></li>
<li class="toctree-l1"><a class="reference internal" href="../getting-help.html">3. Getting help</a></li>
<li class="toctree-l1"><a class="reference internal" href="../install.html">4. Installing PRRTE</a></li>
<li class="toctree-l1"><a class="reference internal" href="../configuration.html">5. PRRTE DVM Configuration</a></li>
<li class="toctree-l1"><a class="reference internal" href="../how-things-work/index.html">6. How Things Work</a></li>
<li class="toctree-l1"><a class="reference internal" href="../hosts/index.html">7. Host specification</a></li>
<li class="toctree-l1 current"><a class="reference internal" href="index.html">8. Process placement</a><ul class="current">
<li class="toctree-l2"><a class="reference internal" href="overview.html">8.1. Overview</a></li>
<li class="toctree-l2"><a class="reference internal" href="overview.html#definition-of-slot">8.2. Definition of ‘slot’</a></li>
<li class="toctree-l2"><a class="reference internal" href="overview.html#definition-of-processor-element">8.3. Definition of “processor element”</a></li>
<li class="toctree-l2"><a class="reference internal" href="examples.html">8.4. Examples</a></li>
<li class="toctree-l2"><a class="reference internal" href="fundamentals.html">8.5. Fundamentals</a></li>
<li class="toctree-l2 current"><a class="current reference internal" href="#">8.6. Overloading and Oversubscribing</a><ul>
<li class="toctree-l3"><a class="reference internal" href="#overloading-vs-oversubscription-package-example">8.6.1. Overloading vs. Oversubscription: Package Example</a></li>
<li class="toctree-l3"><a class="reference internal" href="#overloading-vs-oversubscription-hardware-threads-example">8.6.2. Overloading vs. Oversubscription: Hardware Threads Example</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="diagnostics.html">8.7. Diagnostics</a></li>
<li class="toctree-l2"><a class="reference internal" href="rankfiles.html">8.8. Rankfiles</a></li>
<li class="toctree-l2"><a class="reference internal" href="deprecated.html">8.9. Deprecated options</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="../notifications.html">9. Notifications</a></li>
<li class="toctree-l1"><a class="reference internal" href="../session-directory.html">10. Session directory</a></li>
<li class="toctree-l1"><a class="reference internal" href="../developers/index.html">11. Developer’s guide</a></li>
<li class="toctree-l1"><a class="reference internal" href="../contributing.html">12. Contributing to PRRTE</a></li>
<li class="toctree-l1"><a class="reference internal" href="../license.html">13. License</a></li>
<li class="toctree-l1"><a class="reference internal" href="../man/index.html">14. PRRTE manual pages</a></li>
<li class="toctree-l1"><a class="reference internal" href="../versions.html">15. Software Version Numbers</a></li>
<li class="toctree-l1"><a class="reference internal" href="../news/index.html">16. News</a></li>
</ul>

        </div>
      </div>
    </nav>

    <section data-toggle="wy-nav-shift" class="wy-nav-content-wrap"><nav class="wy-nav-top" aria-label="Mobile navigation menu" >
          <i data-toggle="wy-nav-top" class="fa fa-bars"></i>
          <a href="../index.html">PMIx Reference Run Time Environment</a>
      </nav>

      <div class="wy-nav-content">
        <div class="rst-content">
          <div role="navigation" aria-label="Page navigation">
  <ul class="wy-breadcrumbs">
      <li><a href="../index.html" class="icon icon-home" aria-label="Home"></a></li>
          <li class="breadcrumb-item"><a href="index.html"><span class="section-number">8. </span>Process placement</a></li>
      <li class="breadcrumb-item active"><span class="section-number">8.6. </span>Overloading and Oversubscribing</li>
      <li class="wy-breadcrumbs-aside">
            <a href="../_sources/placement/limits.rst.txt" rel="nofollow"> View page source</a>
      </li>
  </ul>
  <hr/>
</div>
          <div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
           <div itemprop="articleBody">
             
  <style>
.wy-table-responsive table td,.wy-table-responsive table th{white-space:normal}
</style><div class="section" id="overloading-and-oversubscribing">
<h1><span class="section-number">8.6. </span>Overloading and Oversubscribing<a class="headerlink" href="#overloading-and-oversubscribing" title="Permalink to this heading"></a></h1>
<p>This section explains the difference between “overloading” and
“oversubscribing”, two terms that users often confuse, and works
through a number of scenarios that illustrate each one.</p>
<ul class="simple">
<li><p><code class="docutils literal notranslate"><span class="pre">--map-by</span> <span class="pre">:OVERSUBSCRIBE</span></code> allows more processes on a node than
the number of slots allocated to it</p></li>
<li><p><code class="docutils literal notranslate"><span class="pre">--bind-to</span> <span class="pre">&lt;object&gt;:overload-allowed</span></code> allows more than one
process to be bound to the same CPU</p></li>
</ul>
<p>The important thing to remember with <em>oversubscribing</em> is that the
oversubscription limit (the number of slots) can be defined separately
from the actual number of CPUs on a node. This
allows the mapper to place more or fewer processes per node than
CPUs. By default, PRRTE uses cores to determine slots in the absence
of such information provided in the hostfile or by the resource
manager (except in the case of the <code class="docutils literal notranslate"><span class="pre">--host</span></code> option, as described in
the section on that command line option).</p>
<p>The important thing to remember with <em>overloading</em> is that it is
defined as binding more processes to an object than that object has
CPUs. By default, PRRTE counts cores as CPUs. However, the user can
adjust this. For example, when the <code class="docutils literal notranslate"><span class="pre">:HWTCPUS</span></code> qualifier is given to the
<code class="docutils literal notranslate"><span class="pre">--map-by</span></code> option, PRRTE counts hardware threads as CPUs
instead.</p>
<p>For the following examples consider a node with:</p>
<ul class="simple">
<li><p>2 processor packages,</p></li>
<li><p>10 cores per package, and</p></li>
<li><p>8 hardware threads per core.</p></li>
</ul>
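<p>The CPU counts used in the rest of this section follow directly from
this topology. As a minimal sketch (plain Python arithmetic, not a PRRTE
API; the variable names are illustrative only):</p>

```python
# Topology of the example node (values taken from the list above).
packages = 2
cores_per_package = 10
threads_per_core = 8

# Default: cores are counted as CPUs.
cores_per_node = packages * cores_per_package

# With the :HWTCPUS qualifier: hardware threads are counted as CPUs.
hwthreads_per_package = cores_per_package * threads_per_core
hwthreads_per_node = packages * hwthreads_per_package

print(cores_per_node)         # 20
print(hwthreads_per_package)  # 80
print(hwthreads_per_node)     # 160
```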
<p>Consider the node from above with the hostfile below:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span>$ cat myhostfile
node01 slots=32
node02 slots=32
</pre></div>
</div>
<p>The <code class="docutils literal notranslate"><span class="pre">slots</span></code> token tells PRRTE that it can place up to 32 processes
on each node before <em>oversubscribing</em> that node.</p>
<p>If we run the following:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">prun</span> <span class="o">--</span><span class="n">np</span> <span class="mi">34</span> <span class="o">--</span><span class="n">hostfile</span> <span class="n">myhostfile</span> <span class="o">--</span><span class="nb">map</span><span class="o">-</span><span class="n">by</span> <span class="n">core</span> <span class="o">--</span><span class="n">bind</span><span class="o">-</span><span class="n">to</span> <span class="n">core</span> <span class="n">hostname</span>
</pre></div>
</div>
<p>This will return an error at binding time indicating an
<em>overloading</em> scenario.</p>
<p>The mapping mechanism assigns 32 processes to <code class="docutils literal notranslate"><span class="pre">node01</span></code> matching the
<code class="docutils literal notranslate"><span class="pre">slots</span></code> specification in the hostfile. The binding mechanism will bind
the first 20 processes to unique cores leaving it with 12 processes
that it cannot bind without overloading one of the cores (putting more
than one process on the core).</p>
<p>Using the <code class="docutils literal notranslate"><span class="pre">overload-allowed</span></code> qualifier to the <code class="docutils literal notranslate"><span class="pre">--bind-to</span> <span class="pre">core</span></code>
option tells PRRTE that it may assign more than one process to a core.</p>
<p>If we run the following:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">prun</span> <span class="o">--</span><span class="n">np</span> <span class="mi">34</span> <span class="o">--</span><span class="n">hostfile</span> <span class="n">myhostfile</span> <span class="o">--</span><span class="nb">map</span><span class="o">-</span><span class="n">by</span> <span class="n">core</span> <span class="o">--</span><span class="n">bind</span><span class="o">-</span><span class="n">to</span> <span class="n">core</span><span class="p">:</span><span class="n">overload</span><span class="o">-</span><span class="n">allowed</span> <span class="n">hostname</span>
</pre></div>
</div>
<p>This runs correctly, placing 32 processes on <code class="docutils literal notranslate"><span class="pre">node01</span></code> and 2
processes on <code class="docutils literal notranslate"><span class="pre">node02</span></code>. On <code class="docutils literal notranslate"><span class="pre">node01</span></code>, two processes are bound to
each of cores 0-11, which accounts for the overloading of those cores.</p>
<p>Alternatively, we could use hardware threads, giving the binding
mechanism a lower-level CPU to bind to so that no overloading is needed.</p>
<p>If we run the following:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">prun</span> <span class="o">--</span><span class="n">np</span> <span class="mi">34</span> <span class="o">--</span><span class="n">hostfile</span> <span class="n">myhostfile</span> <span class="o">--</span><span class="nb">map</span><span class="o">-</span><span class="n">by</span> <span class="n">core</span><span class="p">:</span><span class="n">HWTCPUS</span> <span class="o">--</span><span class="n">bind</span><span class="o">-</span><span class="n">to</span> <span class="n">hwthread</span> <span class="n">hostname</span>
</pre></div>
</div>
<p>This runs correctly, placing 32 processes on <code class="docutils literal notranslate"><span class="pre">node01</span></code> and 2
processes on <code class="docutils literal notranslate"><span class="pre">node02</span></code>. On <code class="docutils literal notranslate"><span class="pre">node01</span></code>, two processes are mapped to
each of cores 0-11 but bound to different hardware threads on those cores (the
logical first and second hardware thread). Thus no hardware threads
are overloaded at binding time.</p>
<p>In both of the examples above the node is not oversubscribed at
mapping time because the hostfile set the oversubscription limit to
<code class="docutils literal notranslate"><span class="pre">slots=32</span></code> for each node. It is only after we exceed that limit that
PRRTE will throw an oversubscription error.</p>
<p>Consider next if we ran the following:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">prun</span> <span class="o">--</span><span class="n">np</span> <span class="mi">66</span> <span class="o">--</span><span class="n">hostfile</span> <span class="n">myhostfile</span> <span class="o">--</span><span class="nb">map</span><span class="o">-</span><span class="n">by</span> <span class="n">core</span><span class="p">:</span><span class="n">HWTCPUS</span> <span class="o">--</span><span class="n">bind</span><span class="o">-</span><span class="n">to</span> <span class="n">hwthread</span> <span class="n">hostname</span>
</pre></div>
</div>
<p>This will return an error at mapping time indicating an
oversubscription scenario. The mapping mechanism will assign all of
the available slots (64 across 2 nodes) and be left with two processes to
map. The only way to map those processes is to exceed the number of
available slots, putting the job into an oversubscription scenario.</p>
<p>You can force PRRTE to oversubscribe the nodes by using the
<code class="docutils literal notranslate"><span class="pre">:OVERSUBSCRIBE</span></code> qualifier to the <code class="docutils literal notranslate"><span class="pre">--map-by</span></code> option as seen in the
example below:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">prun</span> <span class="o">--</span><span class="n">np</span> <span class="mi">66</span> <span class="o">--</span><span class="n">hostfile</span> <span class="n">myhostfile</span> \
    <span class="o">--</span><span class="nb">map</span><span class="o">-</span><span class="n">by</span> <span class="n">core</span><span class="p">:</span><span class="n">HWTCPUS</span><span class="p">:</span><span class="n">OVERSUBSCRIBE</span> <span class="o">--</span><span class="n">bind</span><span class="o">-</span><span class="n">to</span> <span class="n">hwthread</span> <span class="n">hostname</span>
</pre></div>
</div>
<p>This will run correctly placing 34 processes on <code class="docutils literal notranslate"><span class="pre">node01</span></code> and 32 on
<code class="docutils literal notranslate"><span class="pre">node02</span></code>.  Each process is bound to a unique hardware thread.</p>
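<p>The two checks in these examples can be sketched as follows. This is a
simplified model and not PRRTE's actual mapper or binder: it assumes nodes
are filled in hostfile order up to their slot count, and the function names
<code class="docutils literal notranslate"><span class="pre">fill_nodes</span></code> and <code class="docutils literal notranslate"><span class="pre">is_overloaded</span></code> are hypothetical.</p>

```python
def fill_nodes(num_procs, slots):
    """Place processes on nodes in order, never exceeding a node's slots.

    Returns the per-node process counts and the number of processes that
    could not be placed without oversubscribing.
    """
    placed = []
    remaining = num_procs
    for node_slots in slots:
        count = min(node_slots, remaining)
        placed.append(count)
        remaining -= count
    return placed, remaining


def is_overloaded(procs_on_node, cpus_on_node):
    """Binding overloads a node when it holds more processes than CPUs."""
    return procs_on_node > cpus_on_node


slots = [32, 32]                      # node01, node02 from the hostfile

placed, leftover = fill_nodes(34, slots)
print(placed, leftover)               # [32, 2] 0 -> not oversubscribed
print(is_overloaded(placed[0], 20))   # True: 32 processes, 20 cores
print(is_overloaded(placed[0], 160))  # False: 32 processes, 160 hwthreads

placed, leftover = fill_nodes(66, slots)
print(placed, leftover)               # [32, 32] 2 -> needs :OVERSUBSCRIBE
```

<p>Keeping the slot check and the CPU check in separate functions mirrors
the text: oversubscription is decided against slots at mapping time, while
overloading is decided against CPUs at binding time.</p>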
<div class="section" id="overloading-vs-oversubscription-package-example">
<h2><span class="section-number">8.6.1. </span>Overloading vs. Oversubscription: Package Example<a class="headerlink" href="#overloading-vs-oversubscription-package-example" title="Permalink to this heading"></a></h2>
<p>Let’s extend these examples by considering the package level.
Consider the same node as before, but with the hostfile below:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span>$ cat myhostfile
node01 slots=22
node02 slots=22
</pre></div>
</div>
<p>The lowest level CPUs are “cores” and we have 20 total (10 per
package).</p>
<p>If we run:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">prun</span> <span class="o">--</span><span class="n">np</span> <span class="mi">20</span> <span class="o">--</span><span class="n">hostfile</span> <span class="n">myhostfile</span> <span class="o">--</span><span class="nb">map</span><span class="o">-</span><span class="n">by</span> <span class="n">package</span> \
    <span class="o">--</span><span class="n">bind</span><span class="o">-</span><span class="n">to</span> <span class="n">package</span><span class="p">:</span><span class="n">REPORT</span> <span class="n">hostname</span>
</pre></div>
</div>
<p>Then 10 processes are mapped to each package, and bound at the package
level.  This is not overloading since we have 10 CPUs (cores)
available in the package at the hardware level.</p>
<p>However, if we run:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">prun</span> <span class="o">--</span><span class="n">np</span> <span class="mi">21</span> <span class="o">--</span><span class="n">hostfile</span> <span class="n">myhostfile</span> <span class="o">--</span><span class="nb">map</span><span class="o">-</span><span class="n">by</span> <span class="n">package</span> \
    <span class="o">--</span><span class="n">bind</span><span class="o">-</span><span class="n">to</span> <span class="n">package</span><span class="p">:</span><span class="n">REPORT</span> <span class="n">hostname</span>
</pre></div>
</div>
<p>Then 11 processes are mapped to the first package and 10 to the second
package.  At binding time we have an overloading scenario because
the first package receives 11 processes but has only 10 CPUs (cores)
available at the hardware level. So the first package is overloaded.</p>
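<p>The per-package counts in these two runs follow from dealing the
processes out to the packages in turn. A minimal sketch, assuming an even
round-robin distribution over the two packages (the helper
<code class="docutils literal notranslate"><span class="pre">spread_over_packages</span></code> is hypothetical, not a PRRTE function):</p>

```python
def spread_over_packages(num_procs, num_packages):
    """Deal processes out to packages round-robin; return count per package."""
    base, extra = divmod(num_procs, num_packages)
    return [base + (1 if i < extra else 0) for i in range(num_packages)]


CORES_PER_PACKAGE = 10  # CPUs per package when cores are counted as CPUs

for num_procs in (20, 21):
    counts = spread_over_packages(num_procs, 2)
    overloaded = [c > CORES_PER_PACKAGE for c in counts]
    print(num_procs, counts, overloaded)
# 20 [10, 10] [False, False]
# 21 [11, 10] [True, False]
```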
</div>
<div class="section" id="overloading-vs-oversubscription-hardware-threads-example">
<h2><span class="section-number">8.6.2. </span>Overloading vs. Oversubscription: Hardware Threads Example<a class="headerlink" href="#overloading-vs-oversubscription-hardware-threads-example" title="Permalink to this heading"></a></h2>
<p>The same reasoning applies when hardware threads are counted as CPUs.</p>
<p>Consider the same node as before, but with the hostfile below:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span>$ cat myhostfile
node01 slots=165
node02 slots=165
</pre></div>
</div>
<p>The lowest level CPUs are “hwthreads” (because we are going to use the
<code class="docutils literal notranslate"><span class="pre">:HWTCPUS</span></code> qualifier) and we have 160 total (80 per package).</p>
<p>If we re-run the command from the package example and add the <code class="docutils literal notranslate"><span class="pre">:HWTCPUS</span></code>
qualifier:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">prun</span> <span class="o">--</span><span class="n">np</span> <span class="mi">21</span> <span class="o">--</span><span class="n">hostfile</span> <span class="n">myhostfile</span> <span class="o">--</span><span class="nb">map</span><span class="o">-</span><span class="n">by</span> <span class="n">package</span><span class="p">:</span><span class="n">HWTCPUS</span> \
    <span class="o">--</span><span class="n">bind</span><span class="o">-</span><span class="n">to</span> <span class="n">package</span><span class="p">:</span><span class="n">REPORT</span> <span class="n">hostname</span>
</pre></div>
</div>
<p>Without the <code class="docutils literal notranslate"><span class="pre">:HWTCPUS</span></code> qualifier this would be overloading (as we
saw previously). The mapper places 11 processes on the first package
and 10 on the second package. The processes are still bound at the
package level. However, with the <code class="docutils literal notranslate"><span class="pre">:HWTCPUS</span></code> qualifier, it is not
overloading since we have 80 CPUs (hwthreads) available in the package
at the hardware level.</p>
<p>Alternatively, if we run:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">prun</span> <span class="o">--</span><span class="n">np</span> <span class="mi">161</span> <span class="o">--</span><span class="n">hostfile</span> <span class="n">myhostfile</span> <span class="o">--</span><span class="nb">map</span><span class="o">-</span><span class="n">by</span> <span class="n">package</span><span class="p">:</span><span class="n">HWTCPUS</span> \
    <span class="o">--</span><span class="n">bind</span><span class="o">-</span><span class="n">to</span> <span class="n">package</span><span class="p">:</span><span class="n">REPORT</span> <span class="n">hostname</span>
</pre></div>
</div>
<p>Then 81 processes are mapped to the first package and 80 to the second
package.  At binding time we have an overloading scenario because
there are only 80 CPUs (hwthreads) available in the package at the
hardware level.  So the first package is overloaded.</p>
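<p>The effect of the <code class="docutils literal notranslate"><span class="pre">:HWTCPUS</span></code> qualifier on the overloading threshold
can be summarized with a short calculation. This is a sketch under the
same even-distribution assumption as the text, not PRRTE code;
<code class="docutils literal notranslate"><span class="pre">cpus_per_package</span></code> and <code class="docutils literal notranslate"><span class="pre">first_package_load</span></code> are illustrative names.</p>

```python
import math

CORES_PER_PACKAGE = 10
THREADS_PER_CORE = 8
PACKAGES = 2


def cpus_per_package(hwtcpus):
    """CPUs in one package: hwthreads with :HWTCPUS, cores otherwise."""
    return CORES_PER_PACKAGE * (THREADS_PER_CORE if hwtcpus else 1)


def first_package_load(num_procs):
    """Processes on the first package when spread evenly over all packages."""
    return math.ceil(num_procs / PACKAGES)


# Columns: np, HWTCPUS used, load on first package, CPU limit, overloaded?
for num_procs, hwtcpus in [(21, False), (21, True), (161, True)]:
    load = first_package_load(num_procs)
    limit = cpus_per_package(hwtcpus)
    print(num_procs, hwtcpus, load, limit, load > limit)
# 21 False 11 10 True
# 21 True 11 80 False
# 161 True 81 80 True
```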
</div>
</div>


           </div>
          </div>
          <footer><div class="rst-footer-buttons" role="navigation" aria-label="Footer">
        <a href="fundamentals.html" class="btn btn-neutral float-left" title="8.5. Fundamentals" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
        <a href="diagnostics.html" class="btn btn-neutral float-right" title="8.7. Diagnostics" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
    </div>

  <hr/>

  <div role="contentinfo">
    <p>&#169; Copyright 2003-2025, The PRRTE Community.</p>
  </div>

  Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
    <a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
    provided by <a href="https://readthedocs.org">Read the Docs</a>.
   

</footer>
        </div>
      </div>
    </section>
  </div>
  <script>
      jQuery(function () {
          SphinxRtdTheme.Navigation.enable(true);
      });
  </script> 

</body>
</html>