| 12
 3
 4
 5
 6
 7
 8
 9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
 100
 101
 102
 103
 104
 105
 106
 107
 108
 109
 110
 111
 112
 113
 114
 115
 116
 117
 118
 119
 120
 121
 122
 123
 124
 125
 126
 127
 128
 129
 130
 131
 132
 133
 134
 135
 136
 137
 138
 139
 140
 141
 142
 143
 144
 145
 146
 147
 148
 149
 150
 151
 152
 153
 154
 155
 156
 157
 158
 159
 160
 161
 162
 163
 164
 165
 166
 167
 168
 169
 170
 171
 172
 173
 174
 175
 176
 177
 178
 179
 180
 181
 182
 183
 184
 185
 186
 187
 188
 189
 190
 191
 192
 193
 194
 195
 196
 197
 198
 199
 200
 201
 202
 203
 204
 205
 206
 207
 208
 209
 210
 211
 212
 213
 214
 215
 216
 217
 218
 219
 220
 221
 222
 223
 224
 225
 226
 
 | 
<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>Benchmarking tips — LLVM 13 documentation</title>
    <link rel="stylesheet" href="_static/pygments.css" type="text/css" />
    <link rel="stylesheet" href="_static/llvm-theme.css" type="text/css" />
    <script id="documentation_options" data-url_root="./" src="_static/documentation_options.js"></script>
    <script src="_static/jquery.js"></script>
    <script src="_static/underscore.js"></script>
    <script src="_static/doctools.js"></script>
    <link rel="index" title="Index" href="genindex.html" />
    <link rel="search" title="Search" href="search.html" />
    <link rel="next" title="Using ARM NEON instructions in big endian mode" href="BigEndianNEON.html" />
    <link rel="prev" title="DWARF Extensions For Heterogeneous Debugging" href="AMDGPUDwarfExtensionsForHeterogeneousDebugging.html" />
<style type="text/css">
  table.right { float: right; margin-left: 20px; }
  table.right td { border: 1px solid #ccc; }
</style>
  </head><body>
<div class="logo">
  <a href="index.html">
    <img src="_static/logo.png"
         alt="LLVM Logo" width="250" height="88"/></a>
</div>
    <div class="related" role="navigation" aria-label="related navigation">
      <h3>Navigation</h3>
      <ul>
        <li class="right" style="margin-right: 10px">
          <a href="genindex.html" title="General Index"
             accesskey="I">index</a></li>
        <li class="right" >
          <a href="BigEndianNEON.html" title="Using ARM NEON instructions in big endian mode"
             accesskey="N">next</a> |</li>
        <li class="right" >
          <a href="AMDGPUDwarfExtensionsForHeterogeneousDebugging.html" title="DWARF Extensions For Heterogeneous Debugging"
             accesskey="P">previous</a> |</li>
  <li><a href="https://llvm.org/">LLVM Home</a> | </li>
  <li><a href="index.html">Documentation</a>»</li>
          <li class="nav-item nav-item-1"><a href="UserGuides.html" accesskey="U">User Guides</a> »</li>
        <li class="nav-item nav-item-this"><a href="">Benchmarking tips</a></li> 
      </ul>
    </div>
      <div class="sphinxsidebar" role="navigation" aria-label="main navigation">
        <div class="sphinxsidebarwrapper">
<h3>Documentation</h3>
<ul class="want-points">
    <li><a href="https://llvm.org/docs/GettingStartedTutorials.html">Getting Started/Tutorials</a></li>
    <li><a href="https://llvm.org/docs/UserGuides.html">User Guides</a></li>
    <li><a href="https://llvm.org/docs/Reference.html">Reference</a></li>
</ul>
<h3>Getting Involved</h3>
<ul class="want-points">
    <li><a href="https://llvm.org/docs/Contributing.html">Contributing to LLVM</a></li>
    <li><a href="https://llvm.org/docs/HowToSubmitABug.html">Submitting Bug Reports</a></li>
    <li><a href="https://llvm.org/docs/GettingInvolved.html#mailing-lists">Mailing Lists</a></li>
    <li><a href="https://llvm.org/docs/GettingInvolved.html#irc">IRC</a></li>
    <li><a href="https://llvm.org/docs/GettingInvolved.html#meetups-and-social-events">Meetups and Social Events</a></li>
</ul>
<h3>Additional Links</h3>
<ul class="want-points">
    <li><a href="https://llvm.org/docs/FAQ.html">FAQ</a></li>
    <li><a href="https://llvm.org/docs/Lexicon.html">Glossary</a></li>
    <li><a href="https://llvm.org/pubs">Publications</a></li>
    <li><a href="https://github.com/llvm/llvm-project//">Github Repository</a></li>
</ul>
  <div role="note" aria-label="source link">
    <h3>This Page</h3>
    <ul class="this-page-menu">
      <li><a href="_sources/Benchmarking.rst.txt"
            rel="nofollow">Show Source</a></li>
    </ul>
   </div>
<div id="searchbox" style="display: none" role="search">
  <h3 id="searchlabel">Quick search</h3>
    <div class="searchformwrapper">
    <form class="search" action="search.html" method="get">
      <input type="text" name="q" aria-labelledby="searchlabel" />
      <input type="submit" value="Go" />
    </form>
    </div>
</div>
<script>$('#searchbox').show(0);</script>
        </div>
      </div>
    <div class="document">
      <div class="documentwrapper">
        <div class="bodywrapper">
          <div class="body" role="main">
            
  <div class="section" id="benchmarking-tips">
<h1>Benchmarking tips<a class="headerlink" href="#benchmarking-tips" title="Permalink to this headline">¶</a></h1>
<div class="section" id="introduction">
<h2>Introduction<a class="headerlink" href="#introduction" title="Permalink to this headline">¶</a></h2>
<p>For benchmarking a patch we want to reduce all possible sources of
noise as much as possible. How to do that is very OS dependent.</p>
<p>Note that low noise is required, but not sufficient. It does not
exclude measurement bias. See
<a class="reference external" href="https://www.cis.upenn.edu/~cis501/papers/producing-wrong-data.pdf">https://www.cis.upenn.edu/~cis501/papers/producing-wrong-data.pdf</a> for
example.</p>
</div>
<div class="section" id="general">
<h2>General<a class="headerlink" href="#general" title="Permalink to this headline">¶</a></h2>
<ul>
<li><p>Use a high resolution timer, e.g. perf under linux.</p></li>
<li><p>Run the benchmark multiple times to be able to recognize noise.</p></li>
<li><p>Disable as many processes or services as possible on the target system.</p></li>
<li><p>Disable frequency scaling, turbo boost and address space
randomization (see OS specific section).</p></li>
<li><p>Static link if the OS supports it. That avoids any variation that
might be introduced by loading dynamic libraries. This can be done
by passing <code class="docutils literal notranslate"><span class="pre">-DLLVM_BUILD_STATIC=ON</span></code> to cmake.</p></li>
<li><p>Try to avoid storage. On some systems you can use tmpfs. Putting the
program, inputs and outputs on tmpfs avoids touching a real storage
system, which can have a pretty big variability.</p>
<p>To mount it (on linux and freebsd at least):</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">mount</span> <span class="o">-</span><span class="n">t</span> <span class="n">tmpfs</span> <span class="o">-</span><span class="n">o</span> <span class="n">size</span><span class="o">=<</span><span class="n">XX</span><span class="o">></span><span class="n">g</span> <span class="n">none</span> <span class="n">dir_to_mount</span>
</pre></div>
</div>
</li>
</ul>
</div>
<div class="section" id="linux">
<h2>Linux<a class="headerlink" href="#linux" title="Permalink to this headline">¶</a></h2>
<ul>
<li><p>Disable address space randomization:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">echo</span> <span class="mi">0</span> <span class="o">></span> <span class="o">/</span><span class="n">proc</span><span class="o">/</span><span class="n">sys</span><span class="o">/</span><span class="n">kernel</span><span class="o">/</span><span class="n">randomize_va_space</span>
</pre></div>
</div>
</li>
<li><p>Set scaling_governor to performance:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="o">/</span><span class="n">sys</span><span class="o">/</span><span class="n">devices</span><span class="o">/</span><span class="n">system</span><span class="o">/</span><span class="n">cpu</span><span class="o">/</span><span class="n">cpu</span><span class="o">*/</span><span class="n">cpufreq</span><span class="o">/</span><span class="n">scaling_governor</span>
<span class="n">do</span>
  <span class="n">echo</span> <span class="n">performance</span> <span class="o">></span> <span class="o">/</span><span class="n">sys</span><span class="o">/</span><span class="n">devices</span><span class="o">/</span><span class="n">system</span><span class="o">/</span><span class="n">cpu</span><span class="o">/</span><span class="n">cpu</span><span class="o">*/</span><span class="n">cpufreq</span><span class="o">/</span><span class="n">scaling_governor</span>
<span class="n">done</span>
</pre></div>
</div>
</li>
<li><p>Use <a class="reference external" href="https://github.com/lpechacek/cpuset">https://github.com/lpechacek/cpuset</a> to reserve cpus for just the
program you are benchmarking. If using perf, leave at least 2 cores
so that perf runs in one and your program in another:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">cset</span> <span class="n">shield</span> <span class="o">-</span><span class="n">c</span> <span class="n">N1</span><span class="p">,</span><span class="n">N2</span> <span class="o">-</span><span class="n">k</span> <span class="n">on</span>
</pre></div>
</div>
<p>This will move all threads out of N1 and N2. The <code class="docutils literal notranslate"><span class="pre">-k</span> <span class="pre">on</span></code> means
that even kernel threads are moved out.</p>
</li>
<li><p>Disable the SMT pair of the cpus you will use for the benchmark. The
pair of cpu N can be found in
<code class="docutils literal notranslate"><span class="pre">/sys/devices/system/cpu/cpuN/topology/thread_siblings_list</span></code> and
disabled with:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">echo</span> <span class="mi">0</span> <span class="o">></span> <span class="o">/</span><span class="n">sys</span><span class="o">/</span><span class="n">devices</span><span class="o">/</span><span class="n">system</span><span class="o">/</span><span class="n">cpu</span><span class="o">/</span><span class="n">cpuX</span><span class="o">/</span><span class="n">online</span>
</pre></div>
</div>
</li>
<li><p>Run the program with:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">cset</span> <span class="n">shield</span> <span class="o">--</span><span class="n">exec</span> <span class="o">--</span> <span class="n">perf</span> <span class="n">stat</span> <span class="o">-</span><span class="n">r</span> <span class="mi">10</span> <span class="o"><</span><span class="n">cmd</span><span class="o">></span>
</pre></div>
</div>
<p>This will run the command after <code class="docutils literal notranslate"><span class="pre">--</span></code> in the isolated cpus. The
particular perf command runs the <code class="docutils literal notranslate"><span class="pre"><cmd></span></code> 10 times and reports
statistics.</p>
</li>
</ul>
<p>With these in place you can expect perf variations of less than 0.1%.</p>
<div class="section" id="linux-intel">
<h3>Linux Intel<a class="headerlink" href="#linux-intel" title="Permalink to this headline">¶</a></h3>
<ul>
<li><p>Disable turbo mode:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">echo</span> <span class="mi">1</span> <span class="o">></span> <span class="o">/</span><span class="n">sys</span><span class="o">/</span><span class="n">devices</span><span class="o">/</span><span class="n">system</span><span class="o">/</span><span class="n">cpu</span><span class="o">/</span><span class="n">intel_pstate</span><span class="o">/</span><span class="n">no_turbo</span>
</pre></div>
</div>
</li>
</ul>
</div>
</div>
</div>
            <div class="clearer"></div>
          </div>
        </div>
      </div>
      <div class="clearer"></div>
    </div>
    <div class="related" role="navigation" aria-label="related navigation">
      <h3>Navigation</h3>
      <ul>
        <li class="right" style="margin-right: 10px">
          <a href="genindex.html" title="General Index"
             >index</a></li>
        <li class="right" >
          <a href="BigEndianNEON.html" title="Using ARM NEON instructions in big endian mode"
             >next</a> |</li>
        <li class="right" >
          <a href="AMDGPUDwarfExtensionsForHeterogeneousDebugging.html" title="DWARF Extensions For Heterogeneous Debugging"
             >previous</a> |</li>
  <li><a href="https://llvm.org/">LLVM Home</a> | </li>
  <li><a href="index.html">Documentation</a>»</li>
          <li class="nav-item nav-item-1"><a href="UserGuides.html" >User Guides</a> »</li>
        <li class="nav-item nav-item-this"><a href="">Benchmarking tips</a></li> 
      </ul>
    </div>
    <div class="footer" role="contentinfo">
        © Copyright 2003-2021, LLVM Project.
      Last updated on 2021-09-18.
      Created using <a href="https://www.sphinx-doc.org/">Sphinx</a> 3.5.4.
    </div>
  </body>
</html>
 |