File: optimisation_hints.html

package info (click to toggle)
xmds2 2.2.2%2Bdfsg-3
  • links: PTS, VCS
  • area: main
  • in suites: stretch
  • size: 53,384 kB
  • ctags: 7,223
  • sloc: python: 54,076; cpp: 3,929; ansic: 1,463; makefile: 135; sh: 20
file content (326 lines) | stat: -rw-r--r-- 32,789 bytes parent folder | download | duplicates (2)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">


<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    
    <title>Optimisation Hints &mdash; XMDS2 2.2.2 documentation</title>
    
    <link rel="stylesheet" href="_static/default.css" type="text/css" />
    <link rel="stylesheet" href="_static/pygments.css" type="text/css" />
    
    <script type="text/javascript">
      var DOCUMENTATION_OPTIONS = {
        URL_ROOT:    './',
        VERSION:     '2.2.2',
        COLLAPSE_INDEX: false,
        FILE_SUFFIX: '.html',
        HAS_SOURCE:  true
      };
    </script>
    <script type="text/javascript" src="_static/jquery.js"></script>
    <script type="text/javascript" src="_static/underscore.js"></script>
    <script type="text/javascript" src="_static/doctools.js"></script>
    <script type="text/javascript" src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS_HTML,http://www.xmds.org/_static/mathjax-use-tex-fonts.js"></script>
    <link rel="shortcut icon" href="_static/xmds_favicon.ico"/>
    <link rel="top" title="XMDS2 2.2.2 documentation" href="index.html" />
    <link rel="next" title="xsil2graphics2" href="xsil2graphics2.html" />
    <link rel="prev" title="Upgrading From XMDS 1.X" href="upgrade.html" /> 
  </head>
  <body>
    <div class="related">
      <h3>Navigation</h3>
      <ul>
        <li class="right" style="margin-right: 10px">
          <a href="genindex.html" title="General Index"
             accesskey="I">index</a></li>
        <li class="right" >
          <a href="xsil2graphics2.html" title="xsil2graphics2"
             accesskey="N">next</a> |</li>
        <li class="right" >
          <a href="upgrade.html" title="Upgrading From XMDS 1.X"
             accesskey="P">previous</a> |</li>
        <li><a href="index.html">XMDS2 2.2.2 documentation</a> &raquo;</li> 
      </ul>
    </div>  

    <div class="document">
      <div class="documentwrapper">
        <div class="bodywrapper">
          <div class="body">
            
  <div class="section" id="optimisation-hints">
<span id="optimisationhints"></span><h1>Optimisation Hints<a class="headerlink" href="#optimisation-hints" title="Permalink to this headline">¶</a></h1>
<p>There are a variety of things you can do to make your simulations run faster.</p>
<div class="section" id="geometry-and-transform-based-tricks">
<h2>Geometry and transform-based tricks<a class="headerlink" href="#geometry-and-transform-based-tricks" title="Permalink to this headline">¶</a></h2>
<div class="section" id="simpler-simulation-geometries">
<h3>Simpler simulation geometries<a class="headerlink" href="#simpler-simulation-geometries" title="Permalink to this headline">¶</a></h3>
<p>Consider symmetry, can you use <tt class="docutils literal"><span class="pre">dct</span></tt> transforms or <tt class="docutils literal"><span class="pre">bessel</span></tt> transforms? Do you really need that many points? How big does your grid need to be? Could absorbing boundary conditions help?</p>
</div>
<div class="section" id="tricks-for-bessel-and-hermite-gauss-transforms">
<h3>Tricks for Bessel and Hermite-Gauss transforms<a class="headerlink" href="#tricks-for-bessel-and-hermite-gauss-transforms" title="Permalink to this headline">¶</a></h3>
<p>Dimensions using matrix transforms should be first for performance reasons.  Unless you&#8217;re using MPI, in which case XMDS can work it out for the first two dimensions.  Ideally, XMDS would sort it out in all cases, but it&#8217;s not that smart yet.</p>
</div>
</div>
<div class="section" id="reduce-code-complexity">
<h2>Reduce code complexity<a class="headerlink" href="#reduce-code-complexity" title="Permalink to this headline">¶</a></h2>
<p>Avoid transcendental functions like <span class="math">\(\sin(x)\)</span> or <span class="math">\(\exp(x)\)</span> in inner loops. Not all operations are made equal, use multiplication over division.</p>
<div class="section" id="optimising-with-the-interaction-picture-ip-operator">
<span id="optimisingipoperators"></span><h3>Optimising with the Interaction Picture (IP) operator<a class="headerlink" href="#optimising-with-the-interaction-picture-ip-operator" title="Permalink to this headline">¶</a></h3>
<p>You should use the IP operator when you can. Only use the EX operator when you have to. If you must use the EX operator, consider making it <tt class="docutils literal"><span class="pre">constant=&quot;no&quot;</span></tt>. It uses less memory.
When you use the IP operator, make sure you know what it&#8217;s doing.  Do not pre- or post-multiply that term in your equations, XMDS will do a fairly thorough check to see you aren&#8217;t using the IP operator improperly, but it is possible to confuse XMDS&#8217;s check.</p>
<p>If your simulation uses two or more dimensions, check to see if your IP operator is separable, i.e. can be written in the form <span class="math">\(f(kx) + g(ky)\)</span> (this is frequently possible in atom-optics simulations). If your IP operator is separable, create separate IP operators for each dimension.  This provides a significant speedup (~30%) for simulations using an adaptive integrator.  For example, instead of using the IP operator:</p>
<div class="highlight-xpdeint"><div class="highlight"><pre><span class="nt">&lt;operator</span> <span class="na">kind=</span><span class="s">&quot;ip&quot;</span><span class="nt">&gt;</span>
  <span class="nt">&lt;operator_names&gt;</span>T<span class="nt">&lt;/operator_names&gt;</span>
  <span class="cp">&lt;![CDATA[</span>
    <span class="n">T</span> <span class="o">=</span> <span class="o">-</span><span class="kc">i</span><span class="o">*</span><span class="mf">0.5</span><span class="o">*</span><span class="n">hbar</span><span class="o">/</span><span class="n">M</span><span class="o">*</span><span class="p">(</span><span class="n">kx</span><span class="o">*</span><span class="n">kx</span> <span class="o">+</span> <span class="n">ky</span><span class="o">*</span><span class="n">ky</span><span class="p">);</span>
  <span class="cp">]]&gt;</span>
<span class="nt">&lt;/operator&gt;</span>
<span class="cp">&lt;![CDATA[</span>
  <span class="n">dpsi_dt</span> <span class="o">=</span> <span class="n">T</span><span class="p">[</span><span class="n">psi</span><span class="p">]</span> <span class="o">+</span> <span class="cm">/* other terms */</span>
<span class="cp">]]&gt;</span>
</pre></div>
</div>
<p>replace it with the pair of IP operators:</p>
<div class="highlight-xpdeint"><div class="highlight"><pre><span class="nt">&lt;operator</span> <span class="na">kind=</span><span class="s">&quot;ip&quot;</span> <span class="na">dimensions=</span><span class="s">&quot;x&quot;</span><span class="nt">&gt;</span>
  <span class="nt">&lt;operator_names&gt;</span>Tx<span class="nt">&lt;/operator_names&gt;</span>
  <span class="cp">&lt;![CDATA[</span>
    <span class="n">Tx</span> <span class="o">=</span> <span class="o">-</span><span class="kc">i</span><span class="o">*</span><span class="mf">0.5</span><span class="o">*</span><span class="n">hbar</span><span class="o">/</span><span class="n">M</span><span class="o">*</span><span class="n">kx</span><span class="o">*</span><span class="n">kx</span><span class="p">;</span>
  <span class="cp">]]&gt;</span>
<span class="nt">&lt;/operator&gt;</span>
<span class="nt">&lt;operator</span> <span class="na">kind=</span><span class="s">&quot;ip&quot;</span> <span class="na">dimensions=</span><span class="s">&quot;y&quot;</span><span class="nt">&gt;</span>
  <span class="nt">&lt;operator_names&gt;</span>Ty<span class="nt">&lt;/operator_names&gt;</span>
  <span class="cp">&lt;![CDATA[</span>
    <span class="n">Ty</span> <span class="o">=</span> <span class="o">-</span><span class="kc">i</span><span class="o">*</span><span class="mf">0.5</span><span class="o">*</span><span class="n">hbar</span><span class="o">/</span><span class="n">M</span><span class="o">*</span><span class="n">ky</span><span class="o">*</span><span class="n">ky</span><span class="p">;</span>
  <span class="cp">]]&gt;</span>
<span class="nt">&lt;/operator&gt;</span>
<span class="cp">&lt;![CDATA[</span>
  <span class="n">dpsi_dt</span> <span class="o">=</span> <span class="n">Tx</span><span class="p">[</span><span class="n">psi</span><span class="p">]</span> <span class="o">+</span> <span class="n">Ty</span><span class="p">[</span><span class="n">psi</span><span class="p">]</span> <span class="o">+</span> <span class="cm">/* other terms */</span>
<span class="cp">]]&gt;</span>
</pre></div>
</div>
<p>When using the IP operator, check if your operator is purely real or purely imaginary.  If real, (e.g. <tt class="docutils literal"><span class="pre">L</span> <span class="pre">=</span> <span class="pre">-0.5*kx</span> <span class="pre">*</span> <span class="pre">kx;</span></tt>), then add the attribute <tt class="docutils literal"><span class="pre">type=&quot;real&quot;</span></tt> to the <tt class="docutils literal"><span class="pre">&lt;operator</span> <span class="pre">kind=&quot;ip&quot;&gt;</span></tt> tag.  If purely imaginary, use <tt class="docutils literal"><span class="pre">type=&quot;imaginary&quot;</span></tt>.  This optimisation saves performing the part of the complex exponential that is unnecessary.</p>
</div>
<div class="section" id="consider-writing-the-evolution-in-spectral-basis">
<h3>Consider writing the evolution in spectral basis<a class="headerlink" href="#consider-writing-the-evolution-in-spectral-basis" title="Permalink to this headline">¶</a></h3>
<p>Evolution equations do not need to be written in the position basis.  If your equations are diagonal in the spectral basis, then it makes more sense to compute the time derivative terms in that basis.  For example, if you have the system</p>
<div class="math">
\[\begin{split}\frac{d\psi_1(x)}{dt} &amp;= i \frac{\hbar}{2M} \frac{d^2\psi_1(x)}{dx^2} - i \Omega \psi_2(x)\\
\frac{d\psi_2(x)}{dt} &amp;= i \frac{\hbar}{2M} \frac{d^2\psi_2(x)}{dx^2} - i \Omega \psi_1(x)\end{split}\]</div>
<p>then this is diagonal in the Fourier basis where it takes the form</p>
<div class="math">
\[\begin{split}\frac{d\psi_1(k_x)}{dt} &amp;= -i \frac{\hbar k_x^2}{2M} \psi_1(k_x) - i \Omega \psi_2(k_x)\\
\frac{d\psi_2(k_x)}{dt} &amp;= -i \frac{\hbar k_x^2}{2M} \psi_2(k_x) - i \Omega \psi_1(k_x)\end{split}\]</div>
<p>The first term in each evolution equation can be solved exactly with an IP operator, and the second term is diagonal in Fourier space.  This can be written in XMDS as:</p>
<div class="highlight-xpdeint"><div class="highlight"><pre><span class="nt">&lt;operators&gt;</span>
  <span class="nt">&lt;integration_vectors</span> <span class="na">basis=</span><span class="s">&quot;kx&quot;</span><span class="nt">&gt;</span>wavefunction<span class="nt">&lt;/integration_vectors&gt;</span>
  <span class="nt">&lt;operator</span> <span class="na">kind=</span><span class="s">&quot;ip&quot;</span> <span class="na">type=</span><span class="s">&quot;imaginary&quot;</span> <span class="nt">&gt;</span>
    <span class="nt">&lt;operator_names&gt;</span>Lxx<span class="nt">&lt;/operator_names&gt;</span>
    <span class="cp">&lt;![CDATA[</span>
      <span class="n">Lxx</span> <span class="o">=</span> <span class="o">-</span><span class="kc">i</span><span class="o">*</span><span class="mf">0.5</span><span class="o">*</span><span class="n">hbar_M</span><span class="o">*</span><span class="p">(</span><span class="n">kx</span><span class="o">*</span><span class="n">kx</span><span class="p">);</span>
    <span class="cp">]]&gt;</span>
  <span class="nt">&lt;/operator&gt;</span>
  <span class="cp">&lt;![CDATA[</span>

    <span class="n">dpsi0_dt</span> <span class="o">=</span> <span class="n">Lxx</span><span class="p">[</span><span class="n">psi0</span><span class="p">]</span> <span class="o">-</span> <span class="kc">i</span><span class="o">*</span><span class="n">Omega</span><span class="o">*</span><span class="n">psi1</span><span class="p">;</span>
    <span class="n">dpsi1_dt</span> <span class="o">=</span> <span class="n">Lxx</span><span class="p">[</span><span class="n">psi1</span><span class="p">]</span> <span class="o">-</span> <span class="kc">i</span><span class="o">*</span><span class="n">Omega</span><span class="o">*</span><span class="n">psi0</span><span class="p">;</span>

  <span class="cp">]]&gt;</span>
<span class="nt">&lt;/operators&gt;</span>
</pre></div>
</div>
<p>Although the <tt class="docutils literal"><span class="pre">dpsi0_dt</span></tt> code reads the same in position and Fourier space, it is the <tt class="docutils literal"><span class="pre">basis=kx</span></tt> attribute on <tt class="docutils literal"><span class="pre">&lt;integration_vectors&gt;</span></tt> that causes the evolution code to be executed in Fourier space.</p>
<p>A final optimisation is to cause the integration code itself to operate in Fourier space.  By default, all time stepping (i.e. <span class="math">\(f(t + \Delta t) = f(t) + f'(t) \Delta t\)</span> for forward-Euler integration) occurs in the position space.  As the derivative terms can be computed in Fourier space, it is faster to also to the time stepping in Fourier space too.  This then means that no Fourier transforms will be needed at all during this integrate block (except as needed by sampling).  To cause time-stepping to happen in Fourier space, we add the <tt class="docutils literal"><span class="pre">home_space=&quot;k&quot;</span></tt> attribute to the surrounding <tt class="docutils literal"><span class="pre">&lt;integrate&gt;</span></tt> block.  By default, <tt class="docutils literal"><span class="pre">home_space</span></tt> has the value <tt class="docutils literal"><span class="pre">&quot;x&quot;</span></tt> which means position space, even if you don&#8217;t have an <tt class="docutils literal"><span class="pre">x</span></tt> dimension.</p>
<p>The fully optimised code then reads:</p>
<div class="highlight-xpdeint"><div class="highlight"><pre><span class="nt">&lt;integrate</span> <span class="na">algorithm=</span><span class="s">&quot;ARK45&quot;</span> <span class="na">interval=</span><span class="s">&quot;1&quot;</span> <span class="na">tolerance=</span><span class="s">&quot;1e-6&quot;</span> <span class="na">home_space=</span><span class="s">&quot;k&quot;</span><span class="nt">&gt;</span>
  <span class="nt">&lt;samples&gt;</span> 10 <span class="nt">&lt;/samples&gt;</span>
  <span class="nt">&lt;operators&gt;</span>
    <span class="nt">&lt;integration_vectors</span> <span class="na">basis=</span><span class="s">&quot;kx&quot;</span><span class="nt">&gt;</span>wavefunction<span class="nt">&lt;/integration_vectors&gt;</span>
    <span class="nt">&lt;operator</span> <span class="na">kind=</span><span class="s">&quot;ip&quot;</span> <span class="na">type=</span><span class="s">&quot;imaginary&quot;</span> <span class="nt">&gt;</span>
      <span class="nt">&lt;operator_names&gt;</span>Lxx<span class="nt">&lt;/operator_names&gt;</span>
      <span class="cp">&lt;![CDATA[</span>
        <span class="n">Lxx</span> <span class="o">=</span> <span class="o">-</span><span class="kc">i</span><span class="o">*</span><span class="mf">0.5</span><span class="o">*</span><span class="n">hbar_M</span><span class="o">*</span><span class="p">(</span><span class="n">kx</span><span class="o">*</span><span class="n">kx</span><span class="p">);</span>
      <span class="cp">]]&gt;</span>
    <span class="nt">&lt;/operator&gt;</span>
    <span class="cp">&lt;![CDATA[</span>

      <span class="n">dpsi0_dt</span> <span class="o">=</span> <span class="n">Lxx</span><span class="p">[</span><span class="n">psi0</span><span class="p">]</span> <span class="o">-</span> <span class="kc">i</span><span class="o">*</span><span class="n">Omega</span><span class="o">*</span><span class="n">psi1</span><span class="p">;</span>
      <span class="n">dpsi1_dt</span> <span class="o">=</span> <span class="n">Lxx</span><span class="p">[</span><span class="n">psi1</span><span class="p">]</span> <span class="o">-</span> <span class="kc">i</span><span class="o">*</span><span class="n">Omega</span><span class="o">*</span><span class="n">psi0</span><span class="p">;</span>

    <span class="cp">]]&gt;</span>
  <span class="nt">&lt;/operators&gt;</span>
<span class="nt">&lt;/integrate&gt;</span>
</pre></div>
</div>
<p>This code will not use any Fourier transforms during an ordinary time-stepping, and will be much faster than if the code were written without the <tt class="docutils literal"><span class="pre">home_space</span></tt> and <tt class="docutils literal"><span class="pre">basis</span></tt> attributes.</p>
</div>
<div class="section" id="don-t-recalculate-things-you-don-t-have-to">
<h3>Don&#8217;t recalculate things you don&#8217;t have to<a class="headerlink" href="#don-t-recalculate-things-you-don-t-have-to" title="Permalink to this headline">¶</a></h3>
<p>Use <tt class="docutils literal"><span class="pre">computed_vectors</span></tt> appropriately.</p>
</div>
</div>
<div class="section" id="compiler-and-library-tricks">
<h2>Compiler and library tricks<a class="headerlink" href="#compiler-and-library-tricks" title="Permalink to this headline">¶</a></h2>
<div class="section" id="faster-compiler">
<h3>Faster compiler<a class="headerlink" href="#faster-compiler" title="Permalink to this headline">¶</a></h3>
<p>If you&#8217;re using an Intel CPU, then you should consider using their compiler, icc. They made the silicon, and they also made a compiler that understands how their chips work significantly better than the more-portable GCC.</p>
</div>
<div class="section" id="faster-libraries">
<h3>Faster libraries<a class="headerlink" href="#faster-libraries" title="Permalink to this headline">¶</a></h3>
<p>Intel MKL is faster than ATLAS, which is faster than GSL CBLAS. If you have a Mac, then Apple&#8217;s vecLib is plenty fast.</p>
<p><strong>Note for Linux users</strong></p>
<p>If you used the linux installer, and are using Ubuntu or Debian, the version of the ATLAS package that gets installed is the generic version in the repositories. This version lacks architecture and CPU-specific optimizations.</p>
<p>Creating an ATLAS package locally tuned for your specific machine will result in a faster linear algebra implementation, which can significantly speed up problems utilizing matrix based transforms (bessel, hermite-gauss etc). Some simple tests using a cylindrically symmetric problem with one bessel transform dimension and one FFT transform dimension showed speed increases from 5% - 100% over the default ATLAS package, depending on the number of grid points.</p>
<p>To create and install an ATLAS package optimized for your machine, carry out the following procedure:</p>
<p>Using your favourite package manager (e.g. Synaptic) to remove any current ATLAS libraries (probably libatlas3gf-base, libatlas-dev, libatlas-base-dev). Then create an empty directory whose path doesn&#8217;t include any spaces. In this directory, do</p>
<div class="highlight-xpdeint"><div class="highlight"><pre>apt-get source atlas
apt-get build-dep atlas
apt-get install devscripts dpkg-dev

cd atlas-*
sudo fakeroot debian/rules custom
cd ..
ls libatlas*.deb
</pre></div>
</div>
<p>Then, for each of the .deb packages listed by the ls command, install via:</p>
<div class="highlight-xpdeint"><pre>sudo dpkg -i &lt;filename here&gt;.deb</pre>
</div>
<p>This procedure was tested on Ubuntu 12.04 LTS, but an identical or very similar procedure should work for other Ubuntu/Debian versions.</p>
<p>Finally, note that the &#8220;sudo fakeroot debian/rules custom&#8221; package creation step carries out an exhaustive series of tests to optimize for your architecture, SSE support, cache hierarchy and so on, and can take a long time. Be patient.</p>
</div>
<div class="section" id="auto-vectorisation">
<h3>Auto-vectorisation<a class="headerlink" href="#auto-vectorisation" title="Permalink to this headline">¶</a></h3>
<p>Auto-vectorisation is a compiler feature that makes compilers generate more efficient code that can execute the same operation on multiple pieces of data simultaneously. To use this feature, you need to add the following to the <tt class="docutils literal"><span class="pre">&lt;features&gt;</span></tt> block at the start of your simulation:</p>
<div class="highlight-xpdeint"><div class="highlight"><pre><span class="nt">&lt;auto_vectorise</span> <span class="nt">/&gt;</span>
</pre></div>
</div>
<p>This will make xpdeint generate code that is more friendly to compiler&#8217;s auto-vectorisation features so that more code can be vectorised. It will also add the appropriate compiler options to turn on your compiler&#8217;s auto-vectorisation features. For auto-vectorisation to increase the speed of your simulations, you will need a compiler that supports it such as gcc 4.2 or later, or Intel&#8217;s C compiler, <tt class="docutils literal"><span class="pre">icc</span></tt>.</p>
</div>
<div class="section" id="openmp">
<h3>OpenMP<a class="headerlink" href="#openmp" title="Permalink to this headline">¶</a></h3>
<p><a class="reference external" href="http://openmp.org">OpenMP</a> is a set of compiler directives to make it easier to use threads (different execution contexts) in programs. Using threads in your simulation does occur some overhead, so for the speedup to outweigh the overhead, you must have a reasonably large simulation grid. To add these compiler directives to the generated simulations, add the tag <tt class="docutils literal"><span class="pre">&lt;openmp</span> <span class="pre">/&gt;</span></tt> in the <tt class="docutils literal"><span class="pre">&lt;features&gt;</span></tt> block. This can be used in combination with the auto-vectorisation feature above. Note that if you are using gcc, make sure you check that your simulations are faster by using this as gcc&#8217;s OpenMP implementation isn&#8217;t as good as icc&#8217;s.</p>
<p>If you are using the OpenMP feature and are using <a class="reference external" href="http://www.fftw.org">FFTW</a>-based transforms (Discrete Fourier/Cosine/Sine Transforms), you should consider using threads with your FFT&#8217;s by adding the following to the <tt class="docutils literal"><span class="pre">&lt;features&gt;</span></tt> block at the start of your simulation:</p>
<div class="highlight-xpdeint"><div class="highlight"><pre><span class="nt">&lt;fftw</span> <span class="na">threads=</span><span class="s">&quot;2&quot;</span> <span class="nt">/&gt;</span>
</pre></div>
</div>
<p>Replace the number of threads in the above code by the number of threads that you want to use.</p>
</div>
<div class="section" id="parallelisation-with-mpi">
<h3>Parallelisation with MPI<a class="headerlink" href="#parallelisation-with-mpi" title="Permalink to this headline">¶</a></h3>
<p>Some simulations are so large or take so much time that it is not reasonable to run them on a single CPU on a single machine. Fortunately, the <a class="reference external" href="http://www.mpi-forum.org/">Message Passing Interface</a> was developed to enable different computers working on the same program to exchange data. You will need a MPI package installed to be abel to use this feature with your simulations. One popular implementation of MPI is <a class="reference external" href="http://www.open-mpi.org">OpenMPI</a>.</p>
</div>
</div>
<div class="section" id="atom-optics-specific-hints">
<h2>Atom-optics-specific hints<a class="headerlink" href="#atom-optics-specific-hints" title="Permalink to this headline">¶</a></h2>
<div class="section" id="separate-out-imaginary-time-calculation-code">
<h3>Separate out imaginary-time calculation code<a class="headerlink" href="#separate-out-imaginary-time-calculation-code" title="Permalink to this headline">¶</a></h3>
<p>When doing simulations that require the calculation of the groundstate (typically via the imaginary time algorithm), typically the groundstate itself does not need to be changed frequently as it is usually the dynamics of the simulation that have the interesting physics. In this case, you can save having to re-calculate groundstate every time by having one script (call it <tt class="docutils literal"><span class="pre">groundstate.xmds</span></tt>) that saves the calculated groundstate to a file using a breakpoint, and a second simulation that loads this calculated groundstate and then performs the evolution. More often than not, you won&#8217;t need to re-run the groundstate finder.</p>
<p>The file format used in this example is <a class="reference external" href="http://www.hdfgroup.org/HDF5/">HDF5</a>, and you will need the HDF5 libraries installed to use this example. The alternative is to use the deprecated <tt class="docutils literal"><span class="pre">binary</span></tt> format, however to load <tt class="docutils literal"><span class="pre">binary</span></tt> format data <tt class="docutils literal"><span class="pre">xmds</span></tt>, the predecessor to <tt class="docutils literal"><span class="pre">xpdeint</span></tt> must be installed. Anyone who has done this before will tell you that installing it isn&#8217;t a pleasant experience, and so HDF5 is the recommended file format.</p>
<p>If your wavefunction vector is called <tt class="docutils literal"><span class="pre">'wavefunction'</span></tt>, then to save the groundstate to the file <tt class="docutils literal"><span class="pre">groundstate_break.h5</span></tt> in the HDF5 format, put the following code immediately after the integrate block that calculates your groundstate:</p>
<div class="highlight-xpdeint"><div class="highlight"><pre><span class="nt">&lt;breakpoint</span> <span class="na">filename=</span><span class="s">&quot;groundstate_break&quot;</span> <span class="na">format=</span><span class="s">&quot;hdf5&quot;</span><span class="nt">&gt;</span>
  <span class="nt">&lt;dependencies&gt;</span>wavefunction<span class="nt">&lt;/dependencies&gt;</span>
<span class="nt">&lt;/breakpoint&gt;</span>
</pre></div>
</div>
<p>In addition to the <tt class="docutils literal"><span class="pre">groundstate_break.h5</span></tt> file, an XSIL wrapper <tt class="docutils literal"><span class="pre">groundstate_break.xsil</span></tt> will also be created for use with <a class="reference internal" href="xsil2graphics2.html#xsil2graphics2"><em>xsil2graphics2</em></a>.</p>
<p>To load this groundstate into your evolution script, the declaration of your <tt class="docutils literal"><span class="pre">'wavefunction'</span></tt> vector in your evolution script should look something like</p>
<div class="highlight-xpdeint"><div class="highlight"><pre><span class="nt">&lt;vector</span> <span class="na">name=</span><span class="s">&quot;wavefunction&quot;</span><span class="nt">&gt;</span>
  <span class="nt">&lt;components&gt;</span>phi1 phi2<span class="nt">&lt;/components&gt;</span>
  <span class="nt">&lt;initialisation</span> <span class="na">kind=</span><span class="s">&quot;hdf5&quot;</span><span class="nt">&gt;</span>
    <span class="nt">&lt;filename&gt;</span>groundstate_break.h5<span class="nt">&lt;/filename&gt;</span>
  <span class="nt">&lt;/initialisation&gt;</span>
<span class="nt">&lt;/vector&gt;</span>
</pre></div>
</div>
<p>Note that the groundstate-finder doesn&#8217;t need to have all of the components that the evolution script needs. For example, if you are considering the evolution of a two-component BEC where only one component has a population in the groundstate, then your groundstate script can contain only the <tt class="docutils literal"><span class="pre">phi1</span></tt> component, while your evolution script can contain both the <tt class="docutils literal"><span class="pre">phi1</span></tt> component and the <tt class="docutils literal"><span class="pre">phi2</span></tt> component. Note that the geometry of the script generating the groundstate and the evolution script must be the same.</p>
</div>
<div class="section" id="use-an-energy-or-momentum-offset">
<h3>Use an energy or momentum offset<a class="headerlink" href="#use-an-energy-or-momentum-offset" title="Permalink to this headline">¶</a></h3>
<p>This is just the interaction picture with a constant term in the Hamiltonian. If your state is going to rotate like <span class="math">\(e^{i(\omega + \delta\omega)t}\)</span>, then transform your equations to remove the <span class="math">\(e^{i \omega t}\)</span> term. Likewise for spatial rotations, if one mode will be moving on average with momentum <span class="math">\(\hbar k\)</span>, then transform your equations to remove that term. This way, you may be able to reduce the density of points you need in that dimension. Warning: don&#8217;t forget to consider this when looking at your results. I (Graham Dennis) have been tripped up on multiple occasions when making this optimisation.</p>
</div>
</div>
</div>


          </div>
        </div>
      </div>
      <div class="sphinxsidebar">
        <div class="sphinxsidebarwrapper">
            <p class="logo"><a href="index.html">
              <img class="logo" src="_static/xmds_logo.png" alt="Logo"/>
            </a></p>
  <h3><a href="index.html">Table Of Contents</a></h3>
  <ul>
<li><a class="reference internal" href="#">Optimisation Hints</a><ul>
<li><a class="reference internal" href="#geometry-and-transform-based-tricks">Geometry and transform-based tricks</a><ul>
<li><a class="reference internal" href="#simpler-simulation-geometries">Simpler simulation geometries</a></li>
<li><a class="reference internal" href="#tricks-for-bessel-and-hermite-gauss-transforms">Tricks for Bessel and Hermite-Gauss transforms</a></li>
</ul>
</li>
<li><a class="reference internal" href="#reduce-code-complexity">Reduce code complexity</a><ul>
<li><a class="reference internal" href="#optimising-with-the-interaction-picture-ip-operator">Optimising with the Interaction Picture (IP) operator</a></li>
<li><a class="reference internal" href="#consider-writing-the-evolution-in-spectral-basis">Consider writing the evolution in spectral basis</a></li>
<li><a class="reference internal" href="#don-t-recalculate-things-you-don-t-have-to">Don&#8217;t recalculate things you don&#8217;t have to</a></li>
</ul>
</li>
<li><a class="reference internal" href="#compiler-and-library-tricks">Compiler and library tricks</a><ul>
<li><a class="reference internal" href="#faster-compiler">Faster compiler</a></li>
<li><a class="reference internal" href="#faster-libraries">Faster libraries</a></li>
<li><a class="reference internal" href="#auto-vectorisation">Auto-vectorisation</a></li>
<li><a class="reference internal" href="#openmp">OpenMP</a></li>
<li><a class="reference internal" href="#parallelisation-with-mpi">Parallelisation with MPI</a></li>
</ul>
</li>
<li><a class="reference internal" href="#atom-optics-specific-hints">Atom-optics-specific hints</a><ul>
<li><a class="reference internal" href="#separate-out-imaginary-time-calculation-code">Separate out imaginary-time calculation code</a></li>
<li><a class="reference internal" href="#use-an-energy-or-momentum-offset">Use an energy or momentum offset</a></li>
</ul>
</li>
</ul>
</li>
</ul>

<div id="searchbox" style="display: none">
  <h3>Quick search</h3>
    <form class="search" action="search.html" method="get">
      <input type="text" name="q" />
      <input type="submit" value="Go" />
      <input type="hidden" name="check_keywords" value="yes" />
      <input type="hidden" name="area" value="default" />
    </form>
    <p class="searchtip" style="font-size: 90%">
    Enter search terms or a module, class or function name.
    </p>
</div>
<script type="text/javascript">$('#searchbox').show(0);</script>
        </div>
      </div>
      <div class="clearer"></div>
    </div>
    <div class="related">
      <h3>Navigation</h3>
      <ul>
        <li class="right" style="margin-right: 10px">
          <a href="genindex.html" title="General Index"
             >index</a></li>
        <li class="right" >
          <a href="xsil2graphics2.html" title="xsil2graphics2"
             >next</a> |</li>
        <li class="right" >
          <a href="upgrade.html" title="Upgrading From XMDS 1.X"
             >previous</a> |</li>
        <li><a href="index.html">XMDS2 2.2.2 documentation</a> &raquo;</li> 
      </ul>
    </div>
    <div class="footer">
        &copy; Copyright 2008-2014, Graham Dennis, Joe Hope and Mattias Johnsson. Licensed under the GNU FDL.
      Last updated on Oct 14, 2014.
      Created using <a href="http://sphinx-doc.org/">Sphinx</a> 1.2b1.
    </div>
  </body>
</html>