% $Id: extraFeatures.tex 1699 2008-01-29 04:41:15Z gmcmanus $

% Copyright (C) 2000-2007
%
% Code contributed by Greg Collecutt, Joseph Hope and the xmds-devel team
%
% This file is part of xmds.
%
% This program is free software; you can redistribute it and/or
% modify it under the terms of the GNU General Public License
% as published by the Free Software Foundation; either version 2
% of the License, or (at your option) any later version.
%
% This program is distributed in the hope that it will be useful,
% but WITHOUT ANY WARRANTY; without even the implied warranty of
% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
% GNU General Public License for more details.
%
% You should have received a copy of the GNU General Public License
% along with this program; if not, write to the Free Software 
% Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.

\chapter{Extra and advanced features}
\label{chap:extraAndAdvancedFeatures}

\section{Error checking}

The error checking feature of \xmds is enabled by default and is
controlled by using the \xmdsTag{error\_check} tag.  This is a boolean
tag, and expects either a \ttt{yes} or \ttt{no} entry.  In the context
of \xmds, error checking means to run the simulation twice: once
through at the full step defined in the simulation script (via the
\xmdsTag{lattice} and \xmdsTag{interval} assignments); and then again
at half of the full step size.  The maximum difference between the
field values in each moment group is reported, and gives an indication
of the discretisation error in the simulation.
If one of the adaptive algorithms is chosen, error checking instead means that the simulation is run a second time with one sixteenth of the specified tolerance.

For instance, setting \xmdsTag{error\_check} to \ttt{yes} in the
\ttt{atomlaser} simulation (this is an example code in the
\ttt{examples} directory of the \xmds distribution) we get the
following output:
\begin{alltt}
Making forward plan
Making backward plan
Beginning full step integration ...
Sampled field (for moment group #1) at t        = 0.000000e+00
Sampled field (for moment group #1) at t        = 2.500000e-09
Sampled field (for moment group #1) at t        = 5.000000e-09
Sampled field (for moment group #1) at t        = 7.500000e-09
Sampled field (for moment group #1) at t        = 1.000000e-08
Beginning half step integration ...
Sampled field (for moment group #1) at t        = 0.000000e+00
Sampled field (for moment group #1) at t        = 2.500000e-09
Sampled field (for moment group #1) at t        = 5.000000e-09
Sampled field (for moment group #1) at t        = 7.500000e-09
Sampled field (for moment group #1) at t        = 1.000000e-08
maximum step error in moment group 1 was 4.408207e-11
\end{alltt}
The error reported here is of the order of $10^{-11}$ and therefore we
can be confident that discretisation error is not a significant
problem in the simulation output.  Once you are sure that your
simulation is behaving nicely, you can then turn off error checking by
setting \xmdsTag{error\_check} to \ttt{no} thereby speeding up the
simulation.  This is especially important for those whose simulations
are going to take a \emph{long} time to run.  So, test your simulation
to make sure that the error is low for a short simulation run, and then
for the main run turn error checking off.
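
Once a short test run has shown the error to be acceptable, switching
error checking off for the main run is a one-line change.  The
following is only a minimal sketch: we are assuming that
\xmdsTag{error\_check} sits with the other global functionality tags,
before the \xmdsTag{globals} element.
\begin{xmdsCode}
<!-- assumed placement: with the other global tags, before <globals> -->
<error_check> no </error_check>
\end{xmdsCode}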

\section{MPI: automatic parallelisation of simulations}

One of the most powerful features of \xmds is its ability to
automatically parallelise simulations.  We go through an extended
discussion of this with specific focus on stochastic simulations in
\Chap{chap:stochasticSimsAndMPI}, but it is worth mentioning here as
well.  Not only can \xmds parallelise stochastic simulations by
running each stochastic path on a separate computer, but it is also
able to parallelise the computation of deterministic problems as well.
It does this with the help of a package known as MPI, which stands for
the Message Passing Interface, and is a means to organise the
communication and computation associated with parallel simulations.
To parallelise your simulation, all you have to do is add \emph{one
line of code!}  Honestly.  We're not joking.  To turn this feature on
in your code you need to add the line:
\begin{xmdsCode}
<use_mpi> yes </use_mpi>
\end{xmdsCode}
and that's it.  When running your simulation, \xmds will split up the
computation of the field and pass the parts to different computers,
so that the whole problem is solved faster than is possible on a
single machine.  Another good reason for splitting a simulation up
like this and processing each part of the field on a different
processor is that the memory requirements of some very large
simulations are too great to fit on one computer: using MPI is then
the only way to solve the problem.

There is one major caveat here, however: \emph{ONLY} use MPI for
deterministic problems on a supercomputer, or a cluster setup where
there is very small network latency (it doesn't take long for
computers to talk to one another).  This is very important, because
the Fourier transforms require a lot of communication, and if the
network between nodes of the cluster is slow then this will reduce the
speed of the computation significantly, probably making it faster to
run on a single cpu.  Nevertheless, if you do have access to powerful
computing facilities, then by all means, use this feature.

For stochastic problems, there is a second option you may wish to use, 
which changes the way the different paths are allocated between processors.
This can be altered by using the \xmdsTag{MPI\_Method} tag, which can
take the values "Scheduling" or "Uniform".
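
As a rough sketch, a stochastic simulation that selects the allocation
method explicitly might contain the lines below.  We are assuming here
that the value is written as bare text between the tags (as for the
boolean tags in this chapter) and that the tag sits next to
\xmdsTag{use\_mpi}; neither detail is spelled out in this section, so
check \Chap{chap:languageRef} before relying on it.
\begin{xmdsCode}
<use_mpi> yes </use_mpi>
<!-- assumed syntax; the other documented value is Uniform -->
<MPI_Method> Scheduling </MPI_Method>
\end{xmdsCode}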


\section{Benchmarking}

To get an idea as to how long your code is going to run when you scale
the various simulation parameters up for a long simulation, or just to
see how long the main body of code takes, you can use the
\xmdsTag{benchmark} tag.  This tag is a boolean which by default is
\ttt{no}, but when set to \ttt{yes} it tells \xmds to insert timing
code around the main code block, excluding the fftw plan creation and
deletion steps (this is because these steps are not in general
indicative of how long the simulation will run, especially when scaled
up to long simulation times).  To use this feature just put the code
\begin{xmdsCode}
<benchmark> yes </benchmark>
\end{xmdsCode}
into your simulation script in the global simulation and functionality
section, namely before the \xmdsTag{globals} tag.  

The simulation will then report at the end of its run how long it
took.  An example of this is (again, using the \ttt{atomlaser}
simulation):
\begin{alltt}
Making forward plan
Making backward plan
Beginning full step integration ...
Sampled field (for moment group #1) at t        = 0.000000e+00
Sampled field (for moment group #1) at t        = 2.500000e-09
Sampled field (for moment group #1) at t        = 5.000000e-09
Sampled field (for moment group #1) at t        = 7.500000e-09
Sampled field (for moment group #1) at t        = 1.000000e-08
Beginning half step integration ...
Sampled field (for moment group #1) at t        = 0.000000e+00
Sampled field (for moment group #1) at t        = 2.500000e-09
Sampled field (for moment group #1) at t        = 5.000000e-09
Sampled field (for moment group #1) at t        = 7.500000e-09
Sampled field (for moment group #1) at t        = 1.000000e-08
maximum step error in moment group 1 was 4.408207e-11
Time elapsed for simulation is: 10 seconds
\end{alltt}
where the simulation has taken (approximately) 10 seconds to
complete.  

\section{Wisdom}

The \xmdsTag{use\_wisdom} tag is the way to enable FFTW's wisdom
feature.  This tag expects a boolean argument, and by default is set
to \ttt{no}.  However, when set to \ttt{yes} you can expect an immense
increase in the startup speed of your simulations.

Wisdom is the name fftw gives to stored information about its
Fourier transform plans.  What fftw does before it decides to use a
particular method for calculating the Fourier transform is to run some
calculations beforehand to see which of the methods is the fastest
(this can be related to your system's architecture, the size of the
problem, etc.) and then it can optionally store this information so
that fftw doesn't have to go through all of the hard work again, and
therefore make use of the stored ``wisdom'' about the problem at hand.
Enabling wisdom means that subsequent runs of the simulation will
start up and run (overall) much faster.

\xmds requires a place to save this accumulated wisdom so that it can
be reloaded in subsequent simulation runs.  The way \xmds does this is
to save the wisdom in a file called \ttt{<hostname>.wisdom}, where
\ttt{<hostname>} is the name of the computer you are running the
simulation on.  Note that for simulations using MPI, the
\ttt{.wisdom} filename uses the format \ttt{<hostname><rank>.wisdom},
where \ttt{<rank>} is the MPI process rank number; this prevents name
conflicts when doing parallel simulations.  There are two places that
\xmds can store \ttt{.wisdom} files: in the user's
\ttt{\~{}/.xmds/wisdom} directory; or in the directory local to the
simulation.  The former is used if the \ttt{\~{}/.xmds/wisdom}
directory exists, and the latter is used if not.
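
Putting this together, a script that enables wisdom (and times the
run, so that the effect can be seen) might contain the lines below.
This is a sketch only: the placement alongside the other global tags is
our assumption.  If you would rather collect all wisdom files in
one place, the \ttt{\~{}/.xmds/wisdom} directory has to exist before
the simulation is run, since \xmds only uses it when it is already
there.
\begin{xmdsCode}
<!-- assumed placement: with the other global tags, before <globals> -->
<use_wisdom> yes </use_wisdom>
<benchmark> yes </benchmark>
\end{xmdsCode}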

Running the \ttt{atomlaser} simulation with wisdom turned on, we get
the following output (for the first run):
\begin{alltt}
Performing fftw calculations
Making forward plan
Making backward plan
Keeping accumulated wisdom
Finished fftw calculations
Beginning full step integration ...
Sampled field (for moment group #1) at t        = 0.000000e+00
Sampled field (for moment group #1) at t        = 2.500000e-09
Sampled field (for moment group #1) at t        = 5.000000e-09
Sampled field (for moment group #1) at t        = 7.500000e-09
Sampled field (for moment group #1) at t        = 1.000000e-08
Beginning half step integration ...
Sampled field (for moment group #1) at t        = 0.000000e+00
Sampled field (for moment group #1) at t        = 2.500000e-09
Sampled field (for moment group #1) at t        = 5.000000e-09
Sampled field (for moment group #1) at t        = 7.500000e-09
Sampled field (for moment group #1) at t        = 1.000000e-08
maximum step error in moment group 1 was 3.802825e-11
Time elapsed for simulation is: 11 seconds
\end{alltt}
and then for the second run:
\begin{alltt}
Performing fftw calculations
Standing upon the shoulders of giants... (Importing wisdom)
Making forward plan
Making backward plan
Keeping accumulated wisdom
Finished fftw calculations
Beginning full step integration ...
Sampled field (for moment group #1) at t        = 0.000000e+00
Sampled field (for moment group #1) at t        = 2.500000e-09
Sampled field (for moment group #1) at t        = 5.000000e-09
Sampled field (for moment group #1) at t        = 7.500000e-09
Sampled field (for moment group #1) at t        = 1.000000e-08
Beginning half step integration ...
Sampled field (for moment group #1) at t        = 0.000000e+00
Sampled field (for moment group #1) at t        = 2.500000e-09
Sampled field (for moment group #1) at t        = 5.000000e-09
Sampled field (for moment group #1) at t        = 7.500000e-09
Sampled field (for moment group #1) at t        = 1.000000e-08
maximum step error in moment group 1 was 3.802825e-11
Time elapsed for simulation is: 10 seconds
\end{alltt}
You will note, if you have run the \ttt{atomlaser} simulation both
with and without wisdom, how quickly the simulation starts once some
fftw wisdom is used.  Also, the simulation tells you that it is using
previously generated wisdom, and that it is saving it for future use.

\section{Binary output}

When performing big simulations, i.e.~over many dimensions or when
propagating for a large distance over the propagation dimension, one
is going to produce \emph{very} large output files.  This can be a
problem, and the problem will be exacerbated by the fact that by
default, \xmds outputs data in ascii format, with a lot of redundancy.
As a way to reduce the size of the output, \xmds since \ttt{xmds-1.2}
has had the ability to generate binary output files, which are not
only inherently smaller (and can have better precision) than ascii
data files, but also do away with the redundancy introduced by the
way that the ascii data is stored.

In \ttt{xmds-1.2} binary output was controlled by the
\xmdsTag{binary\_output} and \xmdsTag{use\_double} tags.  This syntax
is deprecated as of \ttt{xmds-1.3} in favour of passing attributes to
the \xmdsTag{output} element, and this is the syntax we'll be
discussing here.  Users who are still using \ttt{xmds-1.2} are advised
to upgrade to a more recent version of \xmds to make use of the
better syntax.  If you still wish to use \ttt{xmds-1.2}, then the
syntax for using binary output is described in
\Chap{chap:languageRef}.

\subsection{The \ttt{format} attribute}

As mentioned above, the output format of an \xmds simulation is now
controlled by the \ttt{format} attribute of the \xmdsTag{output}
element.  The syntax is as follows:
\begin{xmdsCode}
<output format="ascii"|"binary">
  <!-- other xmds tags -->
</output>
\end{xmdsCode}
where by saying \ttt{"ascii"|"binary"} we mean that the format option
is either the string \ttt{"ascii"} or \ttt{"binary"} (the double
quotes are necessary) and that the default option is \ttt{"ascii"}.

For those who have used \xmds in the past, you may remember that all
of the data is output into the \ttt{.xsil} file.  Binary output
doesn't stop the generation of the \ttt{.xsil} file, but merely uses a
feature of the XSIL format that enables binary files to be pointed to
by the \ttt{.xsil} file.  Therefore, all of the important parameters
of the simulation are still saved to the \ttt{.xsil} file, just the
data is now saved to another file (or files if you have more than one
moment group) containing just a binary string of data.  So, when using
binary output the following files will be produced: a \ttt{.xsil} file
containing simulation parameters and pointing to the output data (by
default, this will be called \ttt{<simulation name>.xsil}); and
a binary data file for each moment group, being called in general
\ttt{<simulation name>mg<moment group number>.dat}.

Running the \ttt{atomlaser} simulation with the \ttt{format} set to
\ttt{"ascii"} we get an output \ttt{.xsil} file of size 808~kB.
Now, if we run the \ttt{atomlaser} simulation again, except with the
\xmdsTag{output} tag set to
\begin{xmdsCode}
<output format="binary">
\end{xmdsCode}
then we get a \ttt{.xsil} file of size 4~kB, and a \ttt{.dat} file
called \ttt{atomlasermg0.dat} of size 336~kB, giving a total of 340~kB
which is 42\% of the size of the ascii output.  Bigger savings can
be expected with longer simulations and/or simulations using more
dimensions.

\subsection{The \ttt{precision} attribute}

The default binary output is at double precision.  This is not always
necessary for output of data, especially if the data is to be
displayed graphically and then interpreted further there; the extra
precision is not necessarily worthwhile.  Therefore, there is also the
\ttt{precision} attribute available in the \xmdsTag{output} element,
with which one can set the output precision to either single or double
precision.  The syntax for this is as follows:
\begin{xmdsCode}
<output format="binary" precision="double"|"single">
  <!-- more xmds tags -->
</output>
\end{xmdsCode}
where \ttt{"double"|"single"} means the options are either
\ttt{"double"} or \ttt{"single"} with \ttt{"double"} being the default
option.  Notice that the \ttt{format} attribute is also set to
\ttt{"binary"}; this is to emphasise that it is pointless specifying
the \ttt{precision} without the \ttt{format} since the \ttt{precision}
attribute is meaningless for ascii output.

Using this option, and rerunning the \ttt{atomlaser} simulation we
find that the file size of \ttt{atomlasermg0.dat} is 168~kB and
\ttt{atomlaser.xsil} is 4~kB, which overall is 21\% of the size of the
original ascii output.
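
The arithmetic behind these figures, using the file sizes quoted above
for the \ttt{atomlaser} runs, is simply
\[
  \frac{(4 + 336)\,\mathrm{kB}}{808\,\mathrm{kB}} \approx 0.42,
  \qquad
  \frac{(4 + 168)\,\mathrm{kB}}{808\,\mathrm{kB}} \approx 0.21 .
\]
The data file halves from 336~kB to 168~kB, which is what one would
expect when going from 8-byte double precision to 4-byte single
precision values (assuming the usual IEEE sizes), while the
\ttt{.xsil} file stays at 4~kB.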

\section{Initialisation of field vectors from file}

In \Chap{chap:tutFromScratch}, \Sec{sec:theFieldElement} we
initialised the field inside the \xmdsTag{vector} element by using
C/C++ code.  It is also possible to already have these vectors
calculated and stored in a file, which \xmds can then load and use to
initialise the field.  This feature can be useful if the calculation
of the vectors is particularly difficult and you don't wish for \xmds
to have to calculate them, or you may have already generated the data
from another program and so going through the hassle of getting \xmds
to recalculate the data is a waste of time.  Anyway, it can be handy
to do on some occasions and so \xmds provides a means for you to do
this via the \xmdsTag{filename} tag within the \xmdsTag{vector}
element within the \xmdsTag{field} element.  The syntax for this is:
\begin{xmdsCode}
<field>
  <vector>
    <filename format="ascii"|"binary"|"xsil">
      <!-- enter the file name here -->
    </filename>
  </vector>
</field>
\end{xmdsCode}
where \ttt{"ascii"} is the default option when the \ttt{format}
attribute is not specified.

As of \ttt{xmds-1.3}, \xmds has the ability to load binary as well as
ascii data.  Which of these \xmds should expect is given by the \ttt{format}
attribute of the \xmdsTag{filename} tag within the \xmdsTag{vector}
element.  Using binary input, however, doesn't significantly change
how the data should be organised prior to loading into an \xmds
simulation.
If MPI is enabled, \xmds will only load into memory the appropriate part of the input file, irrespective of the file format.

\subsection{Initialisation from an XSIL file}
\label{subsec:InitialisationFromXSILFile}
\index{XSIL}
As of \ttt{xmds-1.5-3}, \xmds can initialise a vector from a moment group of an XSIL file produced by a \xmdsTag{breakpoint} tag (see \Sec{sec:Breakpoints}) or an \xmdsTag{output} tag in \xmds. If you are generating the XSIL file from an \xmdsTag{output} tag, then the output moment group must meet a certain format for \xmds to be able to understand how to load the file correctly. If the file is generated from a \xmdsTag{breakpoint} tag, then this is taken care of for you if the variables have the same names in the two simulations.

For XSIL files generated from output moment groups, the format of the XSIL file must be \ttt{"binary"} (not \ttt{"ascii"}). Also, if there is more than one moment group in the XSIL file, the number of the moment group that will be used for initialisation must be specified with the \ttt{moment\_group} attribute of the \xmdsTag{filename} tag. If there is only one moment group in the XSIL file (as is the case for XSIL files generated from a \xmdsTag{breakpoint} tag), then this attribute can be omitted. If the vector is of type \ttt{double}, then the variables of the vector are initialised from the output moment group variables of the same name but suffixed with an `R'. If the vector is of type \ttt{complex}, then the real and imaginary components of each variable are initialised by the values of the output moment group variables of the same name but with a suffix of `R' for the real component and `I' for the imaginary component. For example, the complex variables \ttt{x} and \ttt{y} would be initialised by the output moment group variables \ttt{xR}, \ttt{xI}, \ttt{yR} and \ttt{yI}, and you would use the following code in your \xmdsTag{output} tag to create these variables:
\begin{xmdsCode}
<output>
	<group>
		<sampling>
			<moments>xR xI yR yI</moments>
			<![CDATA[
				xR = x.re;
				xI = x.im;
				yR = y.re;
				yI = y.im;
				]]>
		</sampling>
	</group>
</output>
\end{xmdsCode}			
			

Not every variable in a vector need be present in the moment group of the XSIL file, as any variable that is not present is automatically initialised to zero, or by a \ttt{CDATA} section, as in \Chap{chap:tutFromScratch}, \Sec{sec:theFieldElement}. Although \xmds will continue initialisation even if it cannot find all the variables in a vector in the XSIL file (it will not continue if it cannot find any variables), it will print a warning about any variables that it cannot find in the XSIL file. Note that the sequence of initialisation steps for each element in a vector is to first initialise the element to zero, then to use any code in the \ttt{CDATA} section if present, and finally to initialise from the XSIL file if the variable is present in the moment group. Hence, initialisation from the XSIL file will override any initialisation in the \ttt{CDATA} section.

The dimensions of a vector can be initialised in any combination of $x$-space and $k$-space by using the \xmdsTag{fourier\_space} tag in the same way as it is used for initialisation from C/C++ code; the default, however, is that each dimension is initialised in $x$-space.

There are some restrictions on the geometry of the moment group in the XSIL file, however these conditions depend on whether the geometry matching mode (specified by the attribute \ttt{geometry\_matching\_mode} of the \xmdsTag{filename} tag) is set to \ttt{"strict"} mode or \ttt{"loose"} mode. In \ttt{"strict"} mode, the following conditions apply:
\begin{enumerate}
\item The moment group must have the same number of dimensions as the field. In other words, the moment group can only have been sampled once, as sampling a moment group a number of times introduces an extra dimension, the propagation dimension.
\item The moment group's dimensions must have the same name and be in the same order as those of the field.
\item If a dimension is specified as being in $x$-space ($k$-space) in the moment group, then it must be initialised in $x$-space ($k$-space). This can be done using the \xmdsTag{fourier\_space} tag.
\item Each dimension of the initialisation moment group must have the same number of points as the corresponding dimension of the field, and its start and end coordinates must be the same as those of the field.
\end{enumerate}
In other words, in \ttt{"strict"} mode, the geometry of the initialisation moment group must be the same (to within some small variation) as that of the field. Note that XSIL files generated by a \xmdsTag{breakpoint} tag automatically satisfy conditions 1 and 2 if the dimensions of the two simulations are the same, and in the same order. 

In the \ttt{"loose"} geometry matching mode, the last condition is relaxed to:
\begin{enumerate}
\setcounter{enumi}{3}
\item The step size in each dimension in the initialisation moment group must be the same as the step size in the corresponding dimension of the field.
\item Some of the moment group grid points must overlap with those of the field (e.g.\ a vector with points at positions $x = 0, 2, 4, 6, 8$ cannot be initialised from a moment group with points at positions $x=1, 3, 5, 7, 9$). Note that points that aren't initialised by the moment group are set to zero, or can be initialised by the \ttt{CDATA} element if set.
\end{enumerate}

The advantage of \ttt{"loose"} mode is that it allows one to break up a simulation into parts where each part requires a slightly different grid. For example, in the diffusion example in \Chap{chap:tutFromScratch}, \Sec{sec:moreComplexSimulation}, the restriction was made that the simulation is not evolved for long enough such that the field becomes non-zero at the edge of the grid. With \ttt{"loose"} mode, after running the simulation for some time on a small grid, if the state of the field is sampled at the end of the simulation, the simulation can be continued on a larger grid (though still keeping the same step size in that dimension, and ensuring that the grid points do overlap). Also, if one wishes to increase (or reduce) the number of points in a given dimension, and keep the width constant, initialise the state of the field with that dimension in $k$-space, as in this case, the requirement that the step size in the $k$-space dimension be the same is equivalent to the requirement that the width of that dimension in $x$-space remain the same. Hence, the number of points in $x$-space in that dimension can be increased (or reduced).
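
As a sketch of the first use case, the continuation run on the larger
grid might load the field as follows.  The file name
\ttt{diffusion\_small.xsil} is hypothetical, and we are assuming that
the file holds a single moment group that was sampled once (so the
\ttt{moment\_group} attribute can be omitted) and whose variables
follow the `R'/`I' naming described above.
\begin{xmdsCode}
<field>
  <!-- larger lattice here, but the same step size as the first run -->
  <vector>
    <!-- other vector tags -->
    <filename format="xsil" geometry_matching_mode="loose">
      diffusion_small.xsil  <!-- hypothetical file name -->
    </filename>
  </vector>
</field>
\end{xmdsCode}
Grid points of the larger field that are not covered by the file are
then set to zero, as described in the conditions above.
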

Note that a binary XSIL file produced on any architecture \emph{can} be used on any other architecture (byte swapping is automatically done if the endianness of the machine running the simulation is different to the endianness of the XSIL file), and XSIL files with the output in single-precision can also be used.

In summary, the syntax for initialisation of a vector from an XSIL file is:
\begin{xmdsCode}
<field>
  <vector>
    <filename format="xsil" moment_group="N" 
    		geometry_matching_mode="strict"|"loose">
      <!-- enter the file name here -->
    </filename>
    <fourier_space> <!-- yes, no, ... --> </fourier_space>
    <![CDATA[
    	// optional CDATA code
    	]]>
  </vector>
</field>
\end{xmdsCode}
where \ttt{"strict"} mode is the default geometry matching mode.

\subsection{Input data layout for ASCII and binary formats}

We now know the syntax for telling \xmds that we want to input data
from file; we now just need to organise the data that we are going to
input into the layout that \xmds expects.  Let's see how
this works by considering a simple example.  Imagine we have three
input vectors that we want to initialise with double precision data:
\ttt{x}, \ttt{y} and \ttt{z}.  Their values are:
\begin{alltt}
x = [ -2.0 -1.0 0.0 1.0 2.0 ]
y = [ -5e-2 1e-3 -1e-5 2e-4 -7e-2 ]
z = [ 10 20 30 50 1e3 ]
\end{alltt}
We can see that they are all 5 elements long (this will equal the
\xmdsTag{lattice} assignment), and that they can contain numbers
formatted in exponential notation.  We'll save this data into a file
called \ttt{input.dat}.  \xmds expects this data to be ordered in a
particular way, which is related to the way the data is stored
internally.  This order is an interlacing of the elements of each
vector, such that the first element of the first vector (in this case
\ttt{x}) is expected as the first entry in the input file, then the
first element of the second vector (in this case \ttt{y}) then the
first element of the third vector (\ttt{z} here), and then the second
element of the first vector and so on.

One way of describing this is in terms of C/C++ code.  The data is
expected in this format:
\begin{alltt}
x[0]
y[0]
z[0]
x[1]
y[1]
z[1]
\end{alltt}
and so on until the end of the data.  Another way of describing this
is in terms of the actual data, and so here is how the file
\ttt{input.dat} will look:
\begin{alltt}
-2.0
-5e-2
10
-1.0
1e-3
20
0.0
-1e-5
30
1.0
2e-4
50
2.0
-7e-2
1e3
\end{alltt}
If this seems unnecessarily complicated---it is.  However, this is the
way the data is expected and so we have to behave the way \xmds
expects otherwise our simulation will not work properly.  As it turns
out, storing the data within memory in this fashion means that
calculations are performed on contiguous blocks of memory, and
therefore are a lot faster than if entire vectors were stored with
their elements next to one another.  This is a significant point for
the memory utilisation internal to the simulation and for maintaining
the speed of \xmds simulations.  However, it may be possible in future
versions of \xmds for the input data to be specified more logically
(i.e.~have \ttt{x} defined first, then \ttt{y} etc.) and then for the
simulation to reorganise the data internally so that calculations are
performed efficiently and quickly.
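
The interlacing can also be written as a simple index rule (the
symbols here are ours, introduced only for illustration).  For the
one-dimensional example above, number the vectors
$c = 0, 1, \ldots, N_c - 1$ in the order in which they appear and the
lattice points $j = 0, 1, 2, \ldots$; counting entries in the file
from zero, the value of vector $c$ at point $j$ is then entry
\[
  n = j N_c + c .
\]
In the example $N_c = 3$, so \ttt{z[1]} ($c = 2$, $j = 1$) is entry
$n = 1 \times 3 + 2 = 5$, i.e.\ the sixth line of \ttt{input.dat},
which is indeed \ttt{20}.
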

If your input data is binary instead of ascii (as we have above), then
you would use the \ttt{format="binary"} assignment in the
\xmdsTag{filename} tag, and then \xmds would expect the data to be a
string of double precision numbers in the same order as that given above.

\subsection{Importing complex data}

If we want to import complex data, we just specify the real then
imaginary parts sequentially as pairs of data.  Imagine that we now
have two vectors (so we don't have to consider so many vectors) called
\ttt{x} and \ttt{y}.  They have values of (just for the sake of argument)
\begin{alltt}
x = [ 1.2+2.0i 7.5+0.0i ]
y = [ -5e-2+10i 7e10-8e-7i ]
\end{alltt}
and they will be organised in the input file as follows:
\begin{alltt}
real(x[0])
imag(x[0])
real(y[0])
imag(y[0])
real(x[1])
imag(x[1])
real(y[1])
imag(y[1])
\end{alltt}
which is
\begin{alltt}
1.2
2.0
-5e-2
10
7.5
0.0
7e10
-8e-7
\end{alltt}
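
In index form (again with our own illustrative symbols), let $N_c$ be
the number of vectors, $c$ the vector number, $j$ the lattice point,
and $p = 0$ for the real part or $p = 1$ for the imaginary part.
Counting entries from zero, each value then sits at entry
\[
  n = 2 \, ( j N_c + c ) + p .
\]
Here $N_c = 2$, so the imaginary part of \ttt{y[1]} ($j = 1$, $c = 1$,
$p = 1$) is entry $n = 2 \times (2 + 1) + 1 = 7$, the eighth and last
line of the file, which is \ttt{-8e-7}.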

With complex data the binary input method is slightly different.  The
assignment to the \ttt{format} attribute is the same
(i.e.~\ttt{"binary"}), however, instead of separating the real and
imaginary parts of the complex numbers that are to be read in, one
just has the binary representation of the complex number to be read.
So in a sense, the binary input of complex data is exactly the same as
that of double data, except that the data is complex and not double
(which seems obvious, but it sort of had to be said).

\section{Command line arguments}

Do you want to run your simulation many, many times ranging over
several different global parameters?  If the answer is yes, then the
command line argument feature of \xmds is for you.  In versions of
\xmds before \ttt{xmds-1.2}, to be able to map a parameter space, or
run the program over many different values of a simple global
variable, you had to modify your script, rerun \xmds (with its implied
compilation step) and then run the simulation \emph{for each value}.
This, put plainly, sucked, so we put in a way to pass arguments to the
simulation binary executable, enabling us to write a simple shell
script (or Perl or Python) to run our simulation over many different
values.  This removed the need to recompile the simulation again and
again, and generally speaking speeds things up and takes (at least
some of) the pain out of doing things like mapping parameter spaces.

So, how do we tell \xmds to make the simulation accept command line
arguments?  You do this with an \xmdsTag{argv} tagset, which you put
somewhere before the \xmdsTag{globals} element.  Those of you who
have worked with C before and have passed arguments to programs will
notice that we've used the \ttt{argv} name here for the list of
arguments the program will accept, in exactly the same way that C
programming does by convention.  To set up this list, we need to
specify the arguments and their relevant properties.
As such we need to tell \xmds what the name of the argument is, its
data type, and its default value (for the instances when we don't want
to specify the value on the command line).  As might be obvious here,
we have a nested structure of information, and hence the corresponding
\xmds code is similarly nested.  The syntax of adding command line
arguments to simulations is as follows:
\begin{xmdsCode}
<argv>
  <arg>
    <name> </name>                   <!-- the argument name -->
    <type> </type>                   <!-- data type of arg -->
    <default_value> </default_value> <!-- the default value -->
  </arg>
  <!-- more arg definitions here if necessary -->
</argv>
\end{xmdsCode}

We'll now go through an example to show you how to use this feature,
and some of the subtleties of using command line arguments with
\xmds-derived simulations.  Let's revisit the \ttt{diffusion}
simulation discussed in \Chap{chap:tutFromScratch},
\Sec{sec:moreComplexSimulation}.  The main use of command line
arguments is to be able to replace variables given in the
\xmdsTag{globals} element.  Therefore, let's change the diffusion
coefficient $\kappa$ (\ttt{kappa} in the code) to be an argument to
the simulation.  We do this by adding the following code before the
\xmdsTag{globals} element, and by commenting out the \ttt{kappa}
declaration and assignment in the existing code.  The \xmds code then
becomes:
\begin{xmdsCode}
<simulation>
  <!-- global parameters and functionality tags in here -->

  <!-- Command line arguments -->
  <argv>
    <arg>
      <name> kappa </name>
      <type> double </type>
      <default_value> 0.1 </default_value>
    </arg>
  </argv>

  <!-- Global variables for the simulation -->
  <globals>
  <![CDATA[
    // const double kappa = 0.1;  // diffusion coefficient
    const double sigma = 0.1;     // std dev of initial Gaussian
    const double x0 = 0.0;        // mean pos of initial Gaussian
  ]]>
  </globals>

  <!-- remainder of diffusion simulation xmds code -->
</simulation>
\end{xmdsCode}
Notice that we've commented out the \ttt{kappa} variable using the C++
line comment style.  This is just to remind us that \ttt{kappa} used
to be there and is no longer, and what it was when we originally wrote
the simulation.  It can be a good idea to keep this kind of
information around if you want, but it isn't necessary, and because
it's a comment it will be ignored by the C/C++ compiler.  Of course,
if you \emph{don't} comment the global declaration out, then the C/C++
compiler will throw an error and your simulation won't compile.

Running \xmds on the file \ttt{diffusion.xmds} now gives a simulation
binary that can accept arguments.  You can try it out by running the
simulation like so:
\begin{shellCode}
% diffusion --kappa 0.2
\end{shellCode}
where we have run the \ttt{diffusion} simulation with \ttt{kappa} now
set to 0.2.

\xmds uses the GNU \ttt{getopt} set of functions to implement
arguments, and as such supports both short and long option names.
Therefore, the above example could have been run as
\begin{shellCode}
% diffusion -k 0.2
\end{shellCode}
So, at the simplest level, \xmds takes the long form of the argument
name as the actual name of the variable, and takes the first character
of the variable name for the short form of the argument.  But what
happens when you have two variables to be entered at the command line
that start with the letter `k'?  What \xmds does to solve this problem
is, if a variable already has a short option taken (e.g.~if we had
already defined another variable in the \xmdsTag{argv} list called say
\ttt{kruntsch}), then the next character is used for the short option,
which would be the letter `a' for \ttt{kappa}.  Of course, if this
letter is taken then \xmds searches for a single character
representation of \ttt{kappa} throughout the variable name until it
finds one that isn't used.  If \xmds doesn't find a short option that
isn't used, then it throws an error.  

Assuming that everything has worked ok, and the assignments to the
short options have worked properly, how can one find out what the
short option is if it has changed?  Well, you can simply ask the
simulation for help.  Just run the simulation with either \ttt{-h} or
\ttt{--help} and it will print out the usage of the simulation and a
list of the option names, their data type, and default value.  For
instance, asking the \ttt{diffusion} simulation for help we get the
following output:
\begin{shellCode}
% diffusion --help
Usage: diffusion -k < double >

Details:
Option          Type            Default value
-k, --kappa     double          0.1
\end{shellCode}
So, we call \ttt{diffusion} with \ttt{-k} and the simulation is
expecting a double precision number after the \ttt{-k} flag.  Also, we
are told that either \ttt{-k} or \ttt{--kappa} are possible options
(but we already knew that anyway), and that \ttt{kappa} is a double
precision number of default value \ttt{0.1}.

And that's it!  At present \xmds can accept \ttt{int}, \ttt{double},
\ttt{float}, and \ttt{char *} for command line arguments.  Complex
numbers aren't yet implemented (as of \ttt{xmds-1.3}) but may be added
in a future version.
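
As a rough sketch of how several arguments of different types might be
declared together, the following \xmdsTag{argv} block follows the
syntax given above.  The second argument, \ttt{nruns}, and its default
value are hypothetical and are not part of the \ttt{diffusion}
simulation; they are only here to illustrate an \ttt{int} argument
alongside a \ttt{double} one.
\begin{xmdsCode}
<argv>
  <arg>
    <name> kappa </name>                 <!-- diffusion coefficient -->
    <type> double </type>
    <default_value> 0.1 </default_value>
  </arg>
  <arg>
    <name> nruns </name>                 <!-- hypothetical integer argument -->
    <type> int </type>
    <default_value> 1 </default_value>
  </arg>
</argv>
\end{xmdsCode}
As with \ttt{kappa}, any variable declared this way should not also be
declared in the \xmdsTag{globals} element.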

Now, imagine that we wanted to run the \ttt{diffusion} simulation over
a range of values starting from \ttt{0.1} to \ttt{1.0}.  To do this we
could write a simple shell script as follows:
\begin{shellCode}
#!/bin/sh

for i in 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
do
  echo "Running diffusion with kappa = $i"
  echo "i.e. diffusion --kappa $i"
  diffusion --kappa "$i"
  mv diffusion.xsil "diffusion_$i.xsil"
done
\end{shellCode}
Notice that we've moved the output data file to a new filename.  The
file \ttt{diffusion.xsil} is produced each time the simulation is
run, so had we not renamed the \ttt{.xsil} file, our data would have
been overwritten with new data on every run.

Equivalently, we could have used a Perl script to do the same thing,
for instance:
\begin{perlCode}
#!/usr/bin/perl -w

use strict;
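# Step an integer counter so that rounding error cannot accumulate in $i
for my $n (1 .. 10) {
  my $i = sprintf("%.1f", $n / 10);   # 0.1, 0.2, ..., 1.0
  print "Running diffusion with kappa = $i\n";
  print "i.e. diffusion --kappa $i\n";
  my @args = ("diffusion", "--kappa", $i);
  system(@args);
  # keep the output of this run from being overwritten by the next one
  rename("diffusion.xsil", "diffusion_$i.xsil")
    or warn "could not rename diffusion.xsil: $!\n";
}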
\end{perlCode}

Feel free now to extend (and have a play with) \ttt{diffusion.xmds}.
For example, change the simulation to make it possible to vary
\ttt{sigma} (and even \ttt{x0}) and see how the output changes, and in
what regimes the assumption holds that our window size is large enough
for the implicit periodic boundary conditions to be unimportant (for a
discussion of what this last part of the sentence means, have a look
at \Sec{sec:moreComplexSimulation}).

\section{Preferences}
\label{sec:preferences}

As of \ttt{xmds-1.3-1}, there has been the ability to have preferences
specific to a given user.  This gives the user more flexibility than
before as they can control how simulations are built without having to
recompile and reinstall \xmds.  Also, if \xmds was installed by the
root user, a non-root user of the system can modify how their
simulations are built without having to reinstall the system binary
and therefore alter how all other users' programs are built.

\subsection{Turning preferences on and off}

By default preferences are on, and are used if the preferences file
can be found.  If the preferences file cannot be found, or if
\xmdsTag{use\_prefs} is set to \ttt{no} then the default settings
defined at configuration and installation of \xmds will be used.  This
is also the case for preference flags that are not specified in the
preferences file: the default settings will be used.  Therefore, one
doesn't need to specify every available setting, just the ones one
wishes to change.

The \xmdsTag{use\_prefs} tag is set in the global configuration
section of the \xmds simulation script, namely at the beginning with
the \xmdsTag{name}, \xmdsTag{prop\_dim} etc. tags.
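
As a minimal sketch, switching preferences off for a single simulation
might look like the following.  The simulation name \ttt{diffusion} is
only an assumption carried over from the earlier example, and the
other global configuration tags are elided.
\begin{xmdsCode}
<simulation>
  <name> diffusion </name>
  <!-- other global configuration tags, e.g. prop_dim -->
  <use_prefs> no </use_prefs>  <!-- ignore any xmds.prefs file -->

  <!-- remainder of the simulation script -->
</simulation>
\end{xmdsCode}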

If \xmdsTag{use\_prefs} is set to \ttt{yes} explicitly (or not set at
all) then \xmds will search for a preferences file.  This file is
called \ttt{xmds.prefs} and can reside either in the user's
\ttt{\$HOME/.xmds} directory or the directory local to the simulation
script being processed.  \xmds searches for the file in the user's
home directory first and then in the current directory.

\subsection{Setting preferences}

Preferences are set in the \ttt{xmds.prefs} file by using key-value
pairs delimited by an equals sign (\ttt{=}).  Therefore, in general,
the format is:
\begin{alltt}
key = value
\end{alltt}

Spaces around the equals sign are ignored, such that the following are
all equivalent:
\begin{alltt}
key=value
key =value
key= value
key = value
\end{alltt}

The hash character (\ttt{\#}) is used for comments; the \ttt{\#} and
anything after it on the same line is ignored when parsing the
options.  Comments can therefore be used to make it much clearer what
each setting is meant to do.  For example:
\begin{alltt}
# this is my funky preferences file
option = some_funky_value some_other_funky_value
other_option = also_quite_funky_variable  # isn't this variable funky?
\end{alltt}
Here the first line will be ignored, as will the \ttt{\#} on the third
line and the text after it.

\subsection{What are the options?}

The options for building simulations that \xmds accepts are:
\begin{itemise}
  \item \xmds options:
    \begin{description}
    \item[\ttt{XMDS\_CC}] the C (C++) compiler \xmds will use.
      Typical options include: \ttt{cc}, \ttt{gcc}, \ttt{g++}.
    \item[\ttt{XMDS\_CFLAGS}] the flags passed to the C (C++)
      compiler.  For those familiar with the \ttt{make} utility,
      these flags are the same as the \ttt{CFLAGS} variable ordinarily
      passed to the C compiler.
    \item[\ttt{XMDS\_LIBS}] the libraries and library paths necessary
      to build the simulation.  Again, for those familiar with
      \ttt{make} this is the same as the \ttt{LIBS} variable.
    \item[\ttt{XMDS\_INCLUDES}] the include paths and files for the C
      (C++) compiler to look for when compiling the simulation.
    \end{description}
  \item MPI options:
    \begin{description}
    \item[\ttt{MPICC}] the C (C++) compiler used by the local MPI
      implementation to compile the simulation.  Often this is just
      \ttt{mpicc}, but on systems such as the APAC supercomputer in
      Canberra, Australia, this actually is just \ttt{cc}.
    \item[\ttt{MPICCFLAGS}] the \ttt{CFLAGS} variable to be passed to
      the MPI C (C++) compiler.
    \item[\ttt{MPILIBS}] the extra libraries necessary to compile a
      simulation for use with MPI.  For instance, on cluster systems
      an MPI implementation such as LAM will be used.  In this case,
      the extra libraries necessary to compile a simulation will be
      something like: \ttt{-lmpi -llam}.
    \end{description}
  \item FFTW options:
    \begin{description}
      \item[\ttt{FFTW\_LIBS}] the libraries and library paths specific
	to your fftw installation.
      \item[\ttt{FFTW\_MPI\_LIBS}] the libraries and library paths
	specific to your fftw installation necessary so that you can
	use fftw with MPI.  Warning: only use fftw with MPI on a
	supercomputer.  To perform Fourier transforms in parallel, a
	lot of communication between the nodes is necessary, hence
	this is only worthwhile on a supercomputer with a high speed
	network connection between nodes.
      \item[\ttt{FFTW3\_LIBS}] the libraries and library paths specific to your installation of fftw version 3.
      \item[\ttt{FFTW3\_THREADLIBS}] the additional libraries and library search paths required for using threaded fftw3 simulations.
    \end{description}
  \item User defined options (at configuration):
    \begin{description}
      \item[\ttt{USER\_LIB}] if \xmds has been installed in the user's
	home directory, then this flag needs to be specified so that
	\xmds can find the \xmds-specific libraries when building a
	simulation.
      \item[\ttt{USER\_INCLUDE}] if \xmds has been installed in the
	user's home directory, then this flag needs to be specified so
	that \xmds can find the \xmds-specific header files when
	building a simulation.
    \end{description}
\end{itemise}

\subsection{Examples of changing options}

\begin{description}
\item[Using \ttt{gcc} for \ttt{g++}:] When configuring \xmds before
  installation, the configuration script often sets the default C/C++
  compiler for \xmds to be \ttt{g++}.  This is the GNU C++ compiler.
  Sometimes this is not desirable, and so one may wish to use the GNU
  C compiler (\ttt{gcc}) instead.  To do this, one needs to change two
  things: the \ttt{XMDS\_CC} setting, and the \ttt{XMDS\_LIBS}
  setting.  In the \ttt{xmds.prefs} file one then sets \ttt{XMDS\_CC}
  to \ttt{gcc}, and appends \ttt{-lstdc++} to the list of flags
  already given for \ttt{XMDS\_LIBS} at installation.  The addition of
  \ttt{-lstdc++} makes \ttt{gcc} link against the C++ standard
  library, which it does not do by default, so that it can actually
  build the simulation (which is in some sense a mix of C and C++).
\item[Using \ttt{icc} for \ttt{g++}:] An alternative C++ compiler to
  \ttt{g++} is the Intel C++ Compiler: \ttt{icc}.  To use this
  compiler instead of \ttt{g++} or \ttt{gcc} (assuming of course that
  you have the compiler installed on your system) just set
  \ttt{XMDS\_CC} to \ttt{icc} and prepend \ttt{-limf} to
  \ttt{XMDS\_LIBS} (this adds the \ttt{icc} native support for its
  maths libraries).
\item[Debugging:] By default, \xmds is configured to use quite
  aggressive optimisations when compiling simulations.  If, however,
  you suspect something is going wrong and you wish to debug the
  simulation binary directly (using \ttt{gdb}, \ttt{dbx} or another
  symbolic debugger), then you will need to put symbolic debugging
  information into the binary executable.  To do this, replace the
  default options by setting the \ttt{XMDS\_CFLAGS} variable to \ttt{-g}.
\item[Profiling:] There may be instances when one wants to find out
  what part of the code is taking up the most time when running a
  simulation.  This is generally speaking a part of debugging and
  testing a simulation and not normally part of using \xmds.  However,
  if you're interested in seeing what lines of code are using the most
  time, you'll want to add profiling information to the code.  To do
  this you will need to add either the \ttt{-p} or \ttt{-pg} option to
  the \ttt{XMDS\_CFLAGS} variable.  The \ttt{-p} option generates
  extra code for profiling with the \ttt{prof} utility, and the
  \ttt{-pg} option generates extra code for profiling with the
  \ttt{gprof} utility.
\end{description}

\section{Breakpoints}
\label{sec:Breakpoints}
Breakpoint elements are parts of a simulation (similar to an \xmdsTag{integrate} or a \xmdsTag{filter} element) that cause the state of some vectors of the simulation to be saved to an XSIL file when the breakpoint element is hit. This can be used early in a simulation to check whether, for example, the simulation is running off the grid or otherwise behaving incorrectly, and so needs to be terminated. This way, one avoids wasting much time waiting for the result of a long simulation that needs to be re-run anyway.

Another use of breakpoint elements is to save the state of some (or all) vectors to an XSIL file for loading by another simulation (as discussed earlier in \Sec{subsec:InitialisationFromXSILFile} of this chapter). The naming convention for the vectors (and components of complex vectors) in the XSIL file produced is the same as that used for loading XSIL files as described in \Sec{subsec:InitialisationFromXSILFile}.

Although creating an XSIL file to be used for initialising another simulation can be achieved almost as easily with an output moment group, breakpoints should be used instead of output moment groups for large deterministic simulations that use MPI. Currently (\ttt{xmds-1.5-2}), because of the way in which output moment groups are sampled with MPI, each node allocates the entire memory required for sampling each output moment group. This means that sampling the entire field for a large simulation that will not fit into the memory of a single node is impossible, and hence creating an XSIL file from which the simulation could be continued is also impossible. As the intended use of moment groups is that they should be used to sample a small amount of data, an alternative solution was required for this situation. Breakpoint elements have been designed such that the additional memory use when used in a deterministic simulation with MPI is equal to the size of the field stored on any given node (and only while the XSIL file is being written), instead of the total field size (for the entire simulation).

The syntax for a breakpoint element (this should be in a \xmdsTag{sequence} element) is:
\begin{xmdsCode}
<breakpoint>
	<filename> <!-- XSIL filename for output, 
	                e.g. simulation.xsil--> </filename>
	<fourier_space> <!-- yes, no, ... -->  </fourier_space>
	<vectors> <!-- list of vectors to be saved to the file -->
	</vectors>
</breakpoint>
\end{xmdsCode}
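
As an illustration only, a filled-in breakpoint placed between two
stages of a simulation might look like the sketch below.  The filename
\ttt{checkpoint.xsil} and the vector name \ttt{main} are assumptions
made for the sake of the example; substitute the filename and vectors
appropriate to your own simulation.
\begin{xmdsCode}
<sequence>
  <!-- integrate elements for the first part of the simulation -->

  <breakpoint>
    <filename> checkpoint.xsil </filename>   <!-- hypothetical filename -->
    <fourier_space> no </fourier_space>
    <vectors> main </vectors>                <!-- assumed vector name -->
  </breakpoint>

  <!-- integrate elements for the remainder of the simulation -->
</sequence>
\end{xmdsCode}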

% advanced topics??  
% What about wisdom, binary output, binary input, single and double
% precision output, benchmarking of the code, error checking,
% stochastic simulations, cross propagating fields, post-processing,
% manipulation of data in Fourier space etc, etc.