File: develAndProgStructure.tex

package info (click to toggle)
xmds-doc 0~svn.1884-3.1
links: PTS, VCS
area: main
in suites: jessie, jessie-kfreebsd, wheezy
size: 8,336 kB
ctags: 192
sloc: makefile: 135; python: 55
file content (220 lines) | stat: -rw-r--r-- 10,662 bytes
parent folder | download | duplicates (3)
% $Id: develAndProgStructure.tex 1339 2007-05-09 05:07:30Z joehope $

% Copyright (C) 2000-2007
%
% Code contributed by Greg Collecutt, Joseph Hope and the xmds-devel team
%
% This file is part of xmds.
%
% This program is free software; you can redistribute it and/or
% modify it under the terms of the GNU General Public License
% as published by the Free Software Foundation; either version 2
% of the License, or (at your option) any later version.
%
% This program is distributed in the hope that it will be useful,
% but WITHOUT ANY WARRANTY; without even the implied warranty of
% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
% GNU General Public License for more details.
%
% You should have received a copy of the GNU General Public License
% along with this program; if not, write to the Free Software 
% Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.

\chapter{Development and Program Structure} 
\label{chap:codeStructure}

It only takes a few items of information to define a numerical
simulation, yet the list of instructions necessary to get a computer
to calculate the solution is vast in comparison. Going from the former
to the latter is non-trivial. It can be done in a single step (as far
as the user is concerned) with high level programming languages, but
most computer programmers know from experience that the numerical
solution can be calculated more efficiently by writing comparatively
more code in a low level script, than less code in a high level
script. This is because the programmer knows the purpose of the
program, and can optimise it accordingly as the list of instructions
is expanded. On the other hand a compiler can only perform
optimisations that it knows are guaranteed to work regardless of the
purpose of the program.

By restricting the scope of programs to those with a known range of
purposes, one is able to write a compiler that generates more
efficient output code. In other words the possible input code is
constrained to a very small set of instructions which, even though
they may be complex, have a well defined expansion into a simpler
instruction set. Further, it is not necessary to define the expansion
down to the level of assembly code -- one only needs to define the
expansion to a point where it no longer relies critically on the
compiler's optimisation ability. This is how \xmds works:
transform from specialised input script to {\em efficient} C code, and
then let an ordinary compiler do the rest.

In choosing the high level syntax we have attempted to employ some
degree of foresight: the future is likely to see more interconnection
of computer networks around the world, and information is already
taking a standardised form for transfer. Until now this has primarily
been HTML (Hyper Text Mark-up Language), but now there is increasing
interest in an ``extensible'' mark-up language of which HTML is only a
subset. This is known as XML (eXtensible Mark-up
Language)~\cite{web:XML_syntax,Harold:2002:1}. Also, a standard for data
interchange has been developed at Caltech~\cite{web:caltech_xsil} which is
a subset of XML: XSIL or eXtensible Scientific Interchange Language,
and this has been chosen as the format for field data input/output.

Using XML for the input script has another significant advantage: the
format for the simulation data takes on a tree-like structure which
can be extended and made as complex as is necessary. This is the
extensible property of XML. Appropriately named element tags are used
to mark-up the essential data, and may be nested accordingly to group
the information. The result is a logical human-readable script with
data represented in a manner preserving both synthetic and sequential
relationships.

\xmds has undergone a few development iterations for the purpose of
refinement and extension. There have been two particularly difficult
aspects. The first was in defining an appropriate equation
syntax. This syntax had to be in C code style since \xmds merely
transplants these equations into the output C code (which it then
compiles), and yet it had to be capable of representing equations of
the form shown in \eqn{eq:xmdsPdeEx}. Further, it was desired that the
C code form of such an equation must be the same regardless of the
type of integration algorithm chosen. This difficulty was compounded
by the desire to include split-operator algorithms, which process the
linear and nonlinear terms entirely separately. The result that we
have achieved here is quite elegant, though there are a few caveats
with regard to writing the equations.

The second area of difficulty was enabling the user to define exactly
what they wanted as output. The simplest solution would have been to
save the raw field data to file as the simulation progressed, and the
user left to process it afterward in their own preferred plotting and
calculation package. However, with stochastic problems the user
usually wishes to calculate the average of some property or {\em
moment} of the field over many different {\em trajectories} or
integration runs. Therefore it was necessary to enable to user to
define such moments so that they may be calculated and averaged before
the output data is written to file.

The source code for \xmds is written in object oriented C, otherwise
known as C++. The main routine is very simple, and its procedure is as
shown on the left side in \fig{fig:xmdsMain}. The right side is
performed by the simulation data class.

\begin{figure}[ht]
\centerline{\includegraphics[width=\figwidth]{figures/main}}
\caption{\xmds main procedure}
\label{fig:xmdsMain}
\end{figure}

A non-validating XML parser processes the input file and populates a
node tree based on the Document Object Model~\cite{web:dom3}. This parser
was written by the author, but there is little need to describe it in
detail here. This node tree is then passed to a simulation class
object which extracts its own relevant data, and in turn creates child
objects to process the relevant sub-trees.  The class structure of
\xmds is shown in \fig{fig:xmdsClasses}.

\begin{figure}[ht]
\centerline{\includegraphics[width=\textwidth]{figures/classes}}
\caption{\xmds class hierarchy}
\label{fig:xmdsClasses}
\end{figure}

Once the input file has been processed, a ``dry run'' of the main
sequence tree is called in order to evaluate the total number of
samples requested for each output moment group. Finally the output
code is generated. This is accomplished with a ``tree walking''
technique. To begin with the simulation object writes any include
statements that are necessary. Then it writes any define statements
particular to itself, and then calls each of its child objects to do
the same. This process is continued down the tree until every object
in the tree has written its define statements. Starting again at the
simulation element, the process is repeated to write the global
variables, the routine prototypes, and finally the routines
themselves. This tree walking is performed by the Element class with
the virtual functions, aptly titled, \ttt{writeDefines()},
\ttt{writeGlobals()}, \ttt{writePrototypes()}, and
\ttt{writeRoutines()}. These routines are then overridden in the
derived classes to write the code specific to each class.

As mentioned in the previous section, one of the primary areas of
difficulty was developing a C-code style syntax which would allow a
large range of complex PDEs to be encoded, and in a manner that
remains independent of the exact integrate algorithm chosen.  This was
implemented using a two step process. To begin with, the user is
required to write equations with any differential operators replaced
by an operator[component] representation. For example, the NLSE shown
in \eqn{eq:xmdsNlse1},
\begin{equation}
\frac{\partial \phi}{\partial z } = i\left[\frac{\partial ^{2}
\phi}{\partial t ^{2}} + |\phi|^{2} \phi \right],
\label{eq:xmdsNlse1}
\end{equation}
would be written in C-code style as:
\begin{CCode}
dphi_dz =  L[phi] + i*~phi*phi*phi;
\end{CCode}
where the operator \ttt{L} is defined in Fourier space as being
\begin{CCode}
L = -i*kt*kt/2;
\end{CCode}

The first step is to search the user's equations for such
operator[component] expressions, and replace them with something else
depending on the algorithm chosen. In the explicit picture they are
replaced with a reference to an internally calculated vector (which is
exactly the derivative calculated with the Fourier transform method),
whereas in the interaction picture the field is evolved in Fourier
space according to the operator[component] expressions found, and in
the original code the text strings for these expressions are replaced
with zero. The second step is to use \ttt{\#define} macros to map the
equation variables to the internal data arrays within the generated
code. Thus for the example above, using an explicit picture method the
output code in the derivative calculation routine becomes:
\begin{CCode}
void _sg1_calculate_delta_a(
     const double& _step) {
complex dphi_dz;
_main_main_go_space(0);
_main_sg1_coterms_go_space(0);
unsigned long _main_main_pointer=0;
unsigned long _main_sg1_coterms_pointer=0;
double t = _main_xmin0;
for(unsigned long _i0=0; _i0<_main_lattice0; _i0++) {
//************** propagation code **************
dphi_dz =  _sg1_coterm_L_phi + i*~phi*phi*phi;
//**********************************************
     _main_main[_main_main_pointer + 0] = dphi_dz*_step;
     _main_main_pointer += _main_main_ncomponents;
     _main_sg1_coterms_pointer += _main_sg1_coterms_ncomponents;
     t += _main_dx0;
     }
}
\end{CCode}
with the following relevant define statements placed at the head of
the file:
\begin{CCode}
// field main defines
#define _main_ndims 1
#define _main_lattice0 100
#define _main_xmin0 -5.000000e+00
#define _main_dx0 ((5.000000e+00 - -5.000000e+00)/(double)100)
#define _main_dk0 (2*M_PI/(5.000000e+00 - -5.000000e+00))
#define dt _main_dx0
#define dkt _main_dk0
// vector main defines
#define _main_main_ncomponents 1
#define phi _main_main[_main_main_pointer + 0]
// vector sg1_coterms defines
#define _main_sg1_coterms_ncomponents 1
#define _sg1_coterm_L_phi _main_sg1_coterms[_main_sg1_coterms_pointer+0]
\end{CCode}

This does not make for compact code, but it was desired that the
generated code remain readable by the user so that they may inspect it
to see exactly what \xmds is doing in regards to solving their
problem. It also has the added advantage that, should a user wish to
do something rare and obscure, they only need to write a short script
to generate a base C-code listing which they may then alter to suit
their problem.