File: formrun.tex

package info (click to toggle)
form 4.2.1%2Bgit20200217-1
  • links: PTS, VCS
  • area: main
  • in suites: bullseye
  • size: 5,500 kB
  • sloc: ansic: 101,613; cpp: 9,375; sh: 1,582; makefile: 505
file content (568 lines) | stat: -rw-r--r-- 31,501 bytes parent folder | download | duplicates (3)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
\section{Discussion of a typical \FORM\ run}

We discuss in the following what is happening inside \FORM\ when it executes a
given program. The discussion focuses more on the interplay between the various
parts of \FORM\ and on key concepts of the internal data representation than on
in-depth details of the code. For the latter, the reader is referred to section
\ref{sec:indepth}. This section should for better comprehension be read with the
referenced \FORM\ source files opened aside.

We consider the following exemplary \FORM\ program \C{test.frm} (which we run
with the command "\C{form test}"):

\begin{verbatim}
 1      #define N "3"
 2      
 3      Symbol x, y, z;
 4      
 5      L	f = (x+y)^2 - (x+z)^`N';
 6      L	g = f - x;
 7      
 8      Brackets x;
 9      Print;
10      .sort
11      
12      #do i=2,3
13      Id	x?^`i' = x;
14      #enddo
15      
16      Print +s;
17      .end
\end{verbatim}

The entry function \C{main()} is in \C{startup.c}. It does various
initializations before it calls the preprocessor \C{PreProcessor()}, which
actually deals with the \FORM\ program. The code shows some typical features:
Preprocessor macros are frequently used to select code specific to certain
configurations. The two most common macros can be seen here: \C{WITHPTHREADS}
for a \TFORM\ executable and \C{PARALLEL} for a \PARFORM\ executable. Macros are
used to access the global data contained in the variable \C{A}, like 
\C{AX.timeout} for example. The code uses (usually) own functions instead of
standard functions provided by the C library for common tasks. Examples in
\C{main()} are \C{strDup1} or \C{MesPrint} (replacing \C{printf()}). Another
very often used function is \C{Malloc1()} replacing \C{malloc()}. The reasons
are better portability and the inclusion of special features.  \C{Malloc1()} for
example makes a custom memory debugger available while \C{MesPrint()} knows
among other things how to print encoded expressions from the internal buffers.

% needs to be rewritten ->
The initializations in \C{main()} are done in several steps. Some like the
initialization of \C{A} with zeros is done directly, most others are done by
calls to dedicated functions. The initializations are split up according to the
type of objects involved and the available information at this point. The
command line parameters passed to \FORM\ (none in our example run) are treated
in the function \C{DoTail()}. After that, files are opened and also parsed for
addtional settings. Then, as all settings are known, the large part of the
internal data is allocated and initialized. Finally, recovery settings are
checked, threads are started if necessary, timers are started, and variable
initializations that might need to be repeated later (e.g. clear modules) are
done in \C{IniVars()}.

The call to \C{OpenInput()} reads the actual \FORM\ program into memory. The
input is handled in an abstract fashion as character streams. The stream implementation
(\C{tools.c}) offers several functions to open, close, and read from a stream.
Streams can be of different types including files, in-memory data like parts of
other streams or dollar variables, as well as external channels. The access to
the characters in all streams though is nicely uniform. In
\C{OpenInput()} a stream is representing our input file. Most of the logic
there deals with the jump to the requested module (skipping clear instructions).
It uses the function \C{GetInput()} to get the next character in the stream.
Which stream it reads from is determined by the variable \C{AC.CurrentStream}.
This global variable in the sub-struct \C{C\_const} of the \C{ALLGLOBALS}
variable \C{A} is an example of how the different parts of \FORM\ typically
communicate with each other by means of global variables.

Next is the preprocessor. The preprocessor is implemented in the function
\C{PreProcessor()} in \C{pre.c}. This function consists basically of two nested
for-loops without conditions (\C{for (;;) \{ \ldots \}}). The outer loop deals
with one \FORM\ module for each iteration, the inner loop deals with one input
line. We have certain initializations done before in our example the code runs
into the inner loop, where \C{GetInput()} reads our input file. The variables
are all set such that the reading starts from the beginning of out input file.

The input in variable \C{c} is tested for special cases. Whitespaces are
skipped. Comments starting with a star \C{*} (unless \C{AP.ComChar} is set to a
different character) are also skipped including whole folds. The crucial check
on \C{c} is the if-clause that checks it for being a preprocessor command (\C{\#}),
a module statement (\C{.}), or something else which is usually an ordinary
statement.

\begin{verbatim}
 1      #define N "3"
\end{verbatim}

In our case, we have a preprocessor command in the input. The function
\C{PreProInstruction()} is called to read and interpret the rest of the line.
The first part deals with the loading of the command in a dedicated buffer. For
the moment, we ignore the details for the special treatment of cases when we are
already inside a if or switch clause in a \FORM\ program. In our run, the
function \C{LoadInstruction(0)} is simply called.

\C{LoadInstruction()} copies input into the preprocessor instruction buffer.
Three variables govern this buffer: \C{AP.preStart} points to the start of the
buffer, \C{AP.preFill} to the point where new input can be copied to, and
\C{AP.preStop} to (roughly) the end of the buffer. This setup is quite typical
for buffers in \FORM. The memory is allocated at the start of \FORM. Later, like
at the end of \C{LoadInstruction()}, if the buffer gets to small, it can be
replaced by a larger memory patch with the help of utility functions like
\C{DoubleLList()}. The contents is copied from the old to the new buffer. Since
this dynamical resizing of buffers needs to be done with most buffers
occationally, most buffers in \FORM\ store data such that it easily allows for
copying, i.e.  usually C pointers are avoided and instead numbers representing
offsets are used. Since the preprocessor instruction buffer just contains
characters there is no problem here.

In \C{LoadInstruction()} with our input and the mode set to 5 the input is just
copied directly without any special actions taking except for a zero that is
added at the end of the data. \C{PreProInstruction()} examines the data in the
preprocessor instruction buffer for special cases, and then does a look-up in
the \C{precommands} variable. This is a vector of type \C{KEYWORD} which enables
the translation of a string (the command) to a function pointer (the C function
that performs the operations requested by preprocessor command).
\C{FindKeyword()} does these translations and the found function pointer is then
dereferenced with the rest of the input in the instruction buffer as an argument.

The function pointer will point to \C{DoDefine()} in our case. \C{DoDefine()}
just calls \C{TheDefine()} that does the work. The if-clauses for \\
\C{AP.PreSwitchModes} and \C{AP.PreIfStack} are present in most of the
functions dealing with preprocessor commands. They check whether we are in a
preprocessor if or switch block that is not to be considered, because the
condition didn't hold. Then, the standard action is to just exit the current
function leaving it with no effect. Since there are preprocessor commands like
\C{\#else} or \C{\#endif} this decision can only be taken at this level of the
execution and requires the repeated use of this idiom.

The function scans through possible arguments and the value. In the value, special
characters are interpreted. Ultimately, the preprocessor variable is created and
assigned in the called function \C{PutPreVar()}. The variable \C{chartype}
deserves an explanation. One will find it used very often in the C code that
does input parsing. \C{chartype} is actually a macro standing in for
\C{FG.cTable}. This global, statically initialized (in \C{inivar.h}) vector
contains a value of every possible ASCII character describing its parsing type.
The parsing type groups different ASCII characters such that the syntax checking
is facilitated, see \C{inivar.h} for details.

In \C{PutPreVar()} we get into the details of the name administration. We will
just comment on some of the more general features. \C{NumPre} and \C{PreVar} are
macros to access elements in \C{AP.PreVarList}. The type of \C{AP.PreVarList} is
\C{LIST}. This is a generic type for all kinds of lists and it is used for many
other variables in \FORM. A \C{LIST} stores list entries in a piece of
dynamically allocated memory that has no defined type (\C{void *}). The utility
functions for managing \C{LIST}s like \C{FromList()} are ignorant about the
actual contents and perform list-specific operations like adding, removing or
resizing a list. An actual entry can be accessed by some pointer arithmetic and
type casting. The \C{PreVar} macro contains such a cast to the type \C{PREVAR}
which represents a preprocessor variable.

\C{PutPreVar()} creates a new list entry for us and basically copies the
contents of the parameter \C{value} to the memory allocated to \C{PREVAR}'s
\C{name}. So, by writing \C{PreVar[0]->name} or \C{PreVar[0]->value} we could
access the strings \C{N} or \C{3}.

In \C{TheDefine()} the function \C{Terminate()} is used several times. This
function ultimately exits the program, but first tries to clean up things and
print information about the problems causing this program termination.

\begin{verbatim}
 2      
 3      Symbol x, y, z;
\end{verbatim}

In our run, we return to the function \C{PreProcessor()} and start a new inner
loop iteration that reads a new line. After skipping the empty line we end up
in the else-branch of the big if-clause testing \C{c} this time. Here the major
steps are: we check again whether we are in a preprocessor if or switch, call
\C{LoadStatement()} to read and prepare the input, and call
\C{CompileStatement()} to perform the actions requested by the statement. Th
programs enters the compiler stage.

We also see a call to \C{UngetChar()}, which puts back the character that has
been read into the input stream. This is necessary, because \\
\C{LoadStatement()} and \C{CompileStatement()} need the complete line for
parsing. The variable \C{AP.PreContinuation} is used several times. This variable
deals with statements that span several input lines. \C{LoadStatement()} can
recognize unfinished statements and sets this variable accordingly.

\C{LoadStatement()} basically copies the input to the compiler's input buffer at
\C{AC.iBuffer} (which has \C{AC.iPointer} and \C{AC.iStop} associated to it). It
modifies the copy if necessary. The modification are to replace spaces by commas
or insert commas at teh right spots to separate tokens. The interpretation steps
that are following rely on these synactic conventions.

The call to \C{CompileStatement()} is done only if no errors occured and all
lines of a statement have been gathered into the compiler's input buffer.
\C{CompileStatement()} is called with the address of this input buffer and tries
to identify the statement. Like in the preprocessor, the input string is search
in a vector of \C{KEYWORD}s (in \C{compiler.c} and if found, a function pointer
is dereferenced to the function that actually deals with the command and its
options and arguments.  Here, we have actually two vectors of \C{KEYWORD}s,
because some statements might be stated in abbreviated form. The function
\C{findcommand()} deals with the search. \C{CompileStatement()} does some small
extra work, like for example checking the correct order of statements. In our
case, it calls the function \C{CoSymbol()}. This functions is in file
\C{name.c}, because as a declaration it basically adds something to the name
administration. Functions for other statements can be found in \C{compcomm.c}
and \C{compexpr.c}.

\C{CoSymbol()} loops over the arguments and adds proper variable names together
with their options to the symbols list \C{AC.Symbols} and the name
administration (in the call to \C{AddSymbol()}.  In our case, we have \C{x},
\C{y}, and \C{z} added. We have already encountered the basic mechanism of how a
specific struct is added to a \C{LIST}. The name administration was not
explained before, though.

Symbols can appear in expressions that need to be encoded. The coding for
symbols can simply be its entry index in the list \C{AC.Symbols}, but symbols
also need to be recognized when an expression is parsed. Therefore a efficient
look-up mechnism is required. This is achieved by a second data structure that
holds the name strings in a tree for fast searching. The data in the symbol list
does not contain the name string itself, but contains a referece (a index) into
this name string tree. The tree is managed by generalized functions and types
that are also used for other, similiar objects like vectors, indices, etc. The
functions for name trees are located in the first part of the file \C{name.c}.
The types \C{NAMENODE} and \C{NAMETREE} are defined in \C{structs.h}.
\C{NAMENODE}s are the node of a balanced binary tree. It does not hold the
name string just an index into \C{NAMETREE}. The actual data is contained in 
\C{NAMETREE} that constitute one tree. This type has buffers for the nodes and
for the name strings. This has the benefit of avoiding small malloc calls for
individual nodes. Also, since all referencing is done via offsets into these
buffers, a relocation or serialization of such a tree is very easy. In the
struct \C{C\_const} (aka the global \C{AC}) several name trees are defined, for
dollar variables, expressions, etc. The symbols added in our example program go
into the nametree referenced by \C{AC.activenames}, which is at this point equal
to \C{AC.varnames}.

Our program returns to the \C{PreProcessor()} and starts parsing the next lines:

\begin{verbatim}
 5      L	f = (x+y)^2 - (x+z)^`N';
 6      L	g = f - x;
\end{verbatim}

This time the function \C{DoExpr()} will get called (via \C{CoLocal()}) for each
line to do the parsing.  The function \C{DoExpr()} first tries to figure out
what type of \C{Local} statement we have. In our cases we have an actual
assignment. With the call to \C{GetVar()} we check whether a variable of the same
name already exists. The search is done in the nametrees \C{AC.varnames} and
\C{AC.exprnames}. Since our names are new we don't find a previous variable and
simply call \C{EntVar()}. \C{EntVar()} creates an entry in \C{AC.ExpressionList}
and puts the name into the \C{AC.exprnames} nametree. The entry in
\C{AC.ExpressionList} is of type \C{struct ExPrEsSiOn}. There are more struct
elements than in the case of symbols, but the principle is the same. Up to now,
the right-hand-side (RHS) has not been looked at and therefore no information
about it is saved in the expression's entry yet. The connection between the
expression's entry in the \C{AC.ExpressionList} and the data containing the RHS
will be made via the elements \C{prototype} and \C{onfile} as we will describe
soon.  The access to elements in \C{AC.ExpressionList} is facilitated by the
macro \C{Expressions}. The following code in \C{DoExpr()} builds up a so-called
prototype and puts the RHS in encoded form into the buffer system via the call
to \C{CompileAlgebra()}.

\FORM\ uses the allocated memory in \C{AT.WorkSpace} for operations like the
generation of terms. This memory stores \C{WORD}s and is used in a stack-like
fashion with the help of the pointer \C{AT.WorkPointer}. A function can write to
this memory and set \C{AT.WorkPointer} beyond the written data to insure that
other functions that are called and might use the workspace as well do not
overwrite this data. It is the responsibility of the function to reset
\C{AT.WorkPointer} to its original value again (see variable \C{OldWork} in our
case). Every thread in \TFORM\ will have its own private work space.

\FORM\ now uses \C{AT.WorkSpace} to build up a data structure that contains
everything that needs to be known at a later stage about the expression that is
parsed. The creation and the layout of the data is quite typical. First comes a
header that signifies what is coming. Here, it is \C{TYPEEXPRESSION}. Then comes
the length of the whole data, i.e. the total number of occupied \C{WORD}s. The
actual contents is following, which is a so-called subexpression that we will
discuss soon. The contents is followed by a coefficient and a zero, which
signifies the end of the data.

{\bf Coefficients} are coded in \FORM\ always in the following manner: Since
coefficients can in general be fractional numbers, we encode an integer
numerator and an integer denominator. The integers can have arbitrary length
(limited only by the buffer sizes, see the setup variables \C{MaxNumberSize} and
\C{MaxTermSize}) and are encoded in \C{WORD}-pieces in little-endian convention.
The number of allocated \C{WORD}s is always the same for the numerator and the
denominator. The last word of the coefficient contains the size of the whole
coefficient in words. The formal structure of a coefficients is therefore like
this:
\begin{center}
{\it NUMERATOR WORDS, DENOMINATOR WORDS, LENGTH}.
\end{center}
The integers are always
unsigned, i.e. positive. Negative fractions are encoded by a negative length.
Examples (with 16bit words): $2^{16}+2 = 65538$ gives words 2,1,1,0,5 and $-5/2$
gives $5,2,-3$.

The data structure in \C{AT.WorkSpace} is basically an instruction for the
generator, a central function that does the main work during the execution of
the \FORM\ program, to generate an expression. The content of the expression is
a subexpression. This is a pointer to the real content of the expression and
will be substituted later after the execution. The main reason for this delayed
expression insertion is that it can often save a lot of intermediate operations
and data space and thereby speed up \FORM. A case where such a thing can happen
is, when an expression is used at different places and the different parts are
brought together by some operations. Then, cancellations may occur or terms can
be factored out and when the expressions finally is inserted the workload is
less.

	In our example run, the data that will later instruct the generator to 
create an expression looks in total like this:

\begin{center}
{\it TYPEEXPRESSION, SUBEXPSIZE+3, 9, SUBEXPRESSION, \\
SUBEXPSIZE, 0, 1, AC.cbufnum, 1, 1, 3, 0}
\end{center}

We used the macro names as in the actual code. \C{AC.cbufnum} is a variable that
is the index of the compile buffer used for this parsed statement.
At the end of the data preparation phase the pointer \C{AT.WorkPointer} is set
beyond the data on the trailing zero, the pointer \C{AT.ProtoType}, which is
used soon in following functions is set to the word \C{SUBEXPRESSION}.

The expression will be put into the scratch buffer system. This system comprises
the small and large buffers and the scratch files. Where new data to the scratch
buffers will be stored is of no concern to a function like \C{DoExpr()}, it
simply uses several utility functions for that purpose. Still, we need to
initialize the variable \C{pos} here that will indicate the position of the
data, i.e. the expression, in the scratch file.

Next, the function \C{CompileAlgebra()} is called to parse the right hand side
and put the codified expression into the \FORM\ buffers. It basically calls two
functions: \C{tokenize} and \C{CompileSubExpressions}. \C{tokenize} is the
tokenizer that translates the input character string in a sanitized and partly
interpreted string of codes. It will look up the variables named in the input
string and put the index they have in the name administration into the tokenized
output. Our input string is transformed into the code string like this

\begin{verbatim}
   (     -13  LPARENTHESIS
   x      -1  TSYMBOL
           5
   +     -26  TPLUS
   y      -1  TSYMBOL
           6
   )     -14  RPARENTHESIS
   ^     -25  TPOWER
   2      -8  TNUMBER
           2
   -     -27  TMINUS
   (     -13  LPARENTHESIS
   x      -1  TSYMBOL
           5
   +     -26  TPLUS
   z      -1  TSYMBOL
           7
   )     -14  RPARENTHESIS
   ^     -25  TPOWER
   `N'    -8  TNUMBER
           3
         -29  TENDOFIT
\end{verbatim}

This code string then lies in the \C{AC.tokens} buffer where it is used by
subsequent functions.

The function \C{CompileSubExpression()} finds terms in an expression that might
be reused at another place and extracts them. As one can see in the code, the
function looks for terms in parentheses and works recursively. The end of such a
term is each time marked with \C{TENDOFIT}. Then, the function
\C{CodeGenerator()} called at the end of \C{CompileSubExpression()} does the
real work.

In our example \C{CodeGenerator()} first gets the data
\begin{center}
{\it LPARENTHESIS, TSYMBOL, 5, TPLUS, TSYMBOL, 6, TENDOFIT}
\end{center}
as a parameter, which is the term $x+y$. It builds up the actual term encoding
in the workspace and first reserves for that enough space there. One can see the
pointer arithmetic using constants like \C{AM.MaxTal}, which is the maximum
number of words a number can occupy. It reserves space for the coefficient, an
integer, and the actual term. Once a token is recognized, the equivalent term
data is written to the workspace and the function \C{CompleteTerm} is called.
This function completes the data to
\begin{center}
{\it 8, 1, 4, 5, 1, 1, 1, 3, 0}. 
\end{center}
The first word is the total length, i.e. 8 words. This is the length of the
whole expression. The second word is the type of the term, which is a symbol. It
is the value \C{SYMBOL} as defined in \C{ftypes.h}. This macro definition
\C{SYMBOL} has the value 1 (in the \FORM\ version at this time this reference is
written). Following the type signifying word is the length of the term, which is
4. Several such terms could follow each other, but we only have one term at the
moment. Finally, we have the trailing words for the coefficient being 1 and a
terminating zero. The meaning and interpretation of the words in the data of a
single term after the type word and the length word are dependent on the type.
For symbols, we have pairs of word, where the first word is the index of the
symbol in the name administration and the second word is the exponent. Here we
have symbol 5 ($ = x$) with an exponent 1.  After \C{CompleteTerm()} has
constructed the whole expression it copies the data to the compile buffers with
the help of the function \C{AddNtoC()}.

The compile buffers contain the instruction for the execution engine, the
\C{Processor()}, that will start when the \C{.sort} command is parsed. Our terms
are put into the right-hand-side buffers in the compile buffer. When the
\C{Processor()} will read these buffers one after the other, it will take the
terms and put them into the scratch buffer system. Then, they become the
expressions upon which further statements do act. The compile buffers are stored
in the list \C{AC.cbufList} and we get access to the elements via the cast
\C{((CBUF *)(AC.cbufList.lijst))}. This cast is defined as a preprocessor macro
called \C{cbuf}. The element \C{cbuf[0]->numrhs} (0 is the current compile
buffer we are using) gives the number of entries in \C{cbuf[0]->rhs}, which is
an array of pointer into \C{cbuf[0]->Buffer}. We have 3 elements:

\begin{verbatim}
  cbuf[0]->rhs[1]   -->
     8, 1, 4, 5, 1, 1, 1, 3, 8, 1, 4, 6, 1, 1, 1, 3, 0
  cbuf[0]->rhs[2]   -->
     8, 1, 4, 5, 1, 1, 1, 3, 8, 1, 4, 7, 1, 1, 1, 3, 0
  cbuf[0]->rhs[3]   -->
     9, 6, 5, 1, 2, 0, 1, 1, 3, 9, 6, 5, 2, 3, 0, 1, 1, -3, 0
\end{verbatim}

\C{cbuf[0]->rhs[0]} is not used and the data lies consecutively in \\
\C{cbuf[0]->Buffer}. The meaning of the first two entries has already been
explained. These are expressions containing $x+y$ and $x+z$, respectively.
The last expression uses subexpressions that have the type \C{SUBEXPRESSION}
$ = 6$. The length of a subexpression is 5 and the contents $1,2,0$ means
that expression 1 needs to be inserted with an exponent of 2. The zero is a
dirty flag that signals to the processor the state of the subexpression. Here in
the compile buffers it is simply cleared to zero. The contents $2,3,0$ of the
second subexpression should be obvious. Finally, we have an negative
coefficient for the second subexpression which accounts for the minus sign
between the parentheses in our original expression.

We return to the function \C{DoExpr()} where the prototype of the expression is
put into the scratch system via the call \C{PutOut()} and we are finished with
this line in the input file. The next line defining a second local expression
works the same.

We come to the parsing of the following statements:

\begin{verbatim}
 7      
 8      Brackets x;
 9      Print;
\end{verbatim}

The bracket statement is dealt with in function \C{DoBrackets()}. It sets the
flag \C{AR.BracketOn} to 1 and constructs the term that will stand outside the
bracket. This term is copied into the \C{AT.BrackBuf} buffer, where it can be
used by the execution engine when it needs to insert this heading term into an
expression.

The print statement is parsed in function \C{DoPrint()}. Since we don't have any
arguments to \C{Print} all active expressions shall be printed.
\C{DoPrint()} just loops through the \C{Expressions} list and sets the
\C{printflag} to 1 for each expression.

With the next statement in our input file

\begin{verbatim}
10      .sort
\end{verbatim}

we will get to know the other central parts of \FORM: the processor and the
sorting routines. The code in the \C{PreProcessor()} will call
\C{ExecModule()}
which calls \C{DoExecute()}. We can ignore a lot of code there that is only for
parallelized versions of \FORM. There are three important functions calls
happening. First, \C{RevertScratch()} is called. \FORM\ uses three scratch
buffers: input buffer, output buffer, and the hide buffer. The usual mode of
operation is to apply statements on expressions in the input buffer, sort and
normalize the result, and write it into the output buffer. This repeats for
every executing module and therefore an important optimization is made: the
input buffer and the output buffer simply change their roles.
\C{RevertScratch()} does this job.  The second and third important calls are to
\C{Processor()} and \C{WriteAll()}. 

\C{Processor()} is, as the name suggests, the main processor that executes
statements and deals with the results. A lot of initialization work is done
before we go into the large loop over the expressions that spans almost the
whole function. Our expressions have as regular expressions from the scratch
buffers the \C{inmem} flag set to zero, so we go into the else branch of the
checking if-clause. There we go to the case of a \C{LOCALEXPRESSION}. The main
logic here is to do a single call to \C{GetTerm()} to get the first term from
the input file and copy that to the output with the call to \C{PutOut()}. This
first term, which is a subexpression, serves as a header for the expression. It
follows a (while-)loop that calls \C{GetTerm()}, and if there are still terms,
the loop executes its body and calls \C{Generator()}. After this loop, some
clean-up and a final \C{EndSort()} is done, before the outer loop over the
expressions repeats. \C{Generator()} is the function where the read input, which
is {\it 9, 6, 5, 3, 1, 0, 1, 1, 3}, will be substituted and expanded.

\C{Generator()} gets the term in the workspace and first tries to do all
substitutions (\C{SUBEXPRESSION}), then applies the statements in the compile
buffers to the normalized terms, substitutes again if necessary, do brackets,
and finally sorts the result.

The call to \C{TestSub()} does the search for subexpressions. \C{TestSub()} will
find a subexpression in our case and return the number (3) of this subexpression
and set other global variables ready for the following steps. In \C{Generator()}
we enter therefore the if-clause checking \C{replac}$> 0$.  Depending on the
power of the subexpression different operations are taken. We have our
subexpression to the power one only, which is an easy case. The actual
substitution is performed by the function \C{InsertTerm()}. Since the new term
might again contain subexpressions we do a recursive call to \C{Generator()}.
Our expression contains several layers of subexpressions which are all dealt
with as described above. Only the powers of the other subexpressions are
different from one, so we get slightly more work to be done which involves the
expansion of the terms using binomials. 

Finally, the call to \C{TestSub()} at the beginning of \C{Generator()} will
return zero. The function \C{Normalize()} is called, which puts the terms in a
canonical form, i.e. terms are ordered and collected with the correct
coefficient. In our example, as the first fully subsituted term we have
{\it 12, 1, 4, 6, 1, 1, 4, 6, 1, 1, 1, 3} before the call to
\C{Normalize()}, which means we have a term $x*x$. \C{Normalize()}
makes this into {\it 8, 1, 4, 6, 2, 1, 1, 3}, which is $x^2$.

Then, we loop over the statements in the compile buffer. \C{level} is the
instruction counter. We have a long switch-clause that interprets the statement
type identifiers like \C{TYPECOUNT}. Statements with \C{TYPEEXPRESSION} are not
treated here. So we loop over all the compile buffer statements here and only
call \C{TestMatch()} at the loop's end. This function has no effect in our
example, because we have no pattern matching going on.

Then, the function \C{PutBracket()} is called to deal with brackets. Brackets
are implemented by putting the special code \C{HAAKJE} inside the expression.
The terms before the \C{HAAKJE} are outside the bracket, everything following it
will be inside the bracket. 

At the end of the loop over the terms in the expressions, the function
\C{StoreTerm()} is called. This function puts the result of the processing in
the output scratch buffers. Finally, we return to \C{Processor()}. There the
final sorting is started. Also, the printing of the expressions is done here.

The parsing in \C{PreProcessor()} continues with

\begin{verbatim}
11      
12      #do i=2,3
13      Id	x?^`i' = x;
14      #enddo
\end{verbatim}

Here we have a somewhat more complicated example of preprocessor instructions.
The do-loop is treated in \C{DoDo()} which sets up data structures (\C{DOLOOP})
to guide the preprocessor when it is parsing the loop body. The statement line
will then be presented to the compiler two times and with the correct values of
the preprocessor variable \C{i}. The compiler deals with this statement in
\C{CoId()} which is just calling \C{CoIdExpression()}.  \C{CoIdExpression()}
puts a \C{TYPEIDNEW} code into the lhs compile buffer. This tells the processor
later how to do the pattern matching. The rhs is the term \C{x} that will be
inserted.

The parsing continues and ends with

\begin{verbatim}
15      
16      Print +s;
17      .end
\end{verbatim}

The way these statements are treated and how the program is executed has already
been described. The pattern matching is something that has not occurred before,
though. We will not describe it here, since there is a dedicated section in this
manual for that. After the final sorting, \FORM\ will clean up tempory files and
other resources that are not automatically freed by the operating system before
\FORM\ ends itself.