1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168
|
# On adding features to languages
## First of all
Adding a feature has a very high development cost, mainly due to the fact that this new feature induces a break in the versions: there will be the next code that will not be compilable by the previous language (which can be even worse if your feature doesn't ensure backward compatibility: the break occurs in both directions). In this respect, it may be useful to write a compatibility script. Also, the first question to ask yourself when adding a feature is:
- is the feature you are about to add really necessary? In other words: Is it useful only in a negligible number of cases? Is it understandable only by the few gurus who wrote the language? Can it simply be emulated directly in the language (or in the compiler)? Will it be difficult to maintain?
- isn't the code base too unstable to afford a new potential source of bugs? If you hesitated to answer no to any of these questions, you should close this file and reconsider adding a feature.
With that said, let's get down to business.
## Practical example
The problem considered here is the addition of a signal interval annotation system in the Faust language, allowing to indicate and test compiler-calculated major/minorities. This language (like all DSLs) is certainly not a very good example for the general case, however, some techniques should be quite similar.
We propose to add two pairs of primitives to Faust:
* `highest(s)` and `lowest(s)`, single-input signals returning respectively the major and minor of the signal `s` computed by the compiler. From the point of view of the Faust type lattice, they are constants, computable at compile time (of course), floating point (so not boolean), parallelizable (I think), and this, whatever the input signal.
* `assertbounds(lo, hi, sig)`, with `lo` and `hi` two constants known at compile time, which will have two behaviors: in normal mode, it creates a signal whose value is that of `sig` but whose interval is `[lo, hi]`, in debug mode, it checks during execution that this interval is indeed checked.
N.B. Two other primitives concerning the resolution of signals should probably be added.
## Lexer/Parser
The first step is to extend the set of valid Faust programs so that they contain (preferably exactly) the strings representing these primitives, by modifying the syntax (lexing) and semantics (parsing) of Faust. Indeed, if we ask Faust to compile the following program:
process = assertbounds(-1, 1);
we get the following error:
1 : ERROR : undefined symbol : assertbounds
To do this, we have to modify the files describing the lexer and the parser, here written in Flex/Bison (for more details, see the Dragon Book)
In Faust, these files are located in `compiler/parser`.
### Lexing
In Bison, the declaration of tokens is done in the parser, here `faustparser.y`, so let's declare 4 new tokens:
%token ASSERTBOUNDS
%token LOWEST
%token HIGHEST
Then we need to associate these tokens with Faust strings, so let's modify the lexer. The file containing the lexer is in `faustlexer.l`. Just add (between the two `%%`) :
"assertbounds" return ASSERTBOUNDS;
"lowest" return LOWEST;
"highest" return HIGHEST;
Let's recompile the parser and the compiler (`make parser` and `make` at the root). Compiling the example :
process = assertbounds(-1, 1);
we now get:
1 : ERROR : syntax error
### Parsing
Since we are going to add a primitive, it will logically come from the non-terminal `primitive`. Since we are dealing with primitives, our tokens will be terminal.
Warning: we are talking about the implementation of a new primitive, not an `xtended` one.
In most compilers, for a primitive of arity `n`, it is usual to ask the parser to build an object symbolizing the primitive by calling the `C++` constructor of the primitive with the result of parsing the arguments of the primitive. However, since the Faust language describes a block algebra, the arguments are not always explicitly passed to the primitive, they can be routed. Hence a small _wrapper_ around the constructor of the primitive, depending on the airty of the primitive.
As they have an arity of 1 and 3, two new boxes must be defined:
| ASSERTBOUNDS { $$ = boxPrim3(sigAssertBounds);}
| LOWEST { $$ = boxPrim1(sigLowest);}
| HIGHEST { $$ = boxPrim1(sigHighest);}
the parsing is finished.
## Signal constructors
Faust constructors are extremely classical from a computational point of view: each signal is a tree with in its root a symbol unique to the operation of the signal (e.g. an ADD symbol for an addition) and as children its arguments (boilerplate incoming code).
They are defined in `signals.cpp` with their destructors (in the functional programming sense of the term, not memory management).
But first we have to define the symbols of the signals. For optimization reasons, they are defined once and for all in `global.hh` and `global.cpp` (a single mutable object called `gGlobal` which emulates the global variables to the whole code):
`global.hh`
```c++
Sym SIGASSERTBOUNDS;
Sym SIGHIGHEST;
Sym SIGLOWEST;
```
`global.cpp`
```c++
SIGASSERTBOUNDS = symbol("sigAssertBounds");
SIGHIGHEST = symbol("sigHighest");
SIGLOWEST = symbol("sigLowest");
```
We can now define the constructor of the signal `assertbounds` as well as its destructor :
`signals.hh`
```c++
Tree sigAssertBounds(Tree s1, Tree s2, Tree s3);
Tree sigLowest(Tree s);
Tree sigHighest(Tree s);
bool isSigAssertBounds(Tree t, Tree& s1, Tree& s2, Tree& s3);
bool isSigLowest(Tree t, Tree& s);
bool isSigHighest(Tree t, Tree& s);
```
`signals.cpp`
```c++
Tree sigAssertBounds(Tree s1, Tree s2, Tree s3)
{
return tree(gGlobal->SIGASSERTBOUNDS, s1, s2, s3);
}
bool isSigAssertBounds(Tree t, Tree& s1, Tree& s2, Tree& s3)
{
return isTree(t, gGlobal->SIGASSERTBOUNDS, s1, s2, s3);
}
```
And now it compiles, unfortunately. Indeed, if the compiler is now able to create an object for the primitives, it still has no idea how to compile it. And indeed, if we compile the example:
`process = assertbounds;`
We get:
ERROR: getSubSignals unrecognized signal: sigAssertBounds[-1,1,SigInput[0]]
## Compilation
Now that the signals are built, we can start compiling. Let's modify the `getSubSignals` function. This function is in the `subsignals.cpp` file (we can for example use the auto-generated documentation with Doxygen `make doc` at the root, or the definition search features in modern editors like Emacs), it just extracts the signals from the subtrees of the signal tree (let's not forget that a signal is stored as a tree). Our boxes all have two signals, so we can use the `sigPrefix` case as an example.
After compiling and executing the code, we get the new and more interesting error:
ERROR inferring signal type: unrecognized signal
So we have to modify the type inference system present in the `signals/sigtyperules.cpp` file. The formal definition of Faust types can be found as a comment in the header of the `sigtype.hh` file.
Let's start with `assertbounds`, the principle of this function being to add bounds to a signal, just use the `promoteInterval` method
other files to change for the backend -ocpp
- `signals/sigToGraph.cpp`, for signal graphs
- `signals/sigIdentity.cpp`, for signal graphs
- `boxes/ppbox.cpp`, for diagrams
- `generator/compile_scal.cpp`, this file is the one that contains the actual compilation
- `sigprint.cpp`, if you need a special drawing
Think about tests against the:
- generated code
- intervals
What happens with non-decimal constants in the F2D version?
What happens if the interval gives an exact version in double but there is an overflow in float (e.g. annoying delays)?
|