1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151
|
#+TITLE: How to add a missing expression operator
Sometimes you may wish to create an expression JIT template using an
operator that doesn't yet exist. Expression operators are meant to
represent CPU-level constructs, but the current list is not complete,
and will likely not be in the near future. So sometimes it is
necessary to add a new expression operator. This guide will help you
through the process.
* Add an expression operator defintion
The file =src/jit/expr_ops.h= contains a macro-list of all defined operators.
Each operator is defined as such:
#+BEGIN_EXAMPLE
_(ADD, 2, 0), \
#+END_EXAMPLE
+ First item is the name, which must be *uppercase*. Choose something
that makes sense.
+ Second item is the number of *children* (or operands or
arguemnts). So e.g. =ADD= has two input arguments, =NOT= has only
one. Operands are other tree nodes.
+ Third item is the number of *parameters* (or options). Parameters
are constants to use for the operator. E.g. for the =CONST= operator
the value of the constant is a parameter.
Mind the comma and the backslash to escape the newline.
* Add to operator type tables
The template precompiler uses type tables to check the consistency of
templates while compiling. You may need to edit these tables when
adding a new operator.
- =%OPERATOR_TYPES= contains the result type of the operator. E.g. an
operator like =store= does not yield any value, and its
=%OPERAND_TYPE= is =void=.
- =%OPERAND_TYPES= maps operators to their expected operand types as
comma-separated values. If the number of types does not match up
with the number of operands, the first type is repeated to match,
and the last type is treated as an exception. E.g. =do= maps to
=void,reg= - meaning that it accepts any number of =void= operands
and a final =reg= operand.
- =%OP_SIZE_ARG= records the argument offset (if any) that represents
the 'size' applicable to the operator, allowing the template
compiler to check if that argument makes any sense. (As defined by -
a parameter that ends with =_sz=, or a macro expression).
The default type is =reg= (an integer or pointer value that can be
stored in a *register*) and it is not necessary to specify this
explicitly.
* Add to autocasting in the analyzer
Templates are generally written without concern for the exact sizes
and offsets at runtime. (Those depend on the C compiler). As a
result, an operator may get differently-sized operands. The expression
JIT resolves such size conflicts by insertings size casts
automatically. Operand casting can be in three forms, and you'll need
to decide which it is:
- =NO_CAST= - don't insert casts
- =UNSIGNED= - extend with zeros
- =SIGNED= - extend with sign bit (two's complement)
Typically, =SIGNED= is only suitable for arithmetic operators,
=UNSIGNED= is more suitable for binary operators like =AND= and =OR=,
and anything else should probably use =NO_CAST=. There's a long switch
statement in =expr.c= (=analyze_node=) that sets this up, and thats
where you add your edits.
* Add tiles to implement the new operator
At the moment we only support x86-64 - if more architectures are ever
added, this applies to those architectures as well.
A tile maps a subtree of operators to machine code. (The smallest
subtree is obviously just one node). Mapping the tree to a (hopefully
optimal) set of tiles is the job of the tiler. We need to add both a tile
template and tile function to allow be able to compile our new operator.
** Tile Functions
Tile functions are defined in =src/jit/x64/tiles.dasc=. (This file is
included by DynASM from =src/jit/x64/emit.dasc=, because that's
necessary to create a single list of 'actions' for DynASM to operate
on). A tile function looks like:
#+BEGIN_SRC c
MVM_JIT_TILE_DECL(add_reg) {
MVMint8 reg[2];
ensure_two_operand_pre(tc, compiler, tile, reg);
| add Rq(reg[0]), Rq(reg[1]);
ensure_two_operand_post(tc, compiler, tile, reg);
}
#+END_SRC
The =MVM_JIT_TILE_DECL= macro expands to a tile function declaration
(which is assigned to a tile *object* in the tile *list*, which is
what we use to represent the compiled code just prior to emitting it
to the DynASM buffer).
For x64, common tile functions are:
+ direct, e.g. =shl reg, reg=
+ constant: =shl reg, constant=
+ from memory: =shl reg, [memory address]=
But implementing at least the direct function is enough, for the
moment.
The =ensure_to_operand_pre= and =ensure_two_operand_post= are
necessary because x86 is a two-operand language that destructively
updates one operand, like =a <- operator(a, b)=. Whereas the
expression language is a result-arguments language that doesn't update
values, like =c <- operator(a,b)=. It is my intention to make a
higher-level component (the register allocator or a
post-register-allocator phase) take care of this, but I haven't
figured out how to yet.
** Tile templates
A tile template (or tile pattern) is written much like an expression
template. Tile templates are architecture specific. They are declared
(for x64) in =src/jit/x64/tile_pattern.tile=. Tiles map to subtrees
and look like these:
#+BEGIN_EXAMPLE
(tile: add_reg (add reg reg) reg 2)
(tile: add_const (add reg (const $val $size)) reg 3)
(tile: add_load_addr (add reg (load (addr reg $ofs) $sz)) reg 6)
(tile: add_load_idx (add reg (load (idx reg reg $scale) $size)) reg 6)
#+END_EXAMPLE
Each tile pattern reduces a subtree (second item) to a symbol (third
item). The subtree pattern consists of operators and symbols, and each
symbol is a placeholder for another tile that may be placed there.
First item (after tile:) is the tile name and must match with the tile
function declared in =tiles.dasc= The last item is (an estimate) of
the cost. Don't worry to much about the cost - that too needs a
rework.
* Write your expression template
That's it! If you've added all these things, you're good to go - your
new operator can be used in an expression template.
|