1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244
|
\section{Instruction Selection} \label{sec:instrsel}
Instruction selection modules are reponsible for translating
\href{mltree.html}{MLTree} statements into a flowgraph consisting
of target machine instructions. MLRISC decomposes this complex task
into \emph{three} components:
\begin{description}
\item[Instruction selection modules] which are responsible for
mapping a sequence of MLTree statements into a sequence target machine code,
\item[Flowgraph builders] which are responsible for constructing
the graph representation of the program from a sequence of target machine
instructions, and
\item[Client Extender] which are responsible for compiling
MLTree extensions (see also Section~\ref{sec:mltree-extension}).
\end{description}
By detaching these components, extra flexiblity is obtained. For example,
the MLRISC system uses two different internal representations. The
first, \href{cluster.html}{cluster}, is a \emph{light-weight} representation
which is suitable for simple compilers without extensive
optimizations; the second, \href{mlrisc-ir.html}{MLRISC IR}, is a
\emph{heavy duty} representation which allows very complex transformations
to be performed. Since the flowgraph builders are detached from the
instruction selection modules, the same instruction selection modules
can be used for both representations.
For consistency, the three components communicate to each other
via the same \href{stream.html}{stream} interface.
\subsection{Interface Definition}
All instruction selection modules satisfy the following signature:
\begin{SML}
signature \mlrischref{mltree/mltreecomp.sig}{MLTREECOMP} =
sig
structure T : \href{mltree.html}{MLTREE}
structure I : \href{instructions.html}{INSTRUCTIONS}
structure C : \href{cells.html}{CELLS}
sharing T.LabelExp = I.\href{labelexp.html}{LabelExp}
sharing I.C = C
type instrStream = (I.instruction,C.regmap,C.cellset) T.stream
type mltreeStream = (T.stm,C.regmap,T.mlrisc list) T.stream
val selectInstructions : instrStream -> mltreeStream
end
\end{SML}
Intuitively, this signature states that
the instruction selection module
returns a function that can transform a stream of MLTree statements
(\newtype{mltreeStream}) into a stream of instructions of the target
machine (\newtype{instrStream}).
\subsubsection{Compiling Client Extensions}
Compilation of client extensions to MLTREE is controlled by the
following signature:
\begin{SML}
signature \mlrischref{mltree/mltreecomp.sig}{MLTREE_EXTENSION_COMP} =
sig
structure T : \href{mltree.html}{MLTREE}
structure I : \href{instructions.html}{INSTRUCTIONS}
structure C : \href{cells.html}{CELLS}
sharing T.LabelExp = I.\href{labelexp.html}{LabelExp}
sharing I.C = C
type reducer =
(I.instruction,C.regmap,C.cellset,I.operand,I.addressing_mode) T.reducer
val compileSext : reducer -> \{stm:T.sext, an:T.an list\} -> unit
val compileRext : reducer -> \{e:T.ty * T.rext, rd:C.cell, an:T.an list\} -> unit
val compileFext : reducer -> \{e:T.ty * T.fext, fd:C.cell, an:T.an list\} -> unit
val compileCCext : reducer -> \{e:T.ty * T.ccext, ccd:C.cell, an:T.an list\} -> unit
end
\end{SML}
Methods \verb|compileSext|, \verb|compileRext|, etc.~are callbacks that
are responsible for compiling MLTREE extensions. The arguments
to these callbacks have the following meaning:
\begin{description}
\item[reducer] The first argument is always the \newtype{reducer},
which contains internal methods for translating MLTree constructs
into machine code. These methods are supplied by the instruction
selection modules.
\item[an] This is a list of annotations that should be attached to the
generated code.
\item[ty, fty] These are the types of the extension construct.
\item[stm, e] This are the extension statement and expression.
\item[rd, fd, cd] These are the target registers of the
expression extension, i.e.~the callback should generate the appropriate
code for the expression and writes the result to this target.
\end{description}
For example, when an instruction selection encounters a
\verb|FOR(|$i,a,b,s$\verb|)| statement extension
defined in Section~\ref{sec:mltree-extension}, the callback
\begin{SML}
compileStm reducer \{ stm=FOR(\(i,a,b,s\)), an=an \}
\end{SML}
\noindent is be involved.
The \newtype{reducer} type is defined
in the signature \mlrischref{mltree/mltree.sig}{MLTREE} and has the
following type:
\begin{SML}
datatype reducer =
REDUCER of
\{ reduceRexp : rexp -> reg,
reduceFexp : fexp -> reg,
reduceCCexp : ccexp -> reg,
reduceStm : stm * an list -> unit,
operand : rexp -> I.operand,
reduceOperand : I.operand -> reg,
addressOf : rexp -> I.addressing_mode,
emit : I.instruction * an list -> unit,
instrStream : (I.instruction,C.regmap,C.cellset) stream,
mltreeStream : (stm,C.regmap,mlrisc list) stream
\}
\end{SML}
The components of the reducer are
\begin{description}
\item[reduceRexp, reduceFexp, reduceCCexp] These functions
take an expression of type integer, floating point and condition code,
translate them into machine code and return the
register that holds the result.
\item[reduceStm] This function takes an MLTree statement and translates
it into machine code. it also takes a second argument, which is the
list of annotations that should be attached to the statement.
\item[operand] This function translates an \sml{rexp} into an
\sml{operand} of the machine architecture.
\item[reduceOperand] This function takes an operand of the machine
architecture and reduces it into an integer register.
\item[addressOf] This function takes an \sml{rexp}, treats
it as an address expression and translates it into the appropriate
\sml{addresssing_mode} of the target architecture.
\item[emit] This function emits an instruction with attached annotations
to the instruction stream
\item[instrStream, mltreeStream] These are the instruction stream
and mltree streams that are currently bound to the extender.
\end{description}
\subsubsection{Extension Example}
Here is an example of how the extender mechanism can be used,
using the \sml{DSP_MLTREE} extensions defined in
Section~\ref{sec:mltree-extension}. We need supply two
new functions, \verb|compileDSPStm| for compiling the \verb|FOR|
construct, and \verb|compileDSPRexp| for compiling the \verb|SUM|,
and saturated arithmetic instructions.
The first function, \sml{compileDSPStm}, is generic and simply
translates the \verb|FOR| loop into the appropriate branches.
Basically, we will translate \verb|FOR(|$i,start,stop,body$\verb|)| into
the following loop in pseudo code:
\begin{SML}
limit = \(stop\)
\(i\) = \(start\)
goto test
loop: \(body\)
\(i\) = \(i\) + 1
test: if \(i\) <= limit goto loop
\end{SML}
This transformation can be implemented as follows:
\begin{SML}
functor DSPMLTreeExtensionComp
(structure I : DSP_INSTRUCTION_SET
structure T : DSP_MLTREE
sharing I.LabelExp = T.LabelExp
) =
struct
structure I = I
structure T = T
structure C = I.C
type reducer =
(I.instruction,C.regmap,C.cellset,I.operand,I.addressing_mode) T.reducer
fun mark(s, []) = s
| mark(s, a::an) = mark(ANNOTATION(s, a), an)
fun compileSext (REDUCER\{reduceStm, ...\})
\{stm=FOR(i, start, stop, body), an\} =
let val limit = C.newReg()
val loop = Label.newLabel ""
val test = Label.newLabel ""
in reduceStm(
SEQ[MV(32, i, start),
MV(32, limit, stop),
JMP([], [LABEL(LabelExp.LABEL test)], []),
LABEL loop,
body,
MV(32, i, ADD(32, REG(32, i), LI 1),
LABEL test,
mark(BCC([],
CMP(32, LE, REG(32, i), REG(32, limit)),
loop),
an),
]
)
end
...
\end{SML}
In this transformation, we have chosen to proprogate the annotation
\verb|an| into the branch constructor.
Assuming the target architecture that we are translated into contains
saturated arithmetic instructions \verb|SADD|, \verb|SSUB|, \verb|SMUL|
and \verb|SDIV|, the DSP extensions
\verb|SUM| and saturated arithmetic expressions can be handled as follows.
\begin{SML}
fun compileRext (REDUCER\{reduceStm, reduceRexp, emit, ...\})
\{ty, e, rd, an\} =
(case (ty,e) of
(_,T.SUM(i, a, b, exp)) =>
reduceStm(SEQ[MV(ty, rd, LI 0),
FOR(i, a, b,
mark(MV(ty, rd, ADD(ty, REG(ty, rd), exp)), an))
]
)
| (32,T.SADD(x,y)) => emit(I.SADD\{r1=reduceRexp x,r2=reduceRexp y,rd=rd\},an)
| (32,T.SSUB(x,y)) => emit(I.SSUB\{r1=reduceRexp x,r2=reduceRexp y,rd=rd\},an)
| (32,T.SMUL(x,y)) => emit(I.SMUL\{r1=reduceRexp x,r2=reduceRexp y,rd=rd\},an)
| (32,T.SDIV(x,y)) => emit(I.SDIV\{r1=reduceRexp x,r2=reduceRexp y,rd=rd\},an)
| ...
)
fun compileFext _ _ = ()
fun compileCCext _ _ = ()
end
\end{SML}
Note that in this example, we have simply chosen to reduce
a \verb|SUM| expression into the high level \verb|FOR| construct.
Clearly, other translation schemes are possible.
\subsection{Instruction Selection Modules}
Here are the actual code for the various back ends:
\begin{enumerate}
\item \mlrischref{sparc/mltree/sparc.sml}{Sparc}
\item \mlrischref{hppa/mltree/hppa.sml}{PA-RISC}
\item \mlrischref{alpha/mltree/alpha.sml}{Alpha}
\item \mlrischref{ppc/mltree/ppc.sml}{Power PC}
\item \mlrischref{x86/mltree/x86.sml}{X86}
\item C6xx
\end{enumerate}
|