File: instrsel.tex

package info (click to toggle)
mlton 20210117%2Bdfsg-3
links: PTS, VCS
area: main
in suites: sid
size: 58,464 kB
sloc: ansic: 27,682; sh: 4,455; asm: 3,569; lisp: 2,879; makefile: 2,347; perl: 1,169; python: 191; pascal: 68; javascript: 7
file content (244 lines) | stat: -rw-r--r-- 9,682 bytes
parent folder | download | duplicates (5)
\section{Instruction Selection} \label{sec:instrsel}
Instruction selection modules are reponsible for translating 
\href{mltree.html}{MLTree} statements into a flowgraph consisting
of target machine instructions.  MLRISC decomposes this complex task 
into \emph{three} components:
\begin{description}
   \item[Instruction selection modules] which are responsible for
mapping a sequence of MLTree statements into a sequence target machine code,
   \item[Flowgraph builders]  which are responsible for constructing
the graph representation of the program from a sequence of target machine
instructions, and
   \item[Client Extender] which are responsible for compiling 
MLTree extensions (see also Section~\ref{sec:mltree-extension}).
\end{description}
By detaching these components, extra flexiblity is obtained.  For example,
the MLRISC system uses two different internal representations.  The
first, \href{cluster.html}{cluster}, is a \emph{light-weight} representation
which is suitable for simple compilers without extensive 
optimizations; the second, \href{mlrisc-ir.html}{MLRISC IR}, is a 
\emph{heavy duty} representation which allows very complex transformations
to be performed.  Since the flowgraph builders are detached from the
instruction selection modules, the same instruction selection modules
can be used for both representations.  

For consistency, the three components communicate to each other 
via the same \href{stream.html}{stream} interface.

\subsection{Interface Definition}
All instruction selection modules satisfy the following signature:

\begin{SML}
signature \mlrischref{mltree/mltreecomp.sig}{MLTREECOMP} = 
sig
   structure T : \href{mltree.html}{MLTREE}
   structure I : \href{instructions.html}{INSTRUCTIONS}
   structure C : \href{cells.html}{CELLS}
      sharing T.LabelExp = I.\href{labelexp.html}{LabelExp}
      sharing I.C = C

   type instrStream = (I.instruction,C.regmap,C.cellset) T.stream
   type mltreeStream = (T.stm,C.regmap,T.mlrisc list) T.stream

   val selectInstructions : instrStream -> mltreeStream
end
\end{SML}
Intuitively, this signature states that
the instruction selection module 
returns a function that can transform a stream of MLTree statements 
(\newtype{mltreeStream}) into a stream of instructions of the target 
machine (\newtype{instrStream}).  

\subsubsection{Compiling Client Extensions}

Compilation of client extensions to MLTREE is controlled by the
following signature: 
\begin{SML}
signature \mlrischref{mltree/mltreecomp.sig}{MLTREE_EXTENSION_COMP} =
sig
   structure T : \href{mltree.html}{MLTREE}
   structure I : \href{instructions.html}{INSTRUCTIONS}
   structure C : \href{cells.html}{CELLS}
      sharing T.LabelExp = I.\href{labelexp.html}{LabelExp}
      sharing I.C = C

   type reducer = 
     (I.instruction,C.regmap,C.cellset,I.operand,I.addressing_mode) T.reducer

   val compileSext : reducer -> \{stm:T.sext, an:T.an list\} -> unit
   val compileRext : reducer -> \{e:T.ty * T.rext, rd:C.cell, an:T.an list\} -> unit
   val compileFext : reducer -> \{e:T.ty * T.fext, fd:C.cell, an:T.an list\} -> unit
   val compileCCext : reducer -> \{e:T.ty * T.ccext, ccd:C.cell, an:T.an list\} -> unit
end
\end{SML}

Methods \verb|compileSext|, \verb|compileRext|, etc.~are callbacks that
are responsible for compiling MLTREE extensions.  The arguments
to these callbacks have the following meaning:
\begin{description}
  \item[reducer] The first argument is always the \newtype{reducer}, 
which contains internal methods for translating MLTree constructs
into machine code.  These methods are supplied by the instruction
selection modules.
  \item[an] This is a list of annotations that should be attached to the
generated code.
  \item[ty, fty] These are the types of the extension construct.
  \item[stm, e] This are the extension statement and expression.
  \item[rd, fd, cd] These are the target registers of the 
expression extension, i.e.~the callback should generate the appropriate
code for the expression and writes the result to this target.
\end{description}

For example, when an instruction selection encounters a
\verb|FOR(|$i,a,b,s$\verb|)| statement extension
defined in Section~\ref{sec:mltree-extension}, the callback
\begin{SML}   
  compileStm reducer \{ stm=FOR(\(i,a,b,s\)), an=an \}
\end{SML}
\noindent is be involved. 

The \newtype{reducer} type is defined
in the signature \mlrischref{mltree/mltree.sig}{MLTREE} and has the
following type:
\begin{SML}
  datatype reducer =
    REDUCER of
    \{ reduceRexp    : rexp -> reg,
      reduceFexp    : fexp -> reg,
      reduceCCexp   : ccexp -> reg,
      reduceStm     : stm * an list -> unit,
      operand       : rexp -> I.operand,
      reduceOperand : I.operand -> reg,
      addressOf     : rexp -> I.addressing_mode,
      emit          : I.instruction * an list -> unit,
      instrStream   : (I.instruction,C.regmap,C.cellset) stream,
      mltreeStream  : (stm,C.regmap,mlrisc list) stream
    \}
\end{SML}

The components of the reducer are
\begin{description}
  \item[reduceRexp, reduceFexp, reduceCCexp] These functions 
take an expression of type integer, floating point and condition code, 
translate them into machine code and return the 
register that holds the result. 
  \item[reduceStm] This function takes an MLTree statement and translates
it into machine code.  it also takes a second argument, which is the
list of annotations that should be attached to the statement.
  \item[operand] This function translates an \sml{rexp} into an
 \sml{operand} of the machine architecture.
  \item[reduceOperand] This function takes an operand of the machine
architecture and reduces it into an integer register.
  \item[addressOf] This function takes an \sml{rexp}, treats
it as an address expression and translates it into the appropriate
\sml{addresssing_mode} of the target architecture.
  \item[emit]  This function emits an instruction with attached annotations
to the instruction stream
  \item[instrStream, mltreeStream]  These are the instruction stream
and mltree streams that are currently bound to the extender.
\end{description}

\subsubsection{Extension Example}
Here is an example of how the extender mechanism can be used,
using the \sml{DSP_MLTREE} extensions defined in
Section~\ref{sec:mltree-extension}.   We need supply two
new functions, \verb|compileDSPStm| for compiling the \verb|FOR|
construct, and \verb|compileDSPRexp| for compiling the \verb|SUM|,
and saturated arithmetic instructions.

The first function, \sml{compileDSPStm}, is generic and simply
translates the \verb|FOR| loop into the appropriate branches.
Basically, we will translate \verb|FOR(|$i,start,stop,body$\verb|)| into
the following loop in pseudo code:
\begin{SML}
        limit = \(stop\)
        \(i\)  = \(start\)
        goto test
  loop: \(body\)
        \(i\) = \(i\) + 1
  test: if \(i\) <= limit goto loop
\end{SML}
This transformation can be implemented as follows:
\begin{SML}
 functor DSPMLTreeExtensionComp
    (structure I : DSP_INSTRUCTION_SET
     structure T : DSP_MLTREE
       sharing I.LabelExp = T.LabelExp
    ) =
 struct
   structure I = I
   structure T = T
   structure C = I.C

   type reducer = 
     (I.instruction,C.regmap,C.cellset,I.operand,I.addressing_mode) T.reducer
   
   fun mark(s, []) = s
     | mark(s, a::an) = mark(ANNOTATION(s, a), an)
   fun compileSext (REDUCER\{reduceStm, ...\}) 
      \{stm=FOR(i, start, stop, body), an\} =
   let val limit = C.newReg()
       val loop  = Label.newLabel ""
       val test  = Label.newLabel ""
   in  reduceStm(
         SEQ[MV(32, i, start),
             MV(32, limit, stop),
             JMP([], [LABEL(LabelExp.LABEL test)], []),
             LABEL loop,
             body,
             MV(32, i, ADD(32, REG(32, i), LI 1),
             LABEL test,
             mark(BCC([], 
                    CMP(32, LE, REG(32, i), REG(32, limit)), 
                      loop),
                  an),
            ]
      )
   end

   ...
\end{SML}
In this transformation, we have chosen to proprogate the annotation
\verb|an| into the branch constructor.

Assuming the target architecture that we are translated into contains
saturated arithmetic instructions \verb|SADD|, \verb|SSUB|, \verb|SMUL|
and \verb|SDIV|, the DSP extensions
\verb|SUM| and saturated arithmetic expressions can be handled as follows.
\begin{SML}
   fun compileRext (REDUCER\{reduceStm, reduceRexp, emit, ...\}) 
       \{ty, e, rd, an\} =
   (case (ty,e) of
      (_,T.SUM(i, a, b, exp)) =>
        reduceStm(SEQ[MV(ty, rd, LI 0),
                      FOR(i, a, b, 
                         mark(MV(ty, rd, ADD(ty, REG(ty, rd), exp)), an))
                     ]
                 )
   | (32,T.SADD(x,y)) => emit(I.SADD\{r1=reduceRexp x,r2=reduceRexp y,rd=rd\},an)
   | (32,T.SSUB(x,y)) => emit(I.SSUB\{r1=reduceRexp x,r2=reduceRexp y,rd=rd\},an)
   | (32,T.SMUL(x,y)) => emit(I.SMUL\{r1=reduceRexp x,r2=reduceRexp y,rd=rd\},an)
   | (32,T.SDIV(x,y)) => emit(I.SDIV\{r1=reduceRexp x,r2=reduceRexp y,rd=rd\},an)
   | ...
   )

   fun compileFext _ _ = ()
   fun compileCCext _ _ = ()

  end
\end{SML}

Note that in this example, we have simply chosen to reduce
a \verb|SUM| expression into the high level \verb|FOR| construct.
Clearly, other translation schemes are possible.

\subsection{Instruction Selection Modules}
Here are the actual code for the various back ends:
\begin{enumerate}
  \item \mlrischref{sparc/mltree/sparc.sml}{Sparc}
  \item \mlrischref{hppa/mltree/hppa.sml}{PA-RISC}
  \item \mlrischref{alpha/mltree/alpha.sml}{Alpha}
  \item \mlrischref{ppc/mltree/ppc.sml}{Power PC}
  \item \mlrischref{x86/mltree/x86.sml}{X86}
  \item C6xx 
\end{enumerate}