<HTML>
<HEAD>
<!-- This HTML file has been created by texi2html 1.29
from ../tnf/tp.tnf on 12 February 2003 -->
<TITLE>Tree Parsing - The Tree To Be Parsed</TITLE>
</HEAD>
<BODY TEXT="#000000" BGCOLOR="#FFFFFF" LINK="#0000EE" VLINK="#551A8B" ALINK="#FF0000" BACKGROUND="gifs/bg.gif">
<TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0 VALIGN=BOTTOM>
<TR VALIGN=BOTTOM>
<TD WIDTH="160" VALIGN=BOTTOM><IMG SRC="gifs/elilogo.gif" BORDER=0> </TD>
<TD WIDTH="25" VALIGN=BOTTOM><img src="gifs/empty.gif" WIDTH=25 HEIGHT=25></TD>
<TD ALIGN=LEFT WIDTH="600" VALIGN=BOTTOM><IMG SRC="gifs/title.gif"></TD>
</TR>
</TABLE>
<HR size=1 noshade width=785 align=left>
<TABLE BORDER=0 CELLSPACING=2 CELLPADDING=0>
<TR>
<TD VALIGN=TOP WIDTH="160">
<h4>General Information</h4>
<table BORDER=0 CELLSPACING=0 CELLPADDING=0>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="index.html">Eli: Translator Construction Made Easy</a></td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="gindex_toc.html">Global Index</a></td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="faq_toc.html" >Frequently Asked Questions</a> </td></tr>
</table>
<h4>Tutorials</h4>
<table BORDER=0 CELLSPACING=0 CELLPADDING=0>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="EliRefCard_toc.html">Quick Reference Card</a></td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="novice_toc.html">Guide For new Eli Users</a></td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="news_toc.html">Release Notes of Eli</a></td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="nametutorial_toc.html">Tutorial on Name Analysis</a></td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="typetutorial_toc.html">Tutorial on Type Analysis</a></td></tr>
</table>
<h4>Reference Manuals</h4>
<table BORDER=0 CELLSPACING=0 CELLPADDING=0>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="ui_toc.html">User Interface</a></td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="pp_toc.html">Eli products and parameters</a></td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="lidoref_toc.html">LIDO Reference Manual</a></td></tr>
</table>
<h4>Libraries</h4>
<table BORDER=0 CELLSPACING=0 CELLPADDING=0>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="lib_toc.html">Eli library routines</a></td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="modlib_toc.html">Specification Module Library</a></td></tr>
</table>
<h4>Translation Tasks</h4>
<table BORDER=0 CELLSPACING=0 CELLPADDING=0>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="lex_toc.html">Lexical analysis specification</a></td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="syntax_toc.html">Syntactic Analysis Manual</a></td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="comptrees_toc.html">Computation in Trees</a></td></tr>
</table>
<h4>Tools</h4>
<table BORDER=0 CELLSPACING=0 CELLPADDING=0>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="lcl_toc.html">LIGA Control Language</a> </td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="show_toc.html">Debugging Information for LIDO</a> </td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="gorto_toc.html">Graphical ORder TOol</a> </td></tr>
</table>
<p>
<table BORDER=0 CELLSPACING=0 CELLPADDING=0>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="fw_toc.html">FunnelWeb User's Manual</a> </td></tr>
</table>
<p>
<table BORDER=0 CELLSPACING=0 CELLPADDING=0>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="ptg_toc.html">Pattern-based Text Generator</a> </td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="deftbl_toc.html">Property Definition Language</a> </td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="oil_toc.html">Operator Identification Language</a> </td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="tp_toc.html">Tree Grammar Specification Language</a> </td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="clp_toc.html">Command Line Processing</a> </td></tr>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="cola_toc.html">COLA Options Reference Manual</a> </td></tr>
</table>
<p>
<table BORDER=0 CELLSPACING=0 CELLPADDING=0>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="idem_toc.html">Generating Unparsing Code</a> </td></tr>
</table>
<p>
<table BORDER=0 CELLSPACING=0 CELLPADDING=0>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="mon_toc.html">Monitoring a Processor's Execution</a> </td></tr>
</table>
<h4>Administration</h4>
<table BORDER=0 CELLSPACING=0 CELLPADDING=0>
<tr valign=top><td><img src="gifs/gelbekugel.gif" WIDTH=7 HEIGHT=7 ALT=" o"> </td><td><a href="sysadmin_toc.html">System Administration Guide</a> </td></tr>
</table>
<HR WIDTH="100%">
<CENTER> <A HREF="mailto:elibugs@cs.colorado.edu"><IMG SRC="gifs/button_mail.gif" NOSAVE BORDER=0 HEIGHT=32 WIDTH=32></A><A HREF="mailto:elibugs@cs.colorado.edu">Questions, Comments, ....</A></CENTER>
</TD>
<TD VALIGN=TOP WIDTH="25"><img src="gifs/empty.gif" WIDTH=25 HEIGHT=25></TD>
<TD VALIGN=TOP WIDTH="600">
<H1>Tree Parsing</H1>
<P>
<IMG SRC="gifs/empty.gif" WIDTH=25 HEIGHT=25 ALT=""><A HREF="tp_2.html"><IMG SRC="gifs/next.gif" ALT="Next Chapter" BORDER="0"></A>
<IMG SRC="gifs/empty.gif" WIDTH=25 HEIGHT=25 ALT=""><A HREF="tp_toc.html"><IMG SRC="gifs/up.gif" ALT="Table of Contents" BORDER="0"></A>
<IMG SRC="gifs/empty.gif" WIDTH=25 HEIGHT=25 ALT="">
<HR size=1 noshade width=600 align=left>
<H1><A NAME="SEC1" HREF="tp_toc.html#SEC1">The Tree To Be Parsed</A></H1>
<P>
Problems amenable to solution by tree parsing involve hierarchical
relationships among entities.
Each entity is represented by a node in a tree, and the structure of the
tree represents the hierarchical relationship among the entities
represented by its nodes.
<P>
The relationships are such that nodes corresponding to entities of a
particular kind always have the same number of children.
No constraint is placed on the <EM>kinds</EM> of children a particular
kind of node can have; only the <EM>number</EM> of children is fixed.
This tree parser accepts only trees in which each node has no more
than two children.
<P>
An entity like an integer addition operator is completely characterized by
the kind of node representing it.
Integer constants, on the other hand, are not completely characterized by
the fact that they are represented by <CODE>IntDenotation</CODE> nodes.
Each <CODE>IntDenotation</CODE> node must therefore carry the constant's value
as an
<A NAME="IDX1"></A>
<DFN>attribute</DFN>.
This tree parser allows an arbitrary number of attributes of arbitrary type
to be attached to each node.
<P>
A user builds the tree describing the hierarchical relationships among the
entities of interest by invoking specific constructor functions.
The constructor used to build a particular node depends on the number of
children and the number and type of attributes required by that node.
<P>
This section begins by formalizing the structure of a tree to be parsed.
It then characterizes the attributes, and finally explains the naming
conventions for the constructors.
<P>
<H2><A NAME="SEC2" HREF="tp_toc.html#SEC2">Tree Structure</A></H2>
<P>
The tree structure is defined in terms of a set of symbols that constitute a
<A NAME="IDX2"></A>
<DFN>ranked alphabet</DFN>:
Each symbol has an associated
<A NAME="IDX3"></A>
<DFN>arity</DFN> that determines the number of
children a node representing the symbol will have.
Each node of the tree represents a symbol of the ranked alphabet, and the
number of children of a node is the arity of the symbol it represents.
Any such tree is legal; there is no constraint on the symbols represented
by the children of a node, only on their number.
<P>
The ranked alphabet is extracted from the specification supplied by the
user (see <A HREF="tp_2.html#SEC5">The Tree Patterns</A>).
The translator verifies that the arity of each symbol is consistent over
the specification.
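<P>
The consistency check can be pictured as a table lookup.
The following C fragment is only a sketch of the idea, not the translator's
actual implementation; the names <CODE>record_occurrence</CODE>,
<CODE>table</CODE> and <CODE>MAX_SYMBOLS</CODE> are hypothetical.
It records the arity seen at the first occurrence of a symbol and rejects any
later occurrence with a different arity, or with an arity greater than 2:

```c
#include <assert.h>
#include <string.h>

#define MAX_SYMBOLS 64            /* hypothetical table capacity */

struct Entry { const char *name; int arity; };

static struct Entry table[MAX_SYMBOLS];
static int entries = 0;

/* Record one occurrence of a symbol with the given number of children.
 * Returns 1 if the occurrence is consistent with earlier ones,
 * 0 if the arity conflicts or lies outside the range 0..2. */
int record_occurrence(const char *name, int arity)
{
    int i;

    if (arity < 0 || arity > 2)
        return 0;
    for (i = 0; i < entries; i++)
        if (strcmp(table[i].name, name) == 0)
            return table[i].arity == arity;

    /* First occurrence: remember its arity. */
    assert(entries < MAX_SYMBOLS);
    table[entries].name = name;
    table[entries].arity = arity;
    entries++;
    return 1;
}
```

<P>
With this sketch, recording <CODE>Plus</CODE> twice with two children
succeeds, while a later occurrence of <CODE>Plus</CODE> with one child is
rejected.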
<P>
Each symbol of the ranked alphabet denotes a particular kind of entity.
For example, here is a set of symbols forming a ranked alphabet that could
be the basis of a tree describing simple
<A NAME="IDX4"></A>
arithmetic expressions:
<P>
<PRE>
IntegerVal FloatingVal IntegerVar FloatingVar
Negative
Plus Minus Star Slash
</PRE>
<P>
The symbols in the first row have arity 0, and are therefore represented by
<A NAME="IDX5"></A>
leaves of the tree.
<CODE>Negative</CODE> has arity 1, and the symbols in the third row all have arity
2.
Each symbol has the obvious meaning when describing an expression:
<P>
<DL COMPACT>
<DT><SAMP>`3.1415'</SAMP>
<DD><CODE>FloatingVal</CODE>
<DT><SAMP>`-3'</SAMP>
<DD><CODE>Negative(IntegerVal)</CODE>
<DT><SAMP>`k-3'</SAMP>
<DD><CODE>Minus(IntegerVar,IntegerVal)</CODE>
<DT><SAMP>`(a*7)/(j+2)'</SAMP>
<DD><CODE>Slash(Star(FloatingVar,IntegerVal),Plus(IntegerVar,IntegerVal))</CODE>
</DL>
The notation here is the normal algebraic one:
A term is either a symbol of arity 0, or it is a symbol of arity <VAR>k</VAR>
followed by a parenthesized list of <VAR>k</VAR> terms.
Each term corresponds to a node of the tree.
<P>
A tree describing the expression in the first line has one node, representing
the symbol <CODE>FloatingVal</CODE>.
Because <CODE>FloatingVal</CODE> has arity 0, that node has no children.
(The value <SAMP>`3.1415'</SAMP> would appear as an attribute of the node;
see <A HREF="tp_1.html#SEC3">Decorating Nodes</A>.)
<P>
A tree describing the expression in the last line has seven nodes.
Four are leaves because the symbols they represent have arity 0;
each of the remaining three has two children because the symbol it
represents has arity 2.
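<P>
The node counts above can be checked mechanically.
The following C fragment is a minimal sketch that models terms over the
example ranked alphabet; the type <CODE>Term</CODE> and the functions
<CODE>leaf</CODE>, <CODE>unary</CODE>, <CODE>binary</CODE>,
<CODE>count_nodes</CODE>, <CODE>count_leaves</CODE> and
<CODE>example_term</CODE> are hypothetical names chosen for illustration and
are not part of the generated tree parser:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical symbol codes for the example ranked alphabet. */
enum Symbol {
    IntegerVal, FloatingVal, IntegerVar, FloatingVar,  /* arity 0 */
    Negative,                                          /* arity 1 */
    Plus, Minus, Star, Slash                           /* arity 2 */
};

/* Arity of each symbol, indexed by symbol code. */
static const int arity_of[] = { 0, 0, 0, 0, 1, 2, 2, 2, 2 };

/* One node per term; unused child slots are NULL. */
typedef struct Term {
    enum Symbol symbol;
    struct Term *child[2];
} Term;

static Term *make(enum Symbol s, int arity, Term *c0, Term *c1)
{
    Term *t;

    assert(arity_of[s] == arity);   /* child count is fixed by the symbol */
    t = malloc(sizeof *t);
    assert(t != NULL);
    t->symbol = s;
    t->child[0] = c0;
    t->child[1] = c1;
    return t;
}

Term *leaf(enum Symbol s)                    { return make(s, 0, NULL, NULL); }
Term *unary(enum Symbol s, Term *c)          { return make(s, 1, c, NULL); }
Term *binary(enum Symbol s, Term *l, Term *r) { return make(s, 2, l, r); }

/* Total number of nodes in the tree rooted at t. */
int count_nodes(const Term *t)
{
    int i, n = 1;

    for (i = 0; i < arity_of[t->symbol]; i++)
        n += count_nodes(t->child[i]);
    return n;
}

/* Number of nodes whose symbol has arity 0. */
int count_leaves(const Term *t)
{
    int i, n = 0;

    if (arity_of[t->symbol] == 0)
        return 1;
    for (i = 0; i < arity_of[t->symbol]; i++)
        n += count_leaves(t->child[i]);
    return n;
}

/* Slash(Star(FloatingVar,IntegerVal),Plus(IntegerVar,IntegerVal)) */
Term *example_term(void)
{
    return binary(Slash,
                  binary(Star, leaf(FloatingVar), leaf(IntegerVal)),
                  binary(Plus, leaf(IntegerVar), leaf(IntegerVal)));
}
```

<P>
Applied to the result of <CODE>example_term()</CODE>,
<CODE>count_nodes</CODE> yields 7 and <CODE>count_leaves</CODE> yields 4,
matching the description of the tree for <SAMP>`(a*7)/(j+2)'</SAMP>.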
<P>
A tree is not acceptable to the tree parser described in this document
if any node has more than two children.
Thus no symbol of the ranked alphabet may have arity greater than 2.
That is not a significant restriction, since any tree can be represented as
a binary tree.
<P>
Suppose that we want to use trees to describe the following C expressions:
<A NAME="IDX6"></A>
<A NAME="IDX7"></A>
<P>
<PRE>
<SAMP>`i>j ? i-j : j-i'</SAMP>
<SAMP>`(i=1, j=3, k=5, l+3) + 7'</SAMP>
</PRE>
<P>
Although <CODE>?:</CODE> is usually thought of as a ternary operator, its
semantics provide a natural decomposition into a condition and two
alternatives:
<P>
<PRE>
Conditional(
Greater(IntegerVar,IntegerVar),
Alternatives(Minus(IntegerVar,IntegerVar),Minus(IntegerVar,IntegerVar)))
</PRE>
<P>
The comma expression might have any number of components, but they can
simply be accumulated from left to right:
<P>
<PRE>
Plus(
Comma(
Comma(
Comma(Assign(IntegerVar,IntegerVal),Assign(IntegerVar,IntegerVal)),
Assign(IntegerVar,IntegerVal)),
Plus(IntegerVar,IntegerVal)),
IntegerVal)
</PRE>
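<P>
Accumulating components from left to right amounts to a left fold over the
list of components.
The following C fragment sketches that fold under stated assumptions: the
type <CODE>Node</CODE>, the symbol codes <CODE>Item</CODE> and
<CODE>Comma</CODE>, and the functions <CODE>node</CODE>,
<CODE>fold_left</CODE> and <CODE>left_depth</CODE> are hypothetical and
stand in for whatever node representation is actually used:

```c
#include <assert.h>
#include <stdlib.h>

enum { Item, Comma };               /* hypothetical symbol codes */

typedef struct Node {
    int symbol;
    struct Node *left, *right;      /* both NULL for a leaf */
} Node;

Node *node(int symbol, Node *left, Node *right)
{
    Node *n = malloc(sizeof *n);

    assert(n != NULL);
    n->symbol = symbol;
    n->left = left;
    n->right = right;
    return n;
}

/* Combine items[0..count-1] into a left-nested chain of binary nodes:
 * symbol(symbol(symbol(i0, i1), i2), i3) ... */
Node *fold_left(int symbol, Node **items, int count)
{
    Node *acc;
    int i;

    assert(count >= 1);
    acc = items[0];
    for (i = 1; i < count; i++)
        acc = node(symbol, acc, items[i]);
    return acc;
}

/* Number of consecutive nodes labeled `symbol` found by starting at n
 * and following left children. */
int left_depth(const Node *n, int symbol)
{
    int d = 0;

    while (n != NULL && n->symbol == symbol) {
        d++;
        n = n->left;
    }
    return d;
}
```

<P>
Folding the four components of the comma expression above produces three
nested <CODE>Comma</CODE> nodes, with the last component as the right child
of the outermost one.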
<P>
<H2><A NAME="SEC3" HREF="tp_toc.html#SEC3">Decorating Nodes</A></H2>
<P>
In addition to its arity, each symbol in the ranked alphabet may be
associated with a fixed number of
<A NAME="IDX8"></A>
attributes.
Each attribute has a specific type.
The attributes decorate the nodes of the tree,
but they do not contribute any structural information.
<P>
In the examples of the previous section, the symbols of arity 0
did not provide all of the necessary information about the leaves.
Each symbol of arity 0 specified <EM>what</EM> the leaf was,
but not <EM>which</EM> value of that kind it represented.
This is often the case with leaves, so a leaf usually has an associated
attribute.
Interior nodes, on the other hand, seldom need attributes.
<P>
Each attribute must be given a value of the proper type
when the node corresponding to the symbol is created.
This value will not affect the tree parse in any way, but will be passed
unchanged to the function implementing the action associated with the rule
used in the derivation of the node.
Thus attributes are a mechanism for passing information through the tree
parse.
<P>
<A NAME="IDX9"></A>
<A NAME="IDX10"></A>
<A NAME="IDX11"></A>
<H2><A NAME="SEC4" HREF="tp_toc.html#SEC4">Node Construction Functions</A></H2>
<P>
Each node of the tree to be parsed is constructed by invoking a function
whose name and parameters depend on the number of children and
attributes of the node.
The name always begins with the characters <CODE>TP_</CODE>, followed by the
digit representing the number of children.
If there are attributes, the attribute types follow.
Each attribute type is preceded by an underscore.
<P>
The set of constructors is determined from the specification supplied by
the user (see <A HREF="tp_2.html#SEC5">The Tree Patterns</A>).
The translator verifies that each occurrence of a symbol is consistent
with respect to the number of children and types of attributes.
<P>
Consider the simple expression trees discussed above
(see <A HREF="tp_1.html#SEC2">Tree Structure</A>):
<P>
<PRE>
IntegerVal FloatingVal IntegerVar FloatingVar
Negative
Plus Minus Star Slash
</PRE>
<P>
Suppose that integer and floating-point values are represented by the integer
indexes of their denotations in the string table
(see <A HREF="lib_1.html#SEC6">Character String Storage of Library Reference</A>),
and variables are represented by definition table keys
(see <A HREF="deftbl_1.html#SEC1">The Definition Table Module of Property Definition Language</A>).
In that case each tree node representing either <CODE>IntegerVal</CODE> or
<CODE>FloatingVal</CODE> would be decorated with an <CODE>int</CODE>-valued attribute;
each tree node representing either <CODE>IntegerVar</CODE> or <CODE>FloatingVar</CODE>
would be decorated with a <CODE>DefTableKey</CODE>-valued attribute.
No other node would have attributes, and four tree construction functions
would be created by the translator:
<P>
<A NAME="IDX12"></A>
<U>:</U> TPNode <B>TP_0_int</B> <I>(int <VAR>symbol</VAR>, int <VAR>attr</VAR>)</I><P>
Return a <VAR>symbol</VAR> leaf decorated with <VAR>attr</VAR>,
of type <CODE>int</CODE>
<P>
<A NAME="IDX13"></A>
<U>:</U> TPNode <B>TP_0_DefTableKey</B> <I>(int <VAR>symbol</VAR>, DefTableKey <VAR>attr</VAR>)</I><P>
Return a <VAR>symbol</VAR> leaf decorated with <VAR>attr</VAR>,
of type <CODE>DefTableKey</CODE>
<P>
<A NAME="IDX14"></A>
<U>:</U> TPNode <B>TP_1</B> <I>(int <VAR>symbol</VAR>, TPNode <VAR>child</VAR>)</I><P>
Return an undecorated <VAR>symbol</VAR> node with one child
<P>
<A NAME="IDX15"></A>
<U>:</U> TPNode <B>TP_2</B> <I>(int <VAR>symbol</VAR>, TPNode <VAR>left</VAR>, TPNode <VAR>right</VAR>)</I><P>
Return an undecorated <VAR>symbol</VAR> node with two children
<P>
Here's how the tree describing the expression <SAMP>`-i+1'</SAMP> could be
constructed:
<P>
<PRE>
TP_2(
Plus,
TP_1(Negative, TP_0_DefTableKey(IntegerVar, keyOfi)),
TP_0_int(IntegerVal, indexOf1))
</PRE>
<P>
Here <CODE>keyOfi</CODE> is a variable holding the definition table key
associated with variable <CODE>i</CODE> and <CODE>indexOf1</CODE> is a variable holding
the string table index of the denotation for <CODE>1</CODE>.
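<P>
To experiment with this call outside of a generated processor, the
constructors can be mocked.
Everything in the following C fragment is a stand-in written for
illustration: the real <CODE>TPNode</CODE> type, the constructor bodies, the
symbol codes and <CODE>DefTableKey</CODE> come from the generated code and
the definition table module and may be defined quite differently.
The function <CODE>build_example</CODE> is a hypothetical wrapper around the
call shown above:

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-ins only: the real definitions are generated or come from
 * the definition table module. */
typedef const void *DefTableKey;
enum { IntegerVal, IntegerVar, Negative, Plus };

typedef struct TPNodeRec {
    int symbol;
    struct TPNodeRec *left, *right;            /* NULL when absent */
    union { int i; DefTableKey key; } attr;    /* at most one attribute here */
} *TPNode;

static TPNode new_node(int symbol, TPNode left, TPNode right)
{
    TPNode n = calloc(1, sizeof *n);

    assert(n != NULL);
    n->symbol = symbol;
    n->left = left;
    n->right = right;
    return n;
}

TPNode TP_0_int(int symbol, int attr)
{
    TPNode n = new_node(symbol, NULL, NULL);
    n->attr.i = attr;          /* stored unchanged */
    return n;
}

TPNode TP_0_DefTableKey(int symbol, DefTableKey attr)
{
    TPNode n = new_node(symbol, NULL, NULL);
    n->attr.key = attr;        /* stored unchanged */
    return n;
}

TPNode TP_1(int symbol, TPNode child)
{
    return new_node(symbol, child, NULL);
}

TPNode TP_2(int symbol, TPNode left, TPNode right)
{
    return new_node(symbol, left, right);
}

/* Build the tree for `-i+1' exactly as in the call shown above. */
TPNode build_example(DefTableKey keyOfi, int indexOf1)
{
    return TP_2(
        Plus,
        TP_1(Negative, TP_0_DefTableKey(IntegerVar, keyOfi)),
        TP_0_int(IntegerVal, indexOf1));
}
```

<P>
In this mock the attribute values are simply stored in the leaves, which
mirrors the statement that attributes do not affect the tree structure and
are passed along unchanged.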
<P>
All tree construction functions return values of type <CODE>TPNode</CODE>.
Attributes can be attached to nodes with children, although there are no
such nodes in the example above.
Here's the constructor invocation for a node with two children and
two integer attributes:
<P>
<PRE>
TP_2_int_int(Symbol, child1, child2, attr1, attr2);
</PRE>
<P>
<HR size=1 noshade width=600 align=left>
<P>
<IMG SRC="gifs/empty.gif" WIDTH=25 HEIGHT=25 ALT=""><A HREF="tp_2.html"><IMG SRC="gifs/next.gif" ALT="Next Chapter" BORDER="0"></A>
<IMG SRC="gifs/empty.gif" WIDTH=25 HEIGHT=25 ALT=""><A HREF="tp_toc.html"><IMG SRC="gifs/up.gif" ALT="Table of Contents" BORDER="0"></A>
<IMG SRC="gifs/empty.gif" WIDTH=25 HEIGHT=25 ALT="">
<HR size=1 noshade width=600 align=left>
</TD>
</TR>
</TABLE>
</BODY></HTML>