File: me_ast.man

package info (click to toggle)
tcllib 2.0%2Bdfsg-4
  • links: PTS
  • area: main
  • in suites: sid, trixie
  • size: 83,572 kB
  • sloc: tcl: 306,798; ansic: 14,272; sh: 3,035; xml: 1,766; yacc: 1,157; pascal: 881; makefile: 124; perl: 84; f90: 84; python: 33; ruby: 13; php: 11
file content (134 lines) | stat: -rw-r--r-- 4,030 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
[comment {-*- tcl -*- doctools manpage}]
[manpage_begin grammar::me_ast n 0.2]
[keywords {abstract syntax tree}]
[keywords AST]
[copyright {2005 Andreas Kupries <andreas_kupries@users.sourceforge.net>}]
[moddesc   {Grammar operations and usage}]
[titledesc {Various representations of ASTs}]
[category  {Grammars and finite automata}]
[description]

This document specifies various representations for the

[term {abstract syntax tree}]s (short [term AST]) generated by
instances of ME virtual machines, independent of variant.

Please go and read the document [syscmd grammar::me_intro] first if
you do not know what a ME virtual machine is.

[para]

ASTs and all the representations we specify distinguish between two
types of nodes, namely:

[para]
[list_begin definitions]

[def Terminal]

Terminal nodes refer to the terminal symbols found in the token
stream. They are always leaf nodes. I.e. terminal nodes never have
children.

[def Nonterminal]

Nonterminal nodes represent a nonterminal symbol of the grammar used
during parsing. They can occur as leaf and inner nodes of the
tree.

[list_end]
[para]

Both types of nodes carry basic range information telling a user which
parts of the input are covered by the node by providing the location
of the first and last tokens found within the range. Locations are
provided as non-negative integer offsets from the beginning of the
token stream, with the first token found in the stream located at
offset 0 (zero).

[para]

The root of an AS tree can be either a terminal or nonterminal node.

[section {AST VALUES}]

This representation of ASTs is a Tcl list. The main list represents
the root node of the tree, with the representations of the children
nested within.

[para]

Each node is represented by a single Tcl list containing three or more
elements. The first element is either the empty string or the name of
a nonterminal symbol (which is never the empty string). The second and
third elements are then the locations of the first and last tokens.

Any additional elements after the third are then the representations
of the children, with the leftmost child first, i.e. as the fourth
element of the list representing the node.

[section {AST OBJECTS}]

In this representation an AST is represented by a Tcl object command
whose API is compatible to the tree objects provided by the package
[package struct::tree]. I.e it has to support at least all of the
methods described by that package, and may support more.

[para]

Because of this the remainder of the specifications is written using
the terms of [package struct::tree].

[para]

Each node of the AST directly maps to a node in the tree object. All
data beyond the child nodes, i.e. node type and input locations, are
stored in attributes of the node in the tree object. They are:

[list_begin definitions]
[def type]

The type of the AST node. The recognized values are [const terminal]
and [const nonterminal].

[def range]

The locations of the first and last token of the terminal data in the
input covered by the node. This is a list containing two locations.

[def detail]

This attribute is present only for nonterminal nodes. It contains the
name of the nonterminal symbol stored in the node.

[list_end]

[section {EXTENDED AST OBJECTS}]

Extended AST objects are like AST objects, with additional
information.

[list_begin definitions]

[def detail]

This attribute is now present at all nodes. Its contents are unchanged
for nonterminal nodes. For terminal nodes it contains a list
describing all tokens from the input which are covered by the node.

[para]

Each element of the list contains the token name, the associated
lexeme attribute, line number, and column index, in this order.

[def range_lc]

This new attribute is defined for all nodes, and contains the
locations from attribute [term range] translated into line number and
column index. Lines are counted from 1, columns are counted from 0.

[list_end]

[vset CATEGORY grammar_me]
[include ../common-text/feedback.inc]
[manpage_end]