[section {PEG serialization format}]

Here we specify the format used by the Parser Tools to serialize
Parsing Expression Grammars as immutable values for transport,
comparison, etc.

[para]

We distinguish between [term regular] and [term canonical]
serializations.

While a PEG may have more than one regular serialization, exactly one
of them is [term canonical].


[list_begin definitions][comment {-- serializations --}]
[def {regular serialization}]

[list_begin enumerated][comment {-- regular points --}]
[enum]
The serialization of any PEG is a nested Tcl dictionary.

[enum]
This dictionary holds a single key, [const pt::grammar::peg], and its
value. This value holds the contents of the grammar.

[enum]
The contents of the grammar are a Tcl dictionary holding the set of
nonterminal symbols and the starting expression. The relevant keys and
their values are

[list_begin definitions][comment {-- grammar keywords --}]
[def [const rules]]

The value is a Tcl dictionary whose keys are the names of the
nonterminal symbols known to the grammar.

[list_begin enumerated][comment {-- nonterminals --}]
[enum]
Each nonterminal symbol may occur only once.

[enum]
The empty string is not a legal nonterminal symbol.

[enum]
The value for each symbol is a Tcl dictionary itself. The relevant
keys and their values in this dictionary are

[list_begin definitions][comment {-- nonterminal keywords --}]
[def [const is]]

The value is the serialization of the parsing expression describing
the symbol's sentential structure, as specified in the section
[sectref {PE serialization format}].

[def [const mode]]

The value can be one of three values specifying how a parser should
handle the semantic value produced by the symbol.

[include ../modes.inc]
[list_end][comment {-- nonterminal keywords --}]
[list_end][comment {-- nonterminals --}]

[def [const start]]

The value is the serialization of the start parsing expression of the
grammar, as specified in the section [sectref {PE serialization format}].

[list_end][comment {-- grammar keywords --}]

[enum]
The terminal symbols of the grammar are specified implicitly as the
set of all terminal symbols used in the start expression and on the
RHS of the grammar rules.


[list_end][comment {-- regular points --}]
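
[para]
As a minimal sketch of the points above, the following Tcl fragment
builds a regular serialization by hand and reads its parts back with
the builtin [cmd dict] command. The grammar and the variable names are
made up for illustration. The parsing expressions are assumed to
follow the section [sectref {PE serialization format}], and the modes
shown are assumed to be among those the format allows.

[para]
[example {
# Hypothetical grammar with two nonterminal symbols.
# Each rule value is a dictionary with the keys 'is' and 'mode'.
set rules [dict create]
dict set rules Digit  {is {/ {t 0} {t 1}} mode leaf}
dict set rules Number {is {+ {n Digit}}   mode value}

# Contents of the grammar: the rules plus the start expression.
set grammar [dict create rules $rules start {n Number}]

# The serialization is a dictionary with the single key pt::grammar::peg.
set serial [dict create pt::grammar::peg $grammar]

# Reading the parts back.
set g [dict get $serial pt::grammar::peg]
puts [dict keys [dict get $g rules]]
puts [dict get $g rules Number mode]
puts [dict get $g start]
}]

[para]
The terminal symbols 0 and 1 of this grammar are not listed anywhere.
They are known only through their use in the rule for Digit.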

[def {canonical serialization}]

The canonical serialization of a grammar has the format as specified
in the previous item, and then additionally satisfies the constraints
below, which make it unique among all the possible serializations of
this grammar.

[list_begin enumerated][comment {-- canonical points --}]
[enum]

The keys found in all the nested Tcl dictionaries are sorted in
ascending dictionary order, as generated by Tcl's builtin command
[cmd {lsort -increasing -dict}].

[enum]

The string representation of the value is the canonical representation
of a Tcl dictionary. I.e. it does not contain superfluous whitespace.

[list_end][comment {-- canonical points --}]
[list_end][comment {-- serializations --}]
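
[para]
The two canonical constraints can be illustrated with a small,
hand-written sketch. The procedures [cmd sortdict] and
[cmd canonical_sketch] below are hypothetical helpers, not part of the
Parser Tools. They sort the keys of the nested dictionaries and
rebuild the value from lists, so that its string representation
carries no superfluous whitespace at the dictionary levels. The sketch
assumes that the input is a valid regular serialization, and that the
parsing expressions inside it are already in their canonical form,
because it does not rewrite them.

[para]
[example {
# Hypothetical helper: rebuild a dictionary with its keys sorted
# in ascending dictionary order.
proc sortdict {d} {
    set out {}
    foreach k [lsort -increasing -dict [dict keys $d]] {
        lappend out $k [dict get $d $k]
    }
    return $out
}

# Hypothetical helper: sort the keys at every dictionary level of
# a regular serialization. Parsing expressions are copied as they are.
proc canonical_sketch {serial} {
    set g [dict get $serial pt::grammar::peg]
    set rules {}
    foreach sym [lsort -increasing -dict [dict keys [dict get $g rules]]] {
        lappend rules $sym [sortdict [dict get $g rules $sym]]
    }
    set grammar [list rules $rules start [dict get $g start]]
    return [list pt::grammar::peg $grammar]
}

# A regular serialization with unsorted keys and extra whitespace.
set regular {pt::grammar::peg {
    start {n Number}
    rules {
        Number {mode value is {+ {n Digit}}}
        Digit  {mode leaf  is {/ {t 0} {t 1}}}
    }
}}
puts [canonical_sketch $regular]
}]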

[subsection Example]

Assuming the following PEG for simple mathematical expressions

[para]
[include ../example/expr_peg.inc]
[para]

then its canonical serialization (except for whitespace) is

[para]
[include ../example/expr_serial.inc]
[para]