File: prec.yo

package info (click to toggle)
bisonc%2B%2B 6.09.02-1
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid
  • size: 5,984 kB
  • sloc: cpp: 9,375; ansic: 1,505; fortran: 1,134; makefile: 1,062; sh: 526; yacc: 84; lex: 60
file content (159 lines) | stat: -rw-r--r-- 5,437 bytes parent folder | download | duplicates (2)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
Syntax: bf(%prec) tt(token) 

    The construction bf(%prec) tt(token) may be used in non-empty production
rules to overrule the actual precedence of an operator in those
production rules. Well known is the construction
                verb(
    expression:
        '-' expression %prec UMINUS
        {
            ...
        }
                ) 
    Here, the default precedence and precedence of the tt(`-') token as the
subtraction operator is overruled by the precedence and precedence of the
tt(UMINUS) token, which is frequently defined as:
                verb(
    %right UMINUS
                )
    E.g., a list of arithmetic operators could consists of:
        verb(
    %left '+' '-'
    %left '*' '/' '%'
    %right UMINUS
        )
    giving tt(* /) and tt(%) a higher precedence than tt(+) and tt(-),
ensuring at the same time that tt(UMINUS) is given both the highest precedence
and a right-associativity.

In the above production rule the operator order would result in the
construction 
        verb(
    '-' expression
        )
    being evaluated from right to left, having a higher precedence than either
the multiplication or the addition operators.

The tt(%prec) directive associates priorities to (non-empty) rules. These
priorities are interpreted whenever there are (shift-reduce) conflicts. If
there are no conflicts, priorities are not required, and are consequently
ignored.

When the parser analyzes the above grammar, a conflict em(is)
encountered. Consider the following simple grammar, in which only the minus
(tt('-')) operator is used, albeit in beinary and unary form:

        verb(
    %token NR
    %left '-'
    %right UNARY

    expr:
        NR
    |
        expr '-' expr
    |
        '-' expr %prec UNARY
    ;
        )
    When analuzing this grammar, bic() defines em(states) (cf. chapter
ref(ALGORITHM)) defining what to do when encountering certain input
tokens. Each possibility is defined by an em(item), in which a dot (tt(.)) is
commonly used to show to which point a production rule has been
recognized. For the above grammar one such state looks like this:
        verb(
    0: expr -> '-' expr  .        { '-' <EOF> }
    1: expr -> expr  . '-' expr
        )
    The elements between parentheses define the em(look-ahead) tokens: the
token that may appear next for em(reducible) rules. Item 0 is such a reducible
rule, and it is used to reduce tt('-' expr) to an expression (tt(expr)).

    The second item shows the production rule defining the binary minus
operator. Its left-hand side expression has been recognized, and the parser
expects to see a minus operator next. 

    The conflict is caused by the expected minus operator in item 1, and a
minus operator that may follow item 0. As there is only one look-ahead symbol
(since bic() can only handle LALR(1) grammars) the grammer contains a
shift-reduce conflict: shift, and continue with item 1; or reduce, and
continue with item 0. In this case, tt(%prec) solves the issue, by giving item
0 a higher precedence than item 1 (whose precedence is equal to the precedence
of its first terminal token, which is the binary minus operator's
precedence). 

    Although never encountered in real life, it's also possible to give the
unary minus operator a em(lower) priority than the binary minus operator. The
grammar, in this case, looks like this:
        verb(
    %token NR
    %right UNARY
    %left '-'

    expr:
        NR
    |
        expr '-' expr
    |
        '-' expr %prec UNARY
    ;
        )

    With this grammar we encounter a state with these two items:
        verb(
    0:  expr -> '-' expr  .   { <EOF> } 
    1:  expr -> expr  . '-' expr
        )
    Here, the conflict no longer manifests itself, as the minus operator no
longer appears in item 0's look-ahead set. The resulting parser will, when
encountering a minus, shift the minus, and proceed according to item 1, and
when anything else is encountered reduce the tt('-' expr) production to
tt(expr). In real life this means that an expression like tt(-4 - 3)
evaluates to -1. 

    To illustrate a situation where tt(%prec) won't work consider this
grammar: 
        verb(
    %token NR
    %left '-'
    %right POSTFIX

    expr:
        NR
    |
        expr '-' expr
    |
        expr '-' %prec POSTFIX
    ;
        )
    When this grammar is analyzed the following state is encountered:
        verb(
    0:  expr -> expr '-'  . expr   
    1:  expr -> expr '-'  .         { '-' <EOF> }
    2:  expr ->  . NR   
    3:  expr ->  . expr '-' expr   
    4:  expr ->  . expr '-'   
      )
    To appreciate why tt(%prec) doesn't work here, consider the various
look-ahead tokens. For items 0, 3, and 4 the look-ahead token is the
non-terminal tt(expr); for item 2 the look-ahead token is the terminal tt(NR),
and for item 1, handling the postfix minus operator, it is a minus
character. Thus, there  isn't any conflict between the shiftable items and the
reducible item 1, and consequently the tt(%prec) specification isn't
used. Any attempt to define a grammar handling a postfix minus operator will
fail. A common solution consists of defining a separate operator, explicitly
giving it its appropriate priority and associativity. E.g.,
        verb(
    %token NR
    %left '-'
    %right '_'  // underscore

    expr:
        NR
    |
        expr '-' expr
    |
        expr '_'    // underscore as postfix minus
    ;
        )