1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159
|
Syntax: bf(%prec) tt(token)
The construction bf(%prec) tt(token) may be used in non-empty production
rules to overrule the actual precedence of an operator in those
production rules. Well known is the construction
verb(
expression:
'-' expression %prec UMINUS
{
...
}
)
Here, the default precedence and precedence of the tt(`-') token as the
subtraction operator is overruled by the precedence and precedence of the
tt(UMINUS) token, which is frequently defined as:
verb(
%right UMINUS
)
E.g., a list of arithmetic operators could consists of:
verb(
%left '+' '-'
%left '*' '/' '%'
%right UMINUS
)
giving tt(* /) and tt(%) a higher precedence than tt(+) and tt(-),
ensuring at the same time that tt(UMINUS) is given both the highest precedence
and a right-associativity.
In the above production rule the operator order would result in the
construction
verb(
'-' expression
)
being evaluated from right to left, having a higher precedence than either
the multiplication or the addition operators.
The tt(%prec) directive associates priorities to (non-empty) rules. These
priorities are interpreted whenever there are (shift-reduce) conflicts. If
there are no conflicts, priorities are not required, and are consequently
ignored.
When the parser analyzes the above grammar, a conflict em(is)
encountered. Consider the following simple grammar, in which only the minus
(tt('-')) operator is used, albeit in beinary and unary form:
verb(
%token NR
%left '-'
%right UNARY
expr:
NR
|
expr '-' expr
|
'-' expr %prec UNARY
;
)
When analuzing this grammar, bic() defines em(states) (cf. chapter
ref(ALGORITHM)) defining what to do when encountering certain input
tokens. Each possibility is defined by an em(item), in which a dot (tt(.)) is
commonly used to show to which point a production rule has been
recognized. For the above grammar one such state looks like this:
verb(
0: expr -> '-' expr . { '-' <EOF> }
1: expr -> expr . '-' expr
)
The elements between parentheses define the em(look-ahead) tokens: the
token that may appear next for em(reducible) rules. Item 0 is such a reducible
rule, and it is used to reduce tt('-' expr) to an expression (tt(expr)).
The second item shows the production rule defining the binary minus
operator. Its left-hand side expression has been recognized, and the parser
expects to see a minus operator next.
The conflict is caused by the expected minus operator in item 1, and a
minus operator that may follow item 0. As there is only one look-ahead symbol
(since bic() can only handle LALR(1) grammars) the grammer contains a
shift-reduce conflict: shift, and continue with item 1; or reduce, and
continue with item 0. In this case, tt(%prec) solves the issue, by giving item
0 a higher precedence than item 1 (whose precedence is equal to the precedence
of its first terminal token, which is the binary minus operator's
precedence).
Although never encountered in real life, it's also possible to give the
unary minus operator a em(lower) priority than the binary minus operator. The
grammar, in this case, looks like this:
verb(
%token NR
%right UNARY
%left '-'
expr:
NR
|
expr '-' expr
|
'-' expr %prec UNARY
;
)
With this grammar we encounter a state with these two items:
verb(
0: expr -> '-' expr . { <EOF> }
1: expr -> expr . '-' expr
)
Here, the conflict no longer manifests itself, as the minus operator no
longer appears in item 0's look-ahead set. The resulting parser will, when
encountering a minus, shift the minus, and proceed according to item 1, and
when anything else is encountered reduce the tt('-' expr) production to
tt(expr). In real life this means that an expression like tt(-4 - 3)
evaluates to -1.
To illustrate a situation where tt(%prec) won't work consider this
grammar:
verb(
%token NR
%left '-'
%right POSTFIX
expr:
NR
|
expr '-' expr
|
expr '-' %prec POSTFIX
;
)
When this grammar is analyzed the following state is encountered:
verb(
0: expr -> expr '-' . expr
1: expr -> expr '-' . { '-' <EOF> }
2: expr -> . NR
3: expr -> . expr '-' expr
4: expr -> . expr '-'
)
To appreciate why tt(%prec) doesn't work here, consider the various
look-ahead tokens. For items 0, 3, and 4 the look-ahead token is the
non-terminal tt(expr); for item 2 the look-ahead token is the terminal tt(NR),
and for item 1, handling the postfix minus operator, it is a minus
character. Thus, there isn't any conflict between the shiftable items and the
reducible item 1, and consequently the tt(%prec) specification isn't
used. Any attempt to define a grammar handling a postfix minus operator will
fail. A common solution consists of defining a separate operator, explicitly
giving it its appropriate priority and associativity. E.g.,
verb(
%token NR
%left '-'
%right '_' // underscore
expr:
NR
|
expr '-' expr
|
expr '_' // underscore as postfix minus
;
)
|