File: conflicts.yo

package info (click to toggle)
bisonc%2B%2B 6.09.02-1
links: PTS, VCS
area: main
in suites: forky, sid
size: 5,984 kB
sloc: cpp: 9,375; ansic: 1,505; fortran: 1,134; makefile: 1,062; sh: 526; yacc: 84; lex: 60
file content (91 lines) | stat: -rw-r--r-- 4,099 bytes
parent folder | download | duplicates (5)
Suppose we are parsing a language which has tt(if) and tt(if-else)
statements, with a pair of rules like this:
        verb(
    if_stmt:
        IF '(' expr ')' stmt
    | 
        IF '(' expr ')' stmt ELSE stmt
    ;
        )
    Here we assume that tt(IF) and tt(ELSE) are terminal symbols for specific
keywords, and that tt(expr) and tt(stmnt) are defined nonterminals.

When the tt(ELSE) token is read and becomes the look-ahead token, the contents
of the stack (assuming the input is valid) are just right for em(reduction) by
the first rule. But it is also legitimate to em(shift) the tt(ELSE), because
that would lead to eventual reduction by the second rule.

This situation, where either a shift or a reduction would be valid, is called
a tt(shift/reduce) conflict. B() is designed to resolve these conflicts
by em(implementing) a shift, unless otherwise directed by operator precedence
declarations. To see the reason for this, let's contrast it with the other
alternative.

Since the parser prefers to shift the tt(ELSE), the result is to attach the
em(else-clause) to the innermost if-statement, making these two inputs
equivalent:
        verb(
    if (x) if (y) then win(); else lose();

    if (x) 
    {
        if (y) then win(); else lose(); 
    }
        )
    But if the parser would perform a em(reduction) whenever possible rather
than a em(shift), the result would be to attach the em(else-clause) to the
outermost if-statement, making these two inputs equivalent:
        verb(
    if (x) if (y) then win(); else lose();

    if (x) 
    {
        if (y) win(); 
    }
    else 
        lose();
        )
    The conflict exists because the grammar as written is em(ambiguous):
em(either) parsing of the simple nested if-statement is legitimate. The
established convention is that these ambiguities are resolved by attaching the
else-clause to the innermost if-statement; this is what b() accomplishes
by implementing a shift rather than a reduce. This
particular ambiguity was first encountered in the specifications of Algol 60
and is called the em(dangling else) ambiguity.

To avoid warnings from b() about predictable, legitimate shift/reduce
conflicts, use the tt(%expect n) directive. There will be no warning as long
as the number of shift/reduce conflicts is exactly tt(n). See section
ref(EXPECT).

The definition of tt(if_stmt) above is solely to blame for the conflict, but
the plain tt(stmnt) rule, consisting of two recursive alternatives will of
course never be able to match actual input, since there's no way for the
grammar to eventually derive a sentence this way. Adding one non-recursive
alternative is enough to convert the grammar into one that em(does) derive
sentences. Here is a complete b() input file that actually shows the
conflict:
    verbinclude(demos/dangling)

Looking again at the dangling else problem note that there are multiple ways
to handle tt(stmnt) productions. Depending on the particular input that is
provided it could 
either be reduced to a tt(stmt) or the parser could continue to consume input
by processing an tt(ELSE) token, eventually resulting in the recognition of
tt(IF '(' VAR ')' stmt ELSE stmt) as a tt(stmt). 

There is little we can do but resorting to tt(%expect) to handle the dangling
else problem. The default handling is what most people intuitively expect and
so in this case using tt(%expect 1) is an easy way to prevent b() from
reporting a shift/reduce conflict. But shift/reduce conflicts are most often
solved by specifying disambiguating rules specifying priorities or
associations, usually in the context of arithmetic expressions, as discussed
in the next sections.

However, shift-reduce conflicts can also be observed in grammars where a state
contains items that could be reduced to a certain nonterminal and items in
which a shift is possible in an item of a production rule of a completely
different nonterminal. Here is an example of such a grammar:
    verbinclude(demos/peculiar)
  Why these grammars show shift reduce conflicts and how these are solved is
discussed in the next section.