File: recovery.yo

B() implements a simple error recovery mechanism. When the tt(lookup())
function cannot find an action for the current token in the current state it
throws an tt(UNEXPECTED_TOKEN_) exception. 

This exception is caught by the parsing function, which then calls the
tt(errorRecovery()) member function. By default, this member function
terminates the parsing process. The non-default recovery procedure becomes
available once an tt(error) token is used in a production rule. In that case
the recovery procedure is started whenever tt(UNEXPECTED_TOKEN_) is thrown
(i.e., whenever a syntactic error is encountered or tt(ERROR()) is called).
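
The control flow described above can be sketched in bf(C++) as follows. This
is a minimal illustration only: the names tt(SketchParser), tt(ErrorRecovery),
tt(hasErrorRule), tt(nErrors) and tt(log) are hypothetical and are not taken
from the code generated by b(); the `grammar' merely accepts the token
tt("NR").

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the value thrown on an unexpected token.
enum class ErrorRecovery { UNEXPECTED_TOKEN };

// Hypothetical, strongly simplified parser: only the token "NR" is valid.
struct SketchParser
{
    bool hasErrorRule = false;      // true once the grammar uses `error'
    int nErrors = 0;                // number of syntactic errors seen
    std::vector<std::string> log;   // what the parser did, in order

    // Stand-in for lookup(): throws if no action exists for the token.
    void lookup(std::string const &token)
    {
        if (token != "NR")
            throw ErrorRecovery::UNEXPECTED_TOKEN;
        log.push_back("shift " + token);
    }

    // Stand-in for errorRecovery(): returns false if parsing must stop.
    bool errorRecovery()
    {
        ++nErrors;
        if (!hasErrorRule)
        {
            log.push_back("abort");     // default: terminate parsing
            return false;
        }
        log.push_back("recover");       // non-default: continue parsing
        return true;
    }

    // Returns true if all tokens were processed.
    bool parse(std::vector<std::string> const &tokens)
    {
        for (auto const &token : tokens)
        {
            try
            {
                lookup(token);
            }
            catch (ErrorRecovery)
            {
                if (!errorRecovery())
                    return false;
            }
        }
        return true;
    }
};
```

Without an tt(error) rule the sketch stops at the first unexpected token; with
tt(hasErrorRule) set it records the error and continues with the next token.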

The recovery procedure consists of the following steps:
    itemization(
    it() first, the first state on the state stack having an error production
is looked up;
    it() next, all state transitions that are possible without retrieving a
terminal token are handled;
    it() finally, in the state requiring a terminal token, and starting with
the initial unexpected token, all subsequent terminal tokens are ignored until
a token is retrieved which is a continuation token in that state.
    )

If the error recovery procedure fails (i.e., if no acceptable token is
ever encountered) error recovery falls back to the default recovery
mode (i.e., the parsing process is terminated).
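
The recovery steps and the failure case can be sketched in bf(C++) as shown
below. This is an illustration under stated assumptions, not the code
generated by b(): the tt(Tables) struct, tt(exampleTables()) and tt(recover())
are hypothetical, tokens are single characters (tt('n') stands for tt(NR)),
and accepting on end-of-input is not modeled. The tables were written by hand
after the state transitions of the example grammar discussed in this section.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, hand-written tables; the tables generated by bisonc++ differ.
struct Tables
{
    std::map<int, int> errorShift;                      // state -> state on errTok_
    std::map<int, std::pair<int, char>> defaultReduce;  // state -> (rhs size, lhs)
    std::map<std::pair<int, char>, int> go;             // (state, nonterminal) -> state
    std::map<std::pair<int, char>, int> shift;          // (state, terminal) -> state
};

// Tables for the example grammar: 'n' is NR, 'S' is start, 'E' is expr.
Tables exampleTables()
{
    Tables t;
    t.errorShift = {{1, 3}, {5, 3}};
    t.defaultReduce = {{0, {0, 'S'}}, {2, {2, 'S'}}, {3, {1, 'E'}},
                       {4, {1, 'E'}}, {6, {3, 'E'}}};
    t.go = {{{0, 'S'}, 1}, {{1, 'E'}, 2}, {{5, 'E'}, 6}};
    t.shift = {{{1, 'n'}, 4}, {{2, '+'}, 5}, {{5, 'n'}, 4}};
    return t;
}

// Returns the index of the token at which parsing can resume, or -1 if
// recovery fails. `pos' is the index of the unexpected token.
int recover(std::vector<int> &stack, std::string const &input,
            std::size_t pos, Tables const &t)
{
    // Step 1: find the first state (from the top) having an error transition.
    while (!stack.empty() && t.errorShift.count(stack.back()) == 0)
        stack.pop_back();
    if (stack.empty())
        return -1;
    stack.push_back(t.errorShift.at(stack.back()));

    // Step 2: perform the reductions that need no new terminal token.
    while (pos < input.size())
    {
        int state = stack.back();
        if (t.shift.count({state, input[pos]}) != 0)
            break;                          // the current token can be shifted
        auto reduction = t.defaultReduce.find(state);
        if (reduction == t.defaultReduce.end())
            break;                          // this state requires a token
        stack.resize(stack.size() - reduction->second.first);
        assert(!stack.empty());
        stack.push_back(t.go.at({stack.back(), reduction->second.second}));
    }

    // Step 3: skip tokens until one is a continuation token in this state.
    for (; pos < input.size(); ++pos)
        if (t.shift.count({stack.back(), input[pos]}) != 0)
            return static_cast<int>(pos);
    return -1;
}
```

Called with the state stack tt({0, 1}) and the input tt("an"), the sketch
shifts the error token in state 1, reduces back to state 1, skips tt('a') and
resumes at tt('n'). With the input tt("a+") no continuation token is found and
-1 is returned, corresponding to the fallback to the default recovery mode.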

Syntactic errors are not always reported: the option
link(--required-tokens)(REQUIRED) can be used to specify the minimum number of
tokens that must have been successfully processed before another syntactic
error is reported (and counted).
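
The effect of such a threshold can be sketched in bf(C++) as follows. The
struct tt(ErrorCounter) and its members are hypothetical and do not mirror the
members of the generated parser. Two assumptions are made that the text above
does not state: the very first error is always reported, and the count of
processed tokens restarts at every error, whether or not it was reported.

```cpp
#include <cstddef>

// Hypothetical bookkeeping for a `required tokens' threshold.
struct ErrorCounter
{
    std::size_t requiredTokens;         // value given to --required-tokens
    std::size_t acceptedSinceError;     // tokens processed since the last error
    std::size_t nErrors = 0;            // number of reported errors

    // Assumption: start at the threshold so the first error is reported.
    explicit ErrorCounter(std::size_t required)
    :
        requiredTokens(required),
        acceptedSinceError(required)
    {}

    // Called for every successfully processed token.
    void tokenAccepted()
    {
        ++acceptedSinceError;
    }

    // Called at a syntactic error; returns true if the error is reported.
    bool syntaxError()
    {
        bool report = acceptedSinceError >= requiredTokens;
        acceptedSinceError = 0;         // assumption: restart at every error
        if (report)
            ++nErrors;
        return report;
    }
};
```

With a threshold of 2, an error following only one processed token after the
previous error is neither reported nor counted in this sketch.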

The option link(--error-verbose)(ERRORVERBOSE) may be specified to obtain the
contents of the state stack when a syntactic error is reported.

The example grammar may be provided with an tt(error) production rule:
        verb(
    %token NR
    
    %left '+'
    
    %%
    
    start:
        start expr
    |
        // empty
    ;
    
    expr:
        error
    |
        NR
    |
        expr '+' expr
    ;
        )
    The resulting grammar has one additional state (handling the error
production) and one state in which the tt(ERR_ITEM) flag has been set. When
an error is encountered, this state obtains tokens until a token having a
valid continuation is received, after which normal processing continues.

In the parser's verbose output (using the tt(-V) option) the various grammar
rules and state transition tables are shown. The debug output below refers to
this information.

The rules are:
    verb(
    1: start ->  start expr
    2: start ->  <empty>
    3: expr (errTok_) ->  errTok_
    4: expr (NR) ->  NR
    5: expr ('+') ->  expr '+' expr
    6: start_$ ->  start
    )

The state-transitions are:
    verb(
State 0:
6: start_$ ->  . start 
  0:   On start to state 1
  Reduce by 2: start ->  . 


State 1:
6: start_$ -> start  . 
1: start -> start  . expr 
  0:   On expr to state 2
  1:   On errTok_ to state 3
  2:   On NR to state 4


State 2:
1: start -> start expr  . 
5: expr -> expr  . '+' expr 
  0:   On '+' to state 5
  Reduce by 1: start -> start expr  . 


State 3:
3: expr -> errTok_  . 
  Reduce by 3: expr -> errTok_  . 


State 4:
4: expr -> NR  . 
  Reduce by 4: expr -> NR  . 


State 5:
5: expr -> expr '+'  . expr 
  0:   On expr to state 6
  1:   On errTok_ to state 3
  2:   On NR to state 4


State 6:
5: expr -> expr '+' expr  . 
5: expr -> expr  . '+' expr 
  0 (removed by precedence):   On '+' to state 5
  Reduce by 5: expr -> expr '+' expr  . 
    )

    
The following output from the tt(parse()) function, generated by b() using the
tt(--debug) option, illustrates error recovery for the above grammar. Entering
the input
    verb(
    a
    3+a
    )
    results in:
        verb(
    parse(): Parsing starts
    
PUSH 0 (initializing the state stack)
    

LOOKUP: [0, 'Reserved_::UNDETERMINED_'] -> default reduce using rule 2
    
REDUCE: rule 2
    execute action 2 ...
    ... completed
    rule 2: pop 0 elements. Next will be: [0, 'start']
    pop 0 elements from the stack having size 1
    next: [0, 'start']
    

PUSH:   [0, 'start'] -> 1
    
scanner token `a' (97)
    
ERROR:  [1, `a' (97)] -> ??. Errors: 1
Syntax error
    Reached ERROR state 1
    
PUSH:   [1, 'errTok_'] -> 3
    

LOOKUP: [3, `a' (97)] -> default reduce using rule 3
    
REDUCE: rule 3
    rule 3: pop 1 elements. Next will be: [1, 'expr']
    pop 1 elements from the stack having size 3
    next: [1, 'expr']
    
available token 'expr'
    
PUSH:   [1, 'expr'] -> 2
    
available token `a' (97)
    
LOOKUP: [2, `a' (97)] -> default reduce using rule 1
    
REDUCE: rule 1
    rule 1: pop 2 elements. Next will be: [0, 'start']
    pop 2 elements from the stack having size 3
    next: [0, 'start']
    

PUSH:   [0, 'start'] -> 1
    
available token `a' (97)
    
scanner token 'NR'
    
PUSH:   [1, 'NR'] -> 4
    ERROR RECOVERED: next state 4
    

LOOKUP: [4, 'Reserved_::UNDETERMINED_'] -> default reduce using rule 4
    
REDUCE: rule 4
    execute action 4 ...
    ... completed
    rule 4: pop 1 elements. Next will be: [1, 'expr']
    pop 1 elements from the stack having size 3
    next: [1, 'expr']
    
available token 'expr'
    
PUSH:   [1, 'expr'] -> 2
    
scanner token 'Reserved_::EOF_'
    
LOOKUP: [2, 'Reserved_::EOF_'] -> default reduce using rule 1
    
REDUCE: rule 1
    execute action 1 ...
    ... completed
    rule 1: pop 2 elements. Next will be: [0, 'start']
    pop 2 elements from the stack having size 3
    next: [0, 'start']
    

PUSH:   [0, 'start'] -> 1
    
available token 'Reserved_::EOF_'
    ACCEPT(): Parsing successful
    parse(): returns 0 or 1
    )

    The final debug message should be interpreted as a bf(C++) expression: the
0 indicates that tt(Parser::ACCEPT) rather than tt(Parser::ABORT) was called,
and the 1 shows the value of tt(d_nErrors_). Consequently, tt(parse()) returns
1 (i.e., tt(`0 or 1')).
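
Read as a bf(C++) expression, the final debug message can be sketched as
follows. The function tt(parseReturnValue()) and its parameter names are
hypothetical; tt(abortCalled) stands for the first operand (0 after ACCEPT, 1
after ABORT) and tt(nErrors) stands in for tt(d_nErrors_).

```cpp
#include <cstddef>

// Hypothetical helper mirroring the debug message `returns 0 or 1'.
int parseReturnValue(bool abortCalled, std::size_t nErrors)
{
    // The logical `or' yields a bool, which converts to 0 or 1.
    return abortCalled or nErrors != 0;
}
```

For the run shown above, tt(parseReturnValue(false, 1)) evaluates tt(0 or 1)
and therefore yields 1.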