B() implements a simple error recovery mechanism. When the tt(lookup())
function cannot find an action for the current token in the current state it
throws an tt(UNEXPECTED_TOKEN__) exception.
This exception is caught by the parsing function, which then calls the
tt(errorRecovery()) member function. By default, this member function
terminates the parsing process. The non-default recovery procedure becomes
available once an tt(error) token is used in a production rule. In that case
the recovery procedure is started whenever the parsing process throws
tt(UNEXPECTED_TOKEN__) (i.e., whenever a syntactical error is encountered or
tt(ERROR()) is called).
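The control flow described above can be sketched in a few lines. This is a minimal illustration, not the code generated by the parser generator: the type tt(UnexpectedToken), the tt(Table) alias and the simplified tt(lookup()), tt(errorRecovery()) and tt(parse()) functions are all assumptions made for this example, and only the default behavior (terminate on the first error) is modeled.

```cpp
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for the exception thrown on a failed lookup.
struct UnexpectedToken {};

// Toy action table: (state, token) -> next state. Not the real parser tables.
using Table = std::map<std::pair<int, int>, int>;

// Returns the next state, or throws when no action exists for the token.
int lookup(Table const &table, int state, int token)
{
    auto it = table.find({state, token});
    if (it == table.end())
        throw UnexpectedToken{};
    return it->second;
}

// Default recovery: give up. Returning false signals termination.
bool errorRecovery()
{
    return false;
}

// Returns 0 if all tokens were accepted, 1 if parsing was terminated.
int parse(Table const &table, std::vector<int> const &tokens)
{
    int state = 0;
    for (int token : tokens)
    {
        try
        {
            state = lookup(table, state, token);
        }
        catch (UnexpectedToken const &)
        {
            if (!errorRecovery())   // default: terminate the parsing process
                return 1;
        }
    }
    return 0;
}
```

Calling tt(parse()) with a token sequence for which some lookup fails returns 1, mirroring the default termination of the parsing process.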
The recovery procedure consists of
itemization(
it() looking for the first state on the state-stack having an
error-production, followed by:
it() handling all state transitions that are possible without retrieving a
terminal token;
it() then, in the state requiring a terminal token, ignoring all subsequent
terminal tokens, starting with the initial unexpected token, until a token is
retrieved which is a continuation token in that state.
)
If the error recovery procedure fails (i.e., if no acceptable token is
ever encountered) error recovery falls back to the default recovery
mode (i.e., the parsing process is terminated).
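The three steps and the fallback described above can be sketched as follows. This is a simplified model under stated assumptions: the tt(Tables) struct, its members and the tt(recover()) function are hypothetical, and all token-free transitions of the second step are collapsed into a single table entry.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <vector>

// Toy description of a parser; all names are illustrative assumptions.
struct Tables
{
    // state having an error production -> state reached after the
    // token-free transitions (shifting the error token, default reductions)
    std::map<int, int> afterError;
    // state -> terminal tokens that are valid continuations in that state
    std::map<int, std::set<int>> continuations;
};

// Attempts recovery. `stack` is the state stack and `tokens[pos]` is the
// unexpected token. On success `pos` indexes the first token that can be
// used to continue and true is returned; false means recovery failed.
bool recover(Tables const &tables, std::vector<int> &stack,
             std::vector<int> const &tokens, std::size_t &pos)
{
    // Step 1: pop states until one with an error production is on top.
    while (!stack.empty() && tables.afterError.count(stack.back()) == 0)
        stack.pop_back();
    if (stack.empty())
        return false;               // fall back to the default: terminate

    // Step 2: perform the transitions that need no terminal token.
    stack.push_back(tables.afterError.at(stack.back()));

    // Step 3: ignore tokens, starting with the unexpected one, until a
    // continuation token for the current state is seen.
    auto const it = tables.continuations.find(stack.back());
    if (it == tables.continuations.end())
        return false;
    for (; pos < tokens.size(); ++pos)
        if (it->second.count(tokens[pos]) != 0)
            return true;
    return false;                   // no acceptable token was ever found
}
```

When tt(recover()) returns false the caller would terminate parsing, which corresponds to the default recovery mode.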
Syntactic errors are not always reported: the option
link(--required-tokens)(REQUIRED) can be used to specify the minimum number of
tokens that must have been successfully processed before another syntactic
error will be reported (and counted).
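The effect of such a threshold can be illustrated with a small counter. The class below is a hypothetical sketch, not the generated parser's implementation: its name and members are invented for this example, and it assumes that the first error is always reported and that every error, reported or not, restarts the count of accepted tokens.

```cpp
#include <cstddef>

// Illustrative counter; names are assumptions, not the parser's own members.
class ErrorReporter
{
    std::size_t d_required;         // threshold, as given to --required-tokens
    std::size_t d_accepted;         // tokens accepted since the last error
    std::size_t d_reported = 0;     // errors reported (and counted) so far

public:
    // Starting with d_accepted == d_required lets the first error through
    // (an assumption of this sketch).
    explicit ErrorReporter(std::size_t required)
    : d_required(required), d_accepted(required)
    {}

    // Called whenever a token was processed successfully.
    void tokenAccepted() { ++d_accepted; }

    // Called for each syntactic error; returns true if it was reported.
    bool syntaxError()
    {
        bool const report = d_accepted >= d_required;
        if (report)
            ++d_reported;
        d_accepted = 0;             // restart counting accepted tokens
        return report;
    }

    std::size_t reported() const { return d_reported; }
};
```

With a threshold of 2, an error following only one accepted token is silently skipped, while an error following two or more accepted tokens is reported and counted.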
The option link(--error-verbose)(ERRORVERBOSE) may be specified to obtain the
contents of the state stack when a syntactic error is reported.
The example grammar may be provided with an tt(error) production rule:
verb(
%token NR
%left '+'
%%
start:
    start expr
|
    // empty
;
expr:
    error
|
    NR
|
    expr '+' expr
;
)
The resulting grammar has one additional state (handling the error
production) and one state in which the tt(ERR_ITEM) flag has been set. When
an error is encountered, this state will obtain tokens until a token having a
valid continuation is obtained, after which normal processing continues.
The following output from the tt(parse()) function, generated by b() using the
tt(--debug) option, illustrates error recovery for the above grammar when
entering the input
verb(
a
3+a
)
The program defining the parser and calling the parsing member was:
verbinclude(../algorithm/example/demo.cc)
For this example the following implementation of the tt(lex()) member
was used:
verb(
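int Parser::lex()
{
    std::string word;

    if (!(std::cin >> word))        // no more input: signal end-of-file
        return 0;

    // the cast avoids undefined behavior for negative char values
    if (isdigit(static_cast<unsigned char>(word[0])))
        return NR;                  // a word starting with a digit is a number

    return word[0];                 // any other word: its first character
}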
)
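To see which tokens such a word-based scanner produces for a given input, the same rules can be applied to an arbitrary stream. The sketch below is an illustration only: tt(wordToken()) and tt(tokenize()) are hypothetical helpers, and the value 257 for tt(NR) is an assumption (the generated parser class defines the actual value).

```cpp
#include <cctype>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Assumed value for the NR token; the generated parser defines the real one.
int const NR = 257;

// Applies the word-based rules to one whitespace-delimited word of `in`.
// Returns 0 when no further word can be read.
int wordToken(std::istream &in)
{
    std::string word;
    if (!(in >> word))
        return 0;
    if (std::isdigit(static_cast<unsigned char>(word[0])))
        return NR;
    return word[0];
}

// Collects all tokens produced for `text`, excluding the final 0.
std::vector<int> tokenize(std::string const &text)
{
    std::istringstream in(text);
    std::vector<int> tokens;
    for (int token; (token = wordToken(in)) != 0; )
        tokens.push_back(token);
    return tokens;
}
```

Because words are delimited by whitespace, an input such as tt(3+a) is a single word starting with a digit and yields just one tt(NR) token, whereas tt(3 + a) yields three tokens.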
subsubsect(Error recovery --debug output)
verb(
parse(): Parsing starts
push(state 0)
==
lookup(0, `_UNDETERMINED_'): default reduction by rule 2
executeAction(): of rule 2 ...
... action of rule 2 completed
pop(0) from stack having size 1
pop(): next state: 0, token: `start'
reduce(): by rule 2 to N-terminal `start'
==
lookup(0, `start'): shift 1 (`start' processed)
push(state 1)
==
a
Syntax error
nextToken(): using `a' (97)
lookup(1, `a' (97)): Not found. Start error recovery.
errorRecovery(): 1 error(s) so far. State = 1
errorRecovery(): state 1 is an ERROR state
lookup(1, `_error_'): shift 3 (`_error_' processed)
push(state 3)
lookup(3, `a' (97)): default reduction by rule 3
pop(1) from stack having size 3
pop(): next state: 1, token: `expr'
reduce(): by rule 3 to N-terminal `expr'
errorRecovery() REDUCE by rule 3, token = `expr'
lookup(1, `expr'): shift 2 (`expr' processed)
push(state 2)
errorRecovery() SHIFT state 2, continue with `a' (97)
lookup(2, `a' (97)): default reduction by rule 1
pop(2) from stack having size 3
pop(): next state: 0, token: `start'
reduce(): by rule 1 to N-terminal `start'
errorRecovery() REDUCE by rule 1, token = `start'
lookup(0, `start'): shift 1 (`start' processed)
push(state 1)
errorRecovery() SHIFT state 1, continue with `a' (97)
lookup(1, `a' (97)): Not found. Continue error recovery.
3+a
nextToken(): using `NR'
lookup(1, `NR'): shift 4 (`NR' processed)
push(state 4)
errorRecovery() SHIFT state 4, continue with `_UNDETERMINED_'
errorRecovery() COMPLETED: next state 4, no token yet
==
lookup(4, `_UNDETERMINED_'): default reduction by rule 4
executeAction(): of rule 4 ...
... action of rule 4 completed
pop(1) from stack having size 3
pop(): next state: 1, token: `expr'
reduce(): by rule 4 to N-terminal `expr'
==
lookup(1, `expr'): shift 2 (`expr' processed)
push(state 2)
==
[input terminated here]
nextToken(): using `_EOF_'
lookup(2, `_EOF_'): default reduction by rule 1
executeAction(): of rule 1 ...
... action of rule 1 completed
pop(2) from stack having size 3
pop(): next state: 0, token: `start'
reduce(): by rule 1 to N-terminal `start'
==
lookup(0, `start'): shift 1 (`start' processed)
push(state 1)
==
lookup(1, `_EOF_'): ACCEPT
ACCEPT(): Parsing successful
parse(): returns 0
)