B() implements a simple error recovery mechanism. When the tt(lookup())
function cannot find an action for the current token in the current state it
throws an tt(UNEXPECTED_TOKEN_) exception.
The parsing function catches this exception and calls the
tt(errorRecovery()) member function. By default, this member function
terminates the parsing process. The non-default recovery procedure becomes
available once an tt(error) token is used in a production rule. In that case
the recovery procedure is started whenever the parsing process throws
tt(UNEXPECTED_TOKEN_) (i.e., whenever a syntactical error is encountered or
tt(ERROR()) is called).
The recovery procedure consists of three steps:
itemization(
it() First, the state-stack is searched for the first state having an
error-production.
it() Next, all state transitions that are possible without retrieving a
terminal token are handled.
it() Finally, in the state requiring a terminal token, and starting with the
initial unexpected token, all subsequent terminal tokens are ignored until a
token is retrieved which is a continuation token in that state.
)
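The three steps above can be sketched in bf(C++) as follows. This is only an illustrative sketch, not the code b() generates: the names tt(StateStack), tt(Hooks) and tt(recover) are hypothetical, and the grammar-specific parts (which states have an error-production, which token-free transitions exist, which tokens are continuation tokens) are assumed to be supplied by the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// All names below are hypothetical; they are not part of the code
// generated by bisonc++.
using StateStack = std::vector<int>;

struct Hooks
{
    std::function<bool (int)> hasErrorProduction;   // state has an error-production?
    std::function<void (StateStack &)> settle;      // step 2: token-free transitions
    std::function<bool (int, int)> isContinuation;  // (state, token) acceptable?
};

// Returns the index of the token at which parsing may resume, or
// tokens.size() when recovery fails (the caller would then terminate
// the parsing process).
std::size_t recover(StateStack &stack, std::vector<int> const &tokens,
                    std::size_t unexpected, Hooks const &hooks)
{
    // Step 1: unwind to the first state having an error-production.
    while (!stack.empty() && !hooks.hasErrorProduction(stack.back()))
        stack.pop_back();
    if (stack.empty())
        return tokens.size();

    // Step 2: handle every transition that needs no terminal token.
    hooks.settle(stack);
    assert(!stack.empty());

    // Step 3: starting with the unexpected token, ignore tokens until
    // one is a continuation token in the current state.
    for (std::size_t idx = unexpected; idx != tokens.size(); ++idx)
    {
        if (hooks.isContinuation(stack.back(), tokens[idx]))
            return idx;
    }
    return tokens.size();
}
```

Returning tt(tokens.size()) models the fall-back to the default recovery mode: in this sketch the caller decides what terminating the parse means.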
If the error recovery procedure fails (i.e., if no acceptable token is
ever encountered) error recovery falls back to the default recovery
mode (i.e., the parsing process is terminated).
Syntactic errors are not always reported: the option
link(--required-tokens)(REQUIRED) can be used to specify the minimum number of
tokens that must have been successfully processed before another syntactic
error is reported (and counted).
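The effect of such a threshold can be sketched with a small counter class. The class tt(ErrorGate) and its members are hypothetical and do not mirror b()'s generated code. Two details are assumptions made for the sketch: the first error is always reported, and every error (reported or not) resets the count of accepted tokens.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical illustration of a `required tokens' threshold.
class ErrorGate
{
    std::size_t d_required;     // tokens required between reported errors
    std::size_t d_accepted;     // tokens accepted since the last error
    std::size_t d_nErrors = 0;  // number of reported (counted) errors

    public:
        // Assumption: start `satisfied' so that the first error is reported.
        explicit ErrorGate(std::size_t required)
        :
            d_required(required),
            d_accepted(required)
        {}

        void tokenAccepted()
        {
            ++d_accepted;
        }

        // Returns true if this error is reported and counted.
        bool error()
        {
            bool report = d_accepted >= d_required;
            d_accepted = 0;     // assumption: every error restarts the count
            if (report)
                ++d_nErrors;
            return report;
        }

        std::size_t nErrors() const
        {
            return d_nErrors;
        }
};
```

With a threshold of 2, an error immediately following a reported error stays silent, and errors are reported again only after two tokens have been accepted.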
The option link(--error-verbose)(ERRORVERBOSE) may be specified to obtain the
contents of the state stack when a syntactic error is reported.
The example grammar may be provided with an tt(error) production rule:
verb(
%token NR
%left '+'
%%
start:
start expr
|
// empty
;
expr:
error
|
NR
|
expr '+' expr
;
)
The resulting grammar has one additional state (handling the error
production) and one state in which the tt(ERR_ITEM) flag has been set. When
an error is encountered, this state obtains tokens until a token having a
valid continuation is received, after which normal processing continues.
In the parser's verbose output (using the tt(-V) option) the various grammar
rules and state transition tables are shown. The debug output below refers to
this information.
The rules are:
verb(
1: start -> start expr
2: start -> <empty>
3: expr (errTok_) -> errTok_
4: expr (NR) -> NR
5: expr ('+') -> expr '+' expr
6: start_$ -> start
)
The state-transitions are:
verb(
State 0:
6: start_$ -> . start
0: On start to state 1
Reduce by 2: start -> .
State 1:
6: start_$ -> start .
1: start -> start . expr
0: On expr to state 2
1: On errTok_ to state 3
2: On NR to state 4
State 2:
1: start -> start expr .
5: expr -> expr . '+' expr
0: On '+' to state 5
Reduce by 1: start -> start expr .
State 3:
3: expr -> errTok_ .
Reduce by 3: expr -> errTok_ .
State 4:
4: expr -> NR .
Reduce by 4: expr -> NR .
State 5:
5: expr -> expr '+' . expr
0: On expr to state 6
1: On errTok_ to state 3
2: On NR to state 4
State 6:
5: expr -> expr '+' expr .
5: expr -> expr . '+' expr
0 (removed by precedence): On '+' to state 5
Reduce by 5: expr -> expr '+' expr .
)
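To connect the state list above to the behavior of tt(lookup()), the sketch below encodes only the transitions on terminal tokens and throws when a state has no entry for a token. It is a simplified illustration: the names tt(UnexpectedToken), tt(shiftTable), tt(lookupShift) and tt(tryShift) are hypothetical, the value 257 for tt(NR) is an assumption, and default reductions (as in states 2, 3 and 4) are deliberately left out.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <utility>

// Hypothetical names; NR's numeric value is an assumption.
int const NR = 257;

struct UnexpectedToken: std::runtime_error
{
    UnexpectedToken()
    :
        std::runtime_error("unexpected token")
    {}
};

// Transitions on terminal tokens, copied from the state list above:
// (state, token) -> next state. Default reductions are not modeled.
std::map<std::pair<int, int>, int> const shiftTable
{
    {{1, NR},  4},
    {{2, '+'}, 5},
    {{5, NR},  4},
};

// Returns the next state, or throws if the state has no entry for the token.
int lookupShift(int state, int token)
{
    auto iter = shiftTable.find({state, token});
    if (iter == shiftTable.end())
        throw UnexpectedToken{};
    return iter->second;
}

// Returns true if the token was shifted; false if recovery would start.
bool tryShift(int &state, int token)
{
    try
    {
        state = lookupShift(state, token);
        return true;
    }
    catch (UnexpectedToken const &)
    {
        return false;   // a real parser would start error recovery here
    }
}
```

In state 1 the character tt('a') (97) has no entry, so tt(tryShift) returns tt(false) and leaves the state unchanged, whereas tt(NR) moves the parser to state 4.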
The output below from the tt(parse()) function, generated by b() using the
tt(--debug) option, illustrates error recovery for the above grammar. Entering
the input
verb(
a
3+a
)
results in:
verb(
parse(): Parsing starts
PUSH 0 (initializing the state stack)
LOOKUP: [0, 'Reserved_::UNDETERMINED_'] -> default reduce using rule 2
REDUCE: rule 2
execute action 2 ...
... completed
rule 2: pop 0 elements. Next will be: [0, 'start']
pop 0 elements from the stack having size 1
next: [0, 'start']
PUSH: [0, 'start'] -> 1
scanner token `a' (97)
ERROR: [1, `a' (97)] -> ??. Errors: 1
Syntax error
Reached ERROR state 1
PUSH: [1, 'errTok_'] -> 3
LOOKUP: [3, `a' (97)] -> default reduce using rule 3
REDUCE: rule 3
rule 3: pop 1 elements. Next will be: [1, 'expr']
pop 1 elements from the stack having size 3
next: [1, 'expr']
available token 'expr'
PUSH: [1, 'expr'] -> 2
available token `a' (97)
LOOKUP: [2, `a' (97)] -> default reduce using rule 1
REDUCE: rule 1
rule 1: pop 2 elements. Next will be: [0, 'start']
pop 2 elements from the stack having size 3
next: [0, 'start']
PUSH: [0, 'start'] -> 1
available token `a' (97)
scanner token 'NR'
PUSH: [1, 'NR'] -> 4
ERROR RECOVERED: next state 4
LOOKUP: [4, 'Reserved_::UNDETERMINED_'] -> default reduce using rule 4
REDUCE: rule 4
execute action 4 ...
... completed
rule 4: pop 1 elements. Next will be: [1, 'expr']
pop 1 elements from the stack having size 3
next: [1, 'expr']
available token 'expr'
PUSH: [1, 'expr'] -> 2
scanner token 'Reserved_::EOF_'
LOOKUP: [2, 'Reserved_::EOF_'] -> default reduce using rule 1
REDUCE: rule 1
execute action 1 ...
... completed
rule 1: pop 2 elements. Next will be: [0, 'start']
pop 2 elements from the stack having size 3
next: [0, 'start']
PUSH: [0, 'start'] -> 1
available token 'Reserved_::EOF_'
ACCEPT(): Parsing successful
parse(): returns 0 or 1
)
The final debug message should be interpreted as a bf(C++) expression: the
0 indicates that tt(Parser::ACCEPT) rather than tt(Parser::ABORT) was called,
the 1 shows the value of tt(d_nErrors_). Consequently, tt(parse()) returns 1
(i.e., tt(`0 or 1')).
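In bf(C++), tt(or) is the alternative spelling of the tt(||) operator, so tt(0 or 1) evaluates to tt(true), which converts to the integer 1. The function below is a minimal sketch of that evaluation. The name tt(parseResult) and its parameters are hypothetical and merely stand in for the two operands shown in the debug message.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the expression shown in the debug message:
// the left operand reflects how parsing ended, the right operand is the
// number of counted errors.
int parseResult(int retValue, std::size_t nErrors)
{
    return retValue or nErrors;     // `or' is the alternative token for ||
}
```

Under these assumptions tt(parseResult(0, 1)) yields 1, whereas an accepted parse without errors, tt(parseResult(0, 0)), yields 0.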