1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203
|
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html40/loose.dtd">
<HTML>
<!-- Created on January, 28 2005 by texi2html 1.66 -->
<!--
Written by: Lionel Cons <Lionel.Cons@cern.ch> (original author)
Karl Berry <karl@freefriends.org>
Olaf Bachmann <obachman@mathematik.uni-kl.de>
and many others.
Maintained by: Many creative people <dev@texi2html.cvshome.org>
Send bugs and suggestions to <users@texi2html.cvshome.org>
-->
<HEAD>
<TITLE>Bison 2.21.5: Error Recovery</TITLE>
<META NAME="description" CONTENT="Bison 2.21.5: Error Recovery">
<META NAME="keywords" CONTENT="Bison 2.21.5: Error Recovery">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">
<META NAME="Generator" CONTENT="texi2html 1.66">
</HEAD>
<BODY LANG="en" BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#800080" ALINK="#FF0000">
<A NAME="SEC80"></A>
<TABLE CELLPADDING=1 CELLSPACING=1 BORDER=0>
<TR><TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="bison_8.html#SEC79"> < </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="bison_10.html#SEC81"> > </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="bison_8.html#SEC67"> << </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="bison.html#SEC_Top"> Up </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="bison_10.html#SEC81"> >> </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> <TD VALIGN="MIDDLE" ALIGN="LEFT"> <TD VALIGN="MIDDLE" ALIGN="LEFT"> <TD VALIGN="MIDDLE" ALIGN="LEFT"> <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="bison.html#SEC_Top">Top</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="bison_toc.html#SEC_Contents">Contents</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="bison_15.html#SEC92">Index</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="bison_abt.html#SEC_About"> ? </A>]</TD>
</TR></TABLE>
<H1> 6. Error Recovery </H1>
<!--docid::SEC80::-->
<P>
It is not usually acceptable to have a program terminate on a parse
error. For example, a compiler should recover sufficiently to parse the
rest of the input file and check it for errors; a calculator should accept
another expression.
</P>
<P>
In a simple interactive command parser where each input is one line, it may
be sufficient to allow <CODE>yyparse</CODE> to return 1 on error and have the
caller ignore the rest of the input line when that happens (and then call
<CODE>yyparse</CODE> again). But this is inadequate for a compiler, because it
forgets all the syntactic context leading up to the error. A syntax error
deep within a function in the compiler input should not cause the compiler
to treat the following line like the beginning of a source file.
</P>
<P>
<A NAME="IDX41"></A>
You can define how to recover from a syntax error by writing rules to
recognize the special token <CODE>error</CODE>. This is a terminal symbol that
is always defined (you need not declare it) and reserved for error
handling. The Bison parser generates an <CODE>error</CODE> token whenever a
syntax error happens; if you have provided a rule to recognize this token
in the current context, the parse can continue.
</P>
<P>
For example:
</P>
<P>
<TABLE><tr><td> </td><td class=example><pre>stmnts: /* empty string */
| stmnts '\n'
| stmnts exp '\n'
| stmnts error '\n'
</pre></td></tr></table><P>
The fourth rule in this example says that an error followed by a newline
makes a valid addition to any <CODE>stmnts</CODE>.
</P>
<P>
What happens if a syntax error occurs in the middle of an <CODE>exp</CODE>? The
error recovery rule, interpreted strictly, applies to the precise sequence
of a <CODE>stmnts</CODE>, an <CODE>error</CODE> and a newline. If an error occurs in
the middle of an <CODE>exp</CODE>, there will probably be some additional tokens
and subexpressions on the stack after the last <CODE>stmnts</CODE>, and there
will be tokens to read before the next newline. So the rule is not
applicable in the ordinary way.
</P>
<P>
But Bison can force the situation to fit the rule, by discarding part of
the semantic context and part of the input. First it discards states and
objects from the stack until it gets back to a state in which the
<CODE>error</CODE> token is acceptable. (This means that the subexpressions
already parsed are discarded, back to the last complete <CODE>stmnts</CODE>.) At
this point the <CODE>error</CODE> token can be shifted. Then, if the old
look-ahead token is not acceptable to be shifted next, the parser reads
tokens and discards them until it finds a token which is acceptable. In
this example, Bison reads and discards input until the next newline
so that the fourth rule can apply.
</P>
<P>
The choice of error rules in the grammar is a choice of strategies for
error recovery. A simple and useful strategy is simply to skip the rest of
the current input line or current statement if an error is detected:
</P>
<P>
<TABLE><tr><td> </td><td class=example><pre>stmnt: error ';' /* on error, skip until ';' is read */
</pre></td></tr></table><P>
It is also useful to recover to the matching close-delimiter of an
opening-delimiter that has already been parsed. Otherwise the
close-delimiter will probably appear to be unmatched, and generate another,
spurious error message:
</P>
<P>
<TABLE><tr><td> </td><td class=example><pre>primary: '(' expr ')'
| '(' error ')'
<small>...</small>
;
</pre></td></tr></table><P>
Error recovery strategies are necessarily guesses. When they guess wrong,
one syntax error often leads to another. In the above example, the error
recovery rule guesses that an error is due to bad input within one
<CODE>stmnt</CODE>. Suppose that instead a spurious semicolon is inserted in the
middle of a valid <CODE>stmnt</CODE>. After the error recovery rule recovers
from the first error, another syntax error will be found straightaway,
since the text following the spurious semicolon is also an invalid
<CODE>stmnt</CODE>.
</P>
<P>
To prevent an outpouring of error messages, the parser will output no error
message for another syntax error that happens shortly after the first; only
after three consecutive input tokens have been successfully shifted will
error messages resume.
</P>
<P>
Note that rules which accept the <CODE>error</CODE> token may have actions, just
as any other rules can.
</P>
<P>
<A NAME="IDX42"></A>
You can make error messages resume immediately by using the macro
<CODE>yyerrok</CODE> in an action. If you do this in the error rule's action, no
error messages will be suppressed. This macro requires no arguments;
`<SAMP>yyerrok;</SAMP>' is a valid C statement.
</P>
<P>
<A NAME="IDX43"></A>
The previous look-ahead token is reanalyzed immediately after an error. If
this is unacceptable, then the macro <CODE>yyclearin</CODE> may be used to clear
this token. Write the statement `<SAMP>yyclearin;</SAMP>' in the error rule's
action.
</P>
<P>
For example, suppose that on a parse error, an error handling routine is
called that advances the input stream to some point where parsing should
once again commence. The next symbol returned by the lexical scanner is
probably correct. The previous look-ahead token ought to be discarded
with `<SAMP>yyclearin;</SAMP>'.
</P>
<P>
<A NAME="IDX44"></A>
The macro <CODE>YYRECOVERING</CODE> stands for an expression that has the
value 1 when the parser is recovering from a syntax error, and 0 the
rest of the time. A value of 1 indicates that error messages are
currently suppressed for new syntax errors.
</P>
<P>
<A NAME="Context Dependency"></A>
<HR SIZE="6">
<TABLE CELLPADDING=1 CELLSPACING=1 BORDER=0>
<TR><TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="bison_8.html#SEC67"> << </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="bison_10.html#SEC81"> >> </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> <TD VALIGN="MIDDLE" ALIGN="LEFT"> <TD VALIGN="MIDDLE" ALIGN="LEFT"> <TD VALIGN="MIDDLE" ALIGN="LEFT"> <TD VALIGN="MIDDLE" ALIGN="LEFT"> <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="bison.html#SEC_Top">Top</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="bison_toc.html#SEC_Contents">Contents</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="bison_15.html#SEC92">Index</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="bison_abt.html#SEC_About"> ? </A>]</TD>
</TR></TABLE>
<BR>
<FONT SIZE="-1">
This document was generated
by <I>Frank B. Brokken</I> on <I>January, 28 2005</I>
using <A HREF="http://texi2html.cvshome.org"><I>texi2html</I></A>
</FONT>
</BODY>
</HTML>
|