DELETEMACRO(file)
NOUSERMACRO( setDebug lex yylex matched YYText file exceptionHandler lookup
error)
includefile(../../release.yo)
htmlstyle(body)(color: #27408B; background: #FFFAF0)
whenhtml(mailto(Frank B. Brokken: f.b.brokken@rug.nl))
DEFINEMACRO(lsoption)(3)(\
bf(--ARG1)=tt(ARG3) (bf(-ARG2))\
)
DEFINEMACRO(laoption)(2)(\
bf(--ARG1)=tt(ARG2)\
)
DEFINEMACRO(loption)(1)(\
bf(--ARG1)\
)
DEFINEMACRO(soption)(1)(\
bf(-ARG1)\
)
DEFINEMACRO(itx)(0)()
DEFINEMACRO(itemlist)(1)(ARG1)
DEFINEMACRO(bic)(0)(bf(bisonc++))
DEFINEMACRO(b)(0)(bf(bisonc++))
DEFINEMACRO(Bic)(0)(bf(Bisonc++))
DEFINEMACRO(Cpp)(0)(bf(C++))
DEFINEMACRO(prot)(0)(tt((prot)))
DEFINEMACRO(itt)(1)(it() tt(ARG1))
DELETEMACRO(tt)
DEFINEMACRO(tt)(1)(em(ARG1))
DEFINEMACRO(manpagetext)(0)()
COMMENT( man-request, section, date, distribution file, general name)
manpage(bisonc++)(1)(_CurYrs_)(bisonc++._CurVers_)
(bisonc++ parser generator)
COMMENT( man-request, larger title )
manpagename(bisonc++)(Generate a C++ parser class and parsing function)
COMMENT( all other: add after () )
manpagesynopsis()
bf(bisonc++) [OPTIONS] tt(grammar-file)
manpagesection(DESCRIPTION)
Bic() derives from previous work on bf(bison) by Alain Coetmeur
(coetmeur@icdc.fr), who in the early '90s created a Cpp() class encapsulating
the tt(yyparse) function as generated by the GNU-bf(bison) parser generator.
Initial versions of bic() (up to version 0.92) wrapped Alain's program in
a program offering a more modern user-interface, removing all old-style
(bf(C)) tt(%define) directives from bf(bison++)'s input specification file
(see below for an in-depth discussion of the differences between bf(bison++)
and bic()). Starting with version 0.98, bic() is a complete rebuild of
the parser generator, closely following descriptions given in Aho, Sethi and
Ullman's em(Dragon Book). Since version 0.98 bic() is a Cpp() program, rather
than a bf(C) program generating bf(C++) code.
Bic() expands the concepts initially implemented in bf(bison) and
bf(bison++), offering a cleaner setup of the generated parser class. The
parser class is derived from a base-class, mainly containing the parser's
token- and type-definitions as well as several member functions which should
not be modified by the programmer.
Most of these base-class members could also have been defined directly in the
parser class, but were placed in the parser's base class instead. This design results
in a very lean parser class, declaring only members that are actually defined
by the programmer or that have to be defined by bic() itself (e.g., the
member function tt(parse) as well as some support functions requiring access
to facilities that are only available in the parser class itself, rather than
in the parser's base class).
This design does not require any virtual members: the members which are
not involved in the actual parsing process may always be (re)implemented
directly by the programmer. Thus there is no need to apply or define virtual
member functions.
Before version 5.00.00 bic() offered one single manual page. The advantage
of one man-page is of course that you never have to look for which manual page
contains which information. But on the other hand, bic()'s man-page grew into
a huge man-page of about 2000 lines in which it was hard to find your
way. From release 5.00.00 onward, bic() comes with three man-pages. The index
below relates manual pages to their specific contents.
bf(Overview of the contents of bisonc++'s man-pages)
This man-page concentrates on the tt(bisonc++) program itself, offering
the following sections:
itemization(
it() bf(DESCRIPTION): a short description of bic() and its roots;
it() bf(FROM BISONC++ < 6.00.00 TO BISONC++ >= 6.00.00): required
modifications when re-generating parsers;
it() bf(OPTIONS): options supported by bic();
it() bf(QUICK START): a quick start overview about how to use bic();
it() bf(GENERATED FILES): files generated by bic() and their purposes;
it() bf(FILES): (skeleton) files used by bic();
it() bf(SEE ALSO): references to other programs and documentation;
it() bf(BUGS): some additional stuff that should not qualify as bugs;
it() bf(ABOUT bisonc++): some history;
it() bf(AUTHOR): at the end of this man-page.
)
The bf(bisonc++input)(7) man-page covers the details of the
grammar-specification file. This man-page offers these sections:
itemization(
it() bf(DESCRIPTION): a short description of bic() and its grammar
file(s);
it() bf(DIRECTIVES): bic()'s grammar-specification directives;
it() bf(POLYMORPHIC SEMANTIC VALUES): how to use polymorphic semantic
values in parsers generated by bic();
it() bf(DOLLAR NOTATIONS): available $-shorthand notations with single,
union, and polymorphic semantic value types;
it() bf(RESTRICTIONS ON TOKEN NAMES): name restrictions for user-defined
symbols;
it() bf(OBSOLETE SYMBOLS): symbols available to bf(bison)(1), but not
to bic();
it() bf(USING SYMBOLIC TOKENS IN CLASSES OTHER THAN THE PARSER CLASS):
how to refer to tokens defined in the grammar;
it() bf(EXAMPLE): an example of using bic();
it() bf(SEE ALSO): references to other programs and documentation;
it() bf(AUTHOR): at the end of this man-page.
)
The bf(bisonc++api)(3) man-page describes the application programmer's
interface, containing these sections:
itemization(
it() bf(DESCRIPTION): a short description of bic() and its application
programmer's interface;
it() bf(PUBLIC SYMBOLS): constructor, enums, members, and types that can
be used by calling software;
it() bf(PRIVATE ENUMS AND -TYPES): enumerations and types only
available to the tt(Parser) class;
it() bf(PRIVATE MEMBER FUNCTIONS): member functions that are only
available to the tt(Parser) class;
it() bf(PRIVATE DATA MEMBERS): data members that are only available to
the tt(Parser) class;
it() bf(TYPES AND VARIABLES IN THE ANONYMOUS NAMESPACE): an overview of
the types and variables that are used to define and store the
grammar-tables generated by bic();
it() bf(SEE ALSO): references to other programs and documentation;
it() bf(AUTHOR): at the end of this man-page.
)
manpagesection(FROM BISONC++ < 6.00.00 TO BISONC++ >= 6.00.00)
This section is only relevant when re-generating parser code previously
generated by bic() versions before 6.00.00.
Bic() version 6.00.00 generates code that significantly differs from code
generated by earlier versions. The identifiers of all members (both data and
functions) that are generated by bic() and accessible to the generated
parser-class end in an underscore character. Member functions whose
identifiers end in an underscore are `owned' by bic(), are rewritten each
time bic() is run, and should not be modified. Some members are defined as
members of the generated parser-class, and are declared in the parser class
header file (e.g., tt(parser.h)) and some members are given default
implementations in the parser's internal header file (e.g.,
tt(parser.ih)). Once generated, these files are left alone by
bic(). Therefore, when using bic() version 6.00.00 or beyond to re-generate a
parser which was originally generated by an earlier bic() version, the
existing parser header and internal header files need some minor
modifications:
itemization(
itt(void error(char const *)) was changed to tt(void error()). A
default implementation is provided in the parser's internal header file. The
current implementation directly inserts the text tt(Syntax error) into the
standard output stream;
itt(void exceptionHandler_(std::exception const &exc)) was changed to
tt(void exceptionHandler(std::exception const &exc)). A
default implementation is provided in the parser's internal header file, and
only its trailing underscore character needs to be removed;
itt(int lookup(bool recovery)): remove this member declaration from the
previously generated parser class;
it() The following members were declared without a trailing underscore
character in previously generated parser classes. An underscore character
must be appended to their identifiers: tt(executeAction, errorRecovery, nextToken);
it() The member tt(void nextCycle_()) must be declared in the private
section of the generated parser class.
)
Previously, several data members of the parser's base class were directly
accessible to the parser class. Bic() version 6.00.00 restricts access to
those members. They can still be read, but can no longer be modified by the
parser class. This applies to the following members:
itemization(
itt(d_token_): use tt(int token_()) instead;
itt(d_state_): use tt(size_t state_()) instead.
)
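As a rough illustration of the modifications listed above, a migrated parser
class might be declared along the following lines. This is only a sketch:
tt(ParserBase) is a hand-written stand-in for the generated base class, the
parameter lists of tt(executeAction_), tt(errorRecovery_), tt(nextToken_) and
tt(nextCycle_) are assumptions, and tt(handled) and tt(runSketch) exist only to
make the sketch testable. All member bodies are stubs, not code produced by
bic().

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <stdexcept>

// Hypothetical stand-in for the generated base class. Its data members are
// private, so the parser class can read them only through the accessors.
class ParserBase
{
    int d_token_ = 0;
    std::size_t d_state_ = 0;

    protected:
        int token_() const              { return d_token_; }
        std::size_t state_() const      { return d_state_; }
};

class Parser: public ParserBase
{
    bool d_handled = false;             // illustration only

    public:
        int parse();
        bool handled() const            { return d_handled; }

    private:
        void error();                   // was: void error(char const *)
        void exceptionHandler(std::exception const &exc);  // underscore removed
                                        // int lookup(bool recovery): removed
        void executeAction_(int ruleNr);    // underscore added, signature assumed
        void errorRecovery_();              // underscore added, signature assumed
        void nextToken_();                  // underscore added, signature assumed
        void nextCycle_();                  // newly declared in the private section
};

// Stub bodies, only present to make the sketch compile and run.
void Parser::error()                    {}
void Parser::executeAction_(int)        {}
void Parser::errorRecovery_()           {}
void Parser::nextToken_()               {}
void Parser::nextCycle_()               { throw std::runtime_error("stub"); }

void Parser::exceptionHandler(std::exception const &)
{
    d_handled = true;
}

int Parser::parse()                     // stand-in for the generated member
{
    try
    {
        nextCycle_();
    }
    catch (std::exception const &exc)
    {
        exceptionHandler(exc);
    }
    // d_token_ = 1;    // would not compile: the parser class has read access only
    return token_() == 0 && state_() == 0 ? 0 : 1;
}

int runSketch()                         // exercises the stand-in parser once
{
    Parser parser;
    int ret = parser.parse();
    assert(parser.handled());
    return ret;
}
```

The commented-out assignment marks the point where code written for earlier
versions has to switch to the accessor functions.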
manpagesection(OPTIONS)
includefile(../manual/invoking/options.yo)
manpagesection(QUICK START)
Bic() may be used as follows:
itemization(
it() First, define a plain grammar: no actions, just the rules. Refer to
bic()'s manual and other sources (like Aho, Sethi and Ullman's book) for
details about how to define and decorate grammars.
it() Having defined the grammar and (usually) some directives (advice:
always use the tt(%token-path) directive) bic() is
run, generating the essential elements of a parser class. See the next section
for details about the files generated by bic().
it() No `macro style' tt(%define) declarations are required. Instead, the
normal practice of defining class members in source files and declaring them
in class header files can be followed when using bic(). Bic() concentrates on
its main tasks: defining a parser class and implementing the parsing function
tt(int parse), leaving all other parts of the parser class' definition to the
programmer.
it() Next, members required in addition to the bic()-generated member
tt(parse) and its support functions must be implemented by the programmer, and
declared in the parser's class header. At the very least a member tt(int lex)
must be defined (a default implementation can be generated by bic()).
it() The generated parser can now be used in a program. A very simple
example would be:
verb(
    #include "Parser.h"     // header of the generated parser class

    int main()
    {
        Parser parser;
        return parser.parse();
    }
)
)
manpagesection(GENERATED FILES)
Bic() may create the following files:
itemization(
it() A file containing the implementation of the member function tt(parse)
and its support functions. The member tt(parse) is a public member that can be
called to parse a token-sequence according to a specified LALR(1) grammar. By
default the implementations of these members are written to the
file tt(parse.cc). The programmer should not modify the contents of this file;
it is rewritten every time bic() is called.
it() A file containing an initial setup of the parser class, containing
the declaration of the public member tt(parse) and of its (private) support
members. New members may safely be declared in the parser class, as it is only
created by bic() if not yet existing, using the filename tt(<parser-class>.h)
(where tt(<parser-class>) is the name of the defined parser class).
it() A file containing the parser class' em(base class). This base
class should not be modified by the programmer. It contains types defined by
bic(), as well as several (protected) data members and member functions, which
should not be redefined by the programmer. All symbolic parser terminal tokens
are defined in this class, thereby escalating these definitions to a separate
class (cf. Lakos (2001)), which in turn prevents circular dependencies
between the lexical scanner and the parser (here, circular dependencies may
easily be encountered, as the parser needs access to the lexical scanner class
when defining the lexical scanner as one of its data members, whereas the
lexical scanner needs access to the parser class to know about the grammar's
symbolic terminal tokens; escalation is a way out of such circular
dependencies). By default this file is (re)written any time bic() is called,
using the filename tt(<parser-class>base.h).
it() A file containing an em(implementation header). The
implementation header rather than the parser's class header file should be
included by the parser's source files implementing member functions declared
by the programmer. The implementation header first includes the parser class's
header file, and then provides default in-line implementations for its members
tt(error) and tt(print) (which may be altered by the programmer). The member
tt(lex) may also receive a standard in-line implementation. Alternatively, its
implementation can be provided by the programmer (see below). Any directives
and/or namespace directives required for the proper compilation of the
parser's additional member functions should be declared next. The
implementation header is included by the file defining tt(parse). By default
the implementation header is created if not yet existing, receiving the
filename tt(<parser-class>.ih).
it() A verbose description of the generated parser. This file is
comparable to the verbose output file originally generated by bf(bison++). It
is generated when the option tt(--verbose) or tt(-V) is provided. If so, bic()
writes the file tt(<grammar>.output), where tt(<grammar>) is the name of the
file containing the grammar definition.
)
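The escalation of the token definitions to the base class can be pictured with
the following hand-written sketch. It is not output of bic(): the enum name
tt(Tokens_), the token names, the tt(Scanner) class and the helpers
tt(countNumbers) and tt(numbersIn) are assumptions made for the illustration.
The point is merely that the scanner depends on the base class only, while the
parser class depends on both the base class and the scanner, so no circular
dependency arises.

```cpp
#include <cctype>
#include <istream>
#include <sstream>
#include <string>

// Stand-in for the generated base class: it only defines the tokens, so it
// depends on neither the scanner nor the parser class.
struct ParserBase
{
    enum Tokens_
    {
        NUMBER = 257,
        IDENTIFIER
    };
};

// The scanner needs the token values, but only refers to ParserBase.
class Scanner
{
    std::istream &d_in;
    std::string d_matched;

    public:
        explicit Scanner(std::istream &in): d_in(in) {}
        int lex();
        std::string const &matched() const { return d_matched; }
};

int Scanner::lex()
{
    d_matched.clear();

    char ch;
    while (d_in.get(ch) && std::isspace(static_cast<unsigned char>(ch)))
        ;                               // skip white space
    if (!d_in)
        return 0;                       // end of input

    unsigned char first = static_cast<unsigned char>(ch);
    d_matched = ch;
    bool digit = std::isdigit(first) != 0;
    if (!digit && !std::isalpha(first))
        return first;                   // any other character is its own token

    // extend the match while the next character fits the token type
    while (digit ? std::isdigit(d_in.peek()) : std::isalnum(d_in.peek()))
        d_matched += static_cast<char>(d_in.get());

    return digit ? ParserBase::NUMBER : ParserBase::IDENTIFIER;
}

// The parser class has the scanner as a data member and forwards lex to it.
class Parser: public ParserBase
{
    Scanner d_scanner;

    public:
        explicit Parser(std::istream &in): d_scanner(in) {}
        int countNumbers();             // stand-in for the generated parse member

    private:
        int lex()                       { return d_scanner.lex(); }
};

int Parser::countNumbers()
{
    int count = 0;
    for (int token = lex(); token != 0; token = lex())
        count += token == NUMBER;
    return count;
}

int numbersIn(std::string const &text)  // convenience wrapper for the sketch
{
    std::istringstream in(text);
    Parser parser(in);
    return parser.countNumbers();
}
```

In a real project the three classes would live in separate header files, with
the scanner's header including only the base class header.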
manpagesection(FILES)
itemization(
it() bf(bisonc++base.h): skeleton of the parser's base class;
it() bf(bisonc++.h): skeleton of the parser class;
it() bf(bisonc++.ih): skeleton of the implementation header;
it() bf(bisonc++.cc): skeleton of the member tt(parse);
it() bf(bisonc++polymorphic): skeleton of the declarations used by
tt(%polymorphic);
it() bf(bisonc++polymorphic.code): skeleton of the non-inline
implementations of the members declared in bf(bisonc++polymorphic).
it() bf(debugdecl.in): skeleton declaring members of the parser's base
class that are only required when the tt(debug) option or directive
was specified.
it() bf(debugfunctions1.in): skeleton defining the members declared in
tt(debugdecl.in).
it() bf(debugfunctions2.in): skeleton implementing tt(symbol_), which is
only needed when the tt(print-tokens) option or directive was
specified.
it() bf(debugfunctions3.in): skeleton implementing tt(errorVerbose_),
which is only needed when the tt(error-verbose) option or directive was
specified.
it() bf(debugincludes.in): skeleton specifying the tt(#include) directives
for the header files that are required when the tt(debug) option
or directive was specified.
it() bf(debuglookup.in): skeleton containing extra code required in the
tt(Parser::lookup) member when the tt(debug) option or directive was
specified.
it() bf(lex.in): skeleton implementing the tt(Parser::lex) function.
it() bf(ltypedata.in): skeleton declaring the location variables.
it() bf(ltype.in): skeleton defining the default or user defined
tt(LTYPE_).
it() bf(print.in): skeleton implementing the actions of tt(Parser::print)
if the tt(print-tokens) option or directive was specified.
)
manpagesection(SEE ALSO)
DEFINESYMBOL(manalso)(bf(bisonc++api)(3), bf(bisonc++input)(7))
includefile(seealso.yo)
manpagesection(BUGS)
Parser-class header files (e.g., Parser.h) and parser-class internal
header files (e.g., Parser.ih) generated with bisonc++ < 6.00.00 require
several minor hand-modifications when re-generating the parser with bic()
>= 6.00.00. See the earlier section bf(FROM BISONC++ < 6.00.00 TO BISONC++
>= 6.00.00) for details.
To avoid collisions with names defined by the parser's (base) class, the
following identifiers should not be used as token names:
itemization(
it() Identifiers ending in an underscore;
it() Any of the following identifiers: tt(ABORT, ACCEPT, ERROR,
debug, error), or tt(setDebug).
)
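The two restrictions can be checked mechanically. The function below is a
minimal sketch of such a check; its name, tt(isReservedTokenName), is made up
for this example and is not part of bic() or of the generated code.

```cpp
#include <set>
#include <string>

// Returns true if the name falls under one of the restrictions listed above:
// it ends in an underscore or equals one of the listed identifiers.
bool isReservedTokenName(std::string const &name)
{
    static std::set<std::string> const reserved
    {
        "ABORT", "ACCEPT", "ERROR", "debug", "error", "setDebug"
    };

    return (!name.empty() && name.back() == '_') || reserved.count(name) != 0;
}
```

The comparison is case sensitive, so a token named tt(Error) passes the check
while tt(ERROR) does not.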
manpagesection(ABOUT bisonc++)
bf(Bisonc++) was based on bf(bison++), originally developed by Alain
Coetmeur (coetmeur@icdc.fr), R&D department (RDT), Informatique-CDC, France,
who based his work on bf(bison), GNU version 1.21.
Bic() version 0.98 and beyond is a complete rewrite of an LALR(1) parser
generator, closely following the construction process as described in Aho,
Sethi and Ullman's (1986) book bf(Compilers) (i.e., the em(Dragon book)). It
uses the same grammar specification as bf(bison) and bf(bison++), and it uses
practically the same options and directives as bic() versions earlier than
0.98. Variables, declarations and macros that are obsolete were removed.
Compared to tt(bison) and tt(bison++), the number and functions of the
various tt(%define) declarations were thoroughly modified. All of
tt(bison's %define) declarations were replaced by their (former) first
arguments. Furthermore, `macro-style' declarations are neither supported nor
required. Finally, all directives only use lower-case characters and do not
contain underscore characters (but sometimes hyphens). E.g., tt(%define DEBUG)
is now declared as tt(%debug); tt(%define LSP_NEEDED) is now declared as
tt(%lsp-needed) (note the hyphen).
manpageauthor()
Frank B. Brokken (f.b.brokken@rug.nl).