1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45
|
In tt(mfcalc), the parser's member function tt(lex()) must now recognize
variables, function names, numeric values, and the single-character arithmetic
operators. Strings of alphanumeric characters not starting with a digit are
recognized as either variables or functions depending on the table in which
they are found. By arranging tt(lex)'s logic such that the function table is
searched first it is simple to ensure that no variable can ever have the name
of a predefined function. The implementation used here, in which two
different tables are used for the arithmetic functions and the variable
symbols is appealing because it's simple to implement. However, it also has
the drawback of being difficult to scale to more generic calculators, using,
e.g., different data types and different types of functions. In such
situations a single symbol table is more preferable, where the keys are the
identifiers (variables, function names, predefined constants, etc.) while the
values are objects describing their characteristics. A re-implementation of
tt(mfcalc) using an integrated symbol table is suggested in one of the
exercises of the next section ref(EXERCISES).
The parser's tt(lex) member has these characteristics:
itemization(
it() All leading blanks and tabs are skipped
it() If no (other) character could be obtained 0 is returned, indicating
End-Of-File.
it() If the first non-blank character is a dot or number, a number is
extracted from the standard input. Since the semantic value data
member of tt(mfcalc)'s parser (tt(d_val)) is itself also a tt(union),
the numerical value can be extracted into tt(d_val_.u_val), and a
tt(NUM) token can be returned.
it() If the first non-blank character is not a letter, then a
single-character token was received and the character's value is
returned as the next token.
it() Otherwise the read character is a letter. This character and all
subsequent alpha-numeric characters are extracted to construct the
name of an identifier. Then this identifier is searched for in the
tt(s_functions) map. If found, tt(d_val_.u_fun) is given the function's
address, found as the value of the tt(s_functions) map element
corresponding to the read identifier, and token tt(FNCT) is returned.
If the symbol is not found in tt(s_functions) the address of the
value ofn tt(d_symbols) associated with the received identifier is
assigned to tt(d_val_.u_symbol) and token tt(VAR) is returned. Note
that this automatically defines newly used variables, since
tt(d_symbols[name]) automatically inserts a new element in a map if
tt(d_symbol[name]) wasn't already there.
)
Here is the parser's tt(lex) member function:
verbinclude(mfcalc/parser/lex.cc)
|