File: Parser generation course.txt

This file contains a concise overview of how the WB parser is generated.

1) Prerequisites: get lex.h and sql_yacc_5_5_5_m3.yy (or whatever is the current server version) from the server source repository. Store lex.h in sql-parser/include and the yacc file in sql-parser/yy_purify-tool.
2) Use yy_purify-tool to strip the server-specific semantics from this grammar file, which creates a purified (prf) file. On Windows use the yy_purify_5_5_5_m3.bat file to do this job.
3) Manually adjust the raw prf file to create a tailored prf. Before removing the existing prf files, compare them to see what you have to change.
4) Copy the tailored prf to the yy_gen-tool folder.
5) Use the gen tool to create a new grammar file, which then also contains the semantics for WB. On Windows you can use yy_gen_myx_sql_parser_rules_5_5_5_m3.bat to do this job.
6) Another customization step is now needed too. Compare the existing raw.yy and tailored.yy files for the changes to make. It might also be necessary to repeat steps 6-8 several times, adjusting the grammar a little each time, until the following generation step runs through successfully.
7) Copy yy_gen-tool/sql_yacc_5_5_5_m3_tailored.yy to source/myx_sql_parser.yy.
7a) Modify source/myx_sql_parser.yy as follows:
-    | WITH_CUBE_SYM
+    | WITH CUBE_SYM
       {
         if (mysql_parser::SqlAstStatics::is_ast_generation_enabled)
         {
           $$= mysql_parser::set_ast_node_name($1, sql::_olap_opt);
         }
       }
-    | WITH_ROLLUP_SYM
+    | WITH ROLLUP_SYM

8) Use Bison to generate the parser from source/myx_sql_parser.yy. To simplify this process, batch files for Linux and Windows exist (see gen_grammar[.bat]).
8a) Find and replace
     | DOUBLE_SYM PRECISION
with
    | DOUBLE_SYM precision

%expect 156
with
%expect 157
For some reason the .prf file contains PRECISION for double in upper case, which breaks the whole thing. This should be resolved along with a major parser update.
9) Finally, modify include/lex.h (which you copied from the server sources) and add

namespace mysql_parser

around all the code, right after

#include "lex_symbol.h"

and add

{ "EDIT",             SYM(EDIT_SYM)},

after the "EACH" entry in the symbol table.

10) The new server is now ready to be compiled in WB.
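
The lex.h changes from step 9 could be scripted roughly as follows. This is only a sketch: the function name patch_lex_h is made up for illustration, and it assumes that the include line appears exactly once on its own line, that the "EACH" entry sits on a single line, and that "around all the code" means an opening brace right after the include and a closing brace at the end of the file. Check the result by hand against the real lex.h.

```python
# Sketch only: the layout of lex.h is assumed here, not taken from the server sources.
INCLUDE_LINE = '#include "lex_symbol.h"'
EDIT_ENTRY = '{ "EDIT",             SYM(EDIT_SYM)},'


def patch_lex_h(text):
    """Return the lex.h text wrapped in namespace mysql_parser, with the EDIT symbol added."""
    out = []
    namespace_opened = False
    edit_added = False
    for line in text.splitlines():
        out.append(line)
        if not namespace_opened and line.strip() == INCLUDE_LINE:
            # Open the namespace right after the include (assumed placement).
            out.append("namespace mysql_parser {")
            namespace_opened = True
        elif not edit_added and '"EACH"' in line:
            # Reuse the indentation of the "EACH" entry for the new entry.
            indent = line[: len(line) - len(line.lstrip())]
            out.append(indent + EDIT_ENTRY)
            edit_added = True
    if not namespace_opened:
        raise ValueError("include line not found; lex.h layout differs from the assumption")
    if not edit_added:
        raise ValueError('"EACH" entry not found; lex.h layout differs from the assumption')
    # Close the namespace at the end of the file (assumed placement).
    out.append("} // namespace mysql_parser")
    return "\n".join(out) + "\n"
```

To use it, read include/lex.h into a string, pass it to patch_lex_h and write the result back. The function is not idempotent, so run it only on a fresh copy from the server sources.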

By adjusting the batch files it is probably possible to avoid the two copy operations.
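
The manual text substitutions from steps 7a and 8a could likewise be applied by a small script instead of by hand. The sketch below is an assumption about how one might do it, not part of the existing tooling: apply_manual_fixes is a hypothetical name, it works on the grammar text as a string so that the caller decides which file to apply it to, and it only touches rule alternatives (text following a "|"), leaving token declarations alone.

```python
import re


def apply_manual_fixes(grammar):
    """Apply the substitutions described in steps 7a and 8a to grammar text."""
    # 7a: split the combined tokens into two tokens, but only in rule
    # alternatives, so a declaration such as "%token WITH_CUBE_SYM" stays as-is.
    grammar = re.sub(r"(\|\s*)WITH_CUBE_SYM\b", r"\1WITH CUBE_SYM", grammar)
    grammar = re.sub(r"(\|\s*)WITH_ROLLUP_SYM\b", r"\1WITH ROLLUP_SYM", grammar)
    # 8a: use the lower-case "precision" after DOUBLE_SYM.
    grammar = re.sub(r"(\|\s*DOUBLE_SYM\s+)PRECISION\b", r"\1precision", grammar)
    # 8a: raise the expected conflict count from 156 to 157.
    grammar = re.sub(r"^%expect 156\b", "%expect 157", grammar, flags=re.MULTILINE)
    return grammar
```

The numbers 156 and 157 are the ones given in step 8a for this grammar version; they will most likely differ for other server versions.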

=======================================================================================================================

Some notes regarding the server generation (from Sergei):

The server grammar makes heavy use of mid-rule actions. In some cases they implicitly resolve reduce/reduce problems. This time, for the first time, I had to add a dummy mid-rule action to get rid of a reduce/reduce conflict. Look for the part_value_item: rule in \library\sql-parser\yy_gen-tool\sql_yacc_5_5_5_m3_tailored.yy.
To avoid the headache of resolving this kind of conflict, it makes sense to make the tools preserve placeholders for mid-rule actions.
Because of changes in the grammar file, the SQL parser code can fail to parse some grammatical constructions and fail the tut-tests. So it is better to run at least the "mysql_sql_parser" and "mysql_sql_statement_decomposer" tut-tests to be on the safe side.
The sql -> grt structs code relies on the AST structure, which can change after an update.