.. include:: links.rst


Grammar Directives
------------------

|TatSu| allows *directives* in the grammar that control the behavior of the generated parsers. All directives are of the form ``@@name :: <value>``. For example:

.. code::

    @@ignorecase :: True


The *directives* supported by |TatSu| are described below.


``@@grammar :: <word>``
~~~~~~~~~~~~~~~~~~~~~~~

Specifies the name of the grammar and provides the base name for the classes in the generated parser source code.


``@@comments :: <regexp>``
~~~~~~~~~~~~~~~~~~~~~~~~~~

Specifies a regular expression to identify and exclude inline (bracketed) comments before the text is scanned by the parser. For ``(* ... *)`` comments:

.. code::

    @@comments :: /\(\*((?:.|\n)*?)\*\)/

.. note::
    In previous versions of |TatSu|, the `re.MULTILINE <https://docs.python.org/3/library/re.html#re.MULTILINE>`_
    option was enabled by default. This is no longer the case. Use ``(?m)`` at the start of your
    regular expressions to make them multi-line.
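
As a rough illustration using Python's standard ``re`` module (not |TatSu| itself), the pattern above can be checked against a comment that spans two lines. The ``strip_comments`` helper is hypothetical and only mimics the effect of excluding comments:

.. code:: python

    import re

    # The same pattern given to @@comments above, written as a Python raw string.
    COMMENTS = re.compile(r'\(\*((?:.|\n)*?)\*\)')


    def strip_comments(text: str) -> str:
        """Hypothetical helper: drop every ``(* ... *)`` comment from ``text``."""
        return COMMENTS.sub('', text)


    source = 'a (* one\ntwo *) b (* three *) c'
    comment_bodies = COMMENTS.findall(source)   # text captured inside each comment
    stripped = strip_comments(source)           # input with the comments removed

Because ``(?:.|\n)`` matches newlines explicitly, this particular pattern spans lines without ``(?m)``; that flag only changes where ``^`` and ``$`` match.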

``@@eol_comments :: <regexp>``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Specifies a regular expression to identify and exclude end-of-line comments before the text is scanned by the parser. For ``# ...`` comments:

.. code::

    @@eol_comments :: /#([^\n]*?)$/

.. note::
    In previous versions of |TatSu|, the `re.MULTILINE <https://docs.python.org/3/library/re.html#re.MULTILINE>`_
    option was enabled by default. This is no longer the case. Use ``(?m)`` at the start of your
    regular expressions to make them multi-line.
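
The effect of ``(?m)`` on a pattern like the one above can be sketched with Python's standard ``re`` module. This only demonstrates the regular-expression semantics; how |TatSu| applies the pattern internally is not shown here:

.. code:: python

    import re

    text = 'a = 1  # first\nb = 2  # second\n'

    # Without (?m), ``$`` only matches at the end of the whole input
    # (or just before a final newline), so only the last comment is found.
    single_line = re.findall(r'#([^\n]*?)$', text)

    # With (?m), ``$`` also matches just before every newline.
    multi_line = re.findall(r'(?m)#([^\n]*?)$', text)

Under these semantics, a pattern that relies on ``$`` to end each comment needs the ``(?m)`` prefix to recognize comments on every line.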


``@@ignorecase :: <bool>``
~~~~~~~~~~~~~~~~~~~~~~~~~~

When set to ``True``, |TatSu| ignores case when parsing tokens. Defaults to ``False``:


.. code::

    @@ignorecase :: True


``@@keyword :: {<word>|<string>}+``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Specifies the list of strings or words that the grammar should consider as *"keywords"*.
May appear more than once. See the `Reserved Words and Keywords`_ section for an explanation.

.. _`Reserved Words and Keywords`: syntax.html#reserved-words-and-keywords


``@@left_recursion :: <bool>``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Enables left-recursive rules in the grammar. See the `Left Recursion`_ section for an explanation.

.. _`Left Recursion`: left_recursion.html


``@@namechars :: <string>``
~~~~~~~~~~~~~~~~~~~~~~~~~~~

A list of (non-alphanumeric) characters that should be considered part of names when using the `@@nameguard`_ feature:

.. code::

    @@namechars :: '-_$'

.. _`@@nameguard`: #nameguard-bool


``@@nameguard :: <bool>``
~~~~~~~~~~~~~~~~~~~~~~~~~

When set to ``True``, avoids matching tokens when the next character in the input sequence is alphanumeric or one of the ``@@namechars``. Defaults to ``True``. See the `'text' expression`_ for an explanation.

.. code::

    @@nameguard :: False

.. _`'text' expression`: syntax.html?highlight=nameguard#text-or-text
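
The rule stated above can be sketched in plain Python. This is a conceptual illustration only, not |TatSu|'s actual implementation (which may apply further conditions); ``match_token`` and its parameters are hypothetical names:

.. code:: python

    def match_token(text: str, pos: int, token: str,
                    nameguard: bool = True, namechars: str = '') -> bool:
        """Hypothetical sketch: does ``token`` match ``text`` at ``pos``?"""
        if not text.startswith(token, pos):
            return False
        end = pos + len(token)
        if nameguard and end < len(text):
            following = text[end]
            # Reject the match when the token is followed by a name character.
            if following.isalnum() or following in namechars:
                return False
        return True


    guarded = match_token('iffy', 0, 'if')                       # rejected: 'f' follows
    unguarded = match_token('iffy', 0, 'if', nameguard=False)    # accepted
    dashed = match_token('if-else', 0, 'if', namechars='-')      # rejected: '-' is a namechar

In this sketch, adding ``-`` to ``namechars`` stops ``if`` from matching at the start of ``if-else``, which is the kind of interaction between ``@@namechars`` and ``@@nameguard`` described above.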


``@@parseinfo :: <bool>``
~~~~~~~~~~~~~~~~~~~~~~~~~

When ``True``, the parser will add parse information to every ``AST`` and ``Node`` generated by the parse under a ``parseinfo`` field. The information will include:

* ``rule``: the name of the rule that parsed the node
* ``pos``: the initial position of the node in the input
* ``endpos``: the final position of the node in the input
* ``line``: the initial input line number of the element
* ``endline``: the final line number of the element

Enabling ``@@parseinfo`` allows semantic actions to report precisely on locations in the input source code.
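
How the listed fields relate to one another can be sketched without |TatSu|. The ``Info`` tuple, ``make_info``, and ``describe`` below are hypothetical stand-ins, and the zero-based line numbering is an assumption made only for this illustration:

.. code:: python

    from typing import NamedTuple


    class Info(NamedTuple):
        """Hypothetical stand-in for the fields listed above."""
        rule: str
        pos: int
        endpos: int
        line: int
        endline: int


    def make_info(text: str, rule: str, pos: int, endpos: int) -> Info:
        # Count the newlines before each offset; lines are zero-based here (an assumption).
        return Info(rule, pos, endpos,
                    text.count('\n', 0, pos), text.count('\n', 0, endpos))


    def describe(info: Info) -> str:
        """Format a location message such as a semantic action might report."""
        return (f'{info.rule}: lines {info.line}-{info.endline}, '
                f'offsets {info.pos}-{info.endpos}')


    text = 'x = 1\ny = (2 +\n     3)\n'
    info = make_info(text, 'expression', text.index('('), text.index(')') + 1)
    message = describe(info)

A semantic action with access to such fields can point an error message at the exact span of input that produced a node.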


``@@whitespace :: <regexp>``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Provides a regular expression for the whitespace to be ignored by the parser. If no definition is
provided, then ``r'(?m)\s+'`` will be used as default:

.. code::

    @@whitespace :: /[\t ]+/

To disable any parsing of whitespace, use ``None`` for the definition:

.. code::

    @@whitespace :: None

.. note::
    In previous versions of |TatSu|, the `re.MULTILINE <https://docs.python.org/3/library/re.html#re.MULTILINE>`_
    option was enabled by default. This is no longer the case. Use ``(?m)`` at the start of your
    regular expressions to make them multi-line.
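
The difference between the default pattern and the ``/[\t ]+/`` example can be sketched with Python's standard ``re`` module. The ``skip`` helper is hypothetical and only imitates skipping whitespace at a given offset:

.. code:: python

    import re

    text = 'a \t\n  b'

    DEFAULT = re.compile(r'(?m)\s+')   # the default stated above
    BLANKS = re.compile(r'[\t ]+')     # the custom value from the example


    def skip(pattern: re.Pattern, text: str, pos: int) -> int:
        """Hypothetical helper: return the offset after any whitespace at ``pos``."""
        m = pattern.match(text, pos)
        return m.end() if m else pos


    after_default = skip(DEFAULT, text, 1)   # skips spaces, the tab, and the newline
    after_blanks = skip(BLANKS, text, 1)     # stops at the newline

With ``[\t ]+`` the newline is not treated as whitespace, so a grammar that uses this setting would have to match line ends explicitly.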