.. include:: links.rst

Abstract Syntax Trees (ASTs)
----------------------------

By default, an `AST`_ is either:

* a *value*, for simple elements such as *token*, *pattern*, or *constant*
* a ``tuple``, for *closures*, *gatherings*, and the right-hand side of rules with more than one element but without named elements
* a ``dict``-derived object (``AST``) that contains one item for every named element in the grammar rule; its items can be accessed through the standard ``dict`` syntax (``ast['key']``) or as attributes (``ast.key``)
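
The three shapes listed above can be illustrated without TatSu itself. The
sketch below uses a hypothetical stand-in class, ``AttrDict``, which is not
TatSu's ``AST`` implementation. It only mimics the dual access described in
the last bullet, under the assumption that attribute lookup falls back to
the ``dict`` keys. The sample values are made up for illustration.

```python
class AttrDict(dict):
    """Hypothetical stand-in for a dict-derived AST node (not TatSu's class)."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so dict methods
        # such as .keys() and .items() keep working as usual.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None


# A value: the result of a single token, pattern, or constant.
value_node = "42"

# A tuple: for example a closure, or a rule body with several unnamed elements.
tuple_node = ("1", "+", "2")

# A dict-derived node: one item per named element of the rule.
named_node = AttrDict(left="1", op="+", right="2")

print(named_node["op"])  # standard dict syntax
print(named_node.op)     # attribute syntax, same item
```

Both access styles return the same item, so the choice between them is a
matter of readability in the code that walks the tree.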

`AST`_ entries are single values if only one item was associated with a
name, or a ``tuple`` if more than one item was matched. There's a provision in
the grammar syntax (the ``+:`` operator) to force an `AST`_ entry to be
a ``tuple`` even if only one element was matched. The value for named
elements that were not found during the parse (perhaps because they are
optional) is ``None``.
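
The accumulation rules in the preceding paragraph can be sketched as a small
function. ``add_entry`` and its ``force_tuple`` parameter are hypothetical
names used only for this illustration, and the code is not how TatSu builds
its nodes. ``force_tuple=True`` plays the role of the ``+:`` operator, and
every named element is assumed to start out as ``None``.

```python
def add_entry(node, name, value, force_tuple=False):
    """Hypothetical helper: store `value` under `name` using the rules above.

    Simplification: a real implementation has to track forced entries
    separately, because a matched value can itself be a tuple.
    """
    current = node.get(name)
    if current is None:
        # First match: a bare value, or a 1-tuple when forced (like `+:`).
        node[name] = (value,) if force_tuple else value
    elif isinstance(current, tuple):
        # Already a tuple: append the new match.
        node[name] = current + (value,)
    else:
        # Second match: promote the single value to a tuple.
        node[name] = (current, value)
    return node


# Every named element starts as None, so unmatched names stay None.
node = {"name": None, "args": None, "tags": None, "doc": None}

add_entry(node, "name", "f")                     # matched once
add_entry(node, "args", "x")                     # matched twice
add_entry(node, "args", "y")
add_entry(node, "tags", "t", force_tuple=True)   # matched once, forced

print(node)
```

Here ``name`` stays a single value, ``args`` becomes a two-element tuple,
``tags`` is a one-element tuple because it was forced, and ``doc`` remains
``None`` because nothing was matched for it.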

When ``parseinfo=True`` is passed as a keyword argument to the
``Parser`` constructor, or when parse information is enabled with the
``@@parseinfo`` directive, a ``parseinfo`` item is added to every `AST`_ node
that is *dict*-like. The item holds a ``NamedTuple``
with the parse information for the node:

.. code:: python

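    from typing import Any, NamedTuple

    # Alert is TatSu's own alert type; its definition is omitted from this excerpt.

    class ParseInfo(NamedTuple):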
        tokenizer: Any
        rule: str
        pos: int
        endpos: int
        line: int
        endline: int
        alerts: list[Alert] = []  # noqa: RUF012

        def text_lines(self):
            return self.tokenizer.get_lines(self.line, self.endline)

        def line_index(self):
            return self.tokenizer.line_index(self.line, self.endline)

        @property
        def buffer(self):
            return self.tokenizer

With the help of the ``Tokenizer.line_info()`` method, it is possible to
recover the line, column, and original text parsed for the node. Note
that each ``ParseInfo`` holds a reference to the ``Tokenizer`` used during
parsing, so the tokenizer is kept in memory for the lifetime of the `AST`_.
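
The idea behind recovering a line and column from the stored offsets can be
sketched with the standard library alone. The helpers ``line_starts`` and
``locate`` are hypothetical and are not part of TatSu; they do not reproduce
``Tokenizer.line_info()``. Zero-based line and column numbers are an
assumption of this sketch, and the offsets stand in for the ``pos`` and
``endpos`` fields of a ``ParseInfo``.

```python
from bisect import bisect_right


def line_starts(text):
    """Return the character offset at which each line of `text` begins."""
    starts = [0]
    for i, ch in enumerate(text):
        if ch == "\n":
            starts.append(i + 1)
    return starts


def locate(text, pos):
    """Return (line, col), both zero-based, for character offset `pos`."""
    starts = line_starts(text)
    # The line is the last one whose start offset is <= pos.
    line = bisect_right(starts, pos) - 1
    return line, pos - starts[line]


source = "let x = 1\nlet y = x + 2\n"

# Made-up offsets, standing in for ParseInfo.pos and ParseInfo.endpos.
pos, endpos = 10, 23

print(locate(source, pos))   # line and column where the node starts
print(source[pos:endpos])    # original text covered by the node
```

Slicing the source text needs the whole input to remain available, which is
the same reason the tokenizer stays alive alongside the tree.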

Generation of ``parseinfo`` can also be controlled using the
``@@parseinfo :: True`` grammar directive.