.. include:: links.rst


Models
------


Building Models
~~~~~~~~~~~~~~~

Naming elements in grammar rules makes the parser discard uninteresting
parts of the input, like punctuation, to produce an *Abstract Syntax
Tree* (`AST`_) that reflects the semantic structure of what was parsed.
But an `AST`_ doesn't carry information about the rule that generated
it, so navigating the trees may be difficult.

|TatSu| defines the ``tatsu.model.ModelBuilderSemantics`` semantics
class, which helps construct object models from abstract syntax trees:

.. code:: python

    from tatsu.model import ModelBuilderSemantics

    parser = MyParser(semantics=ModelBuilderSemantics())

Then you add the desired node type as the first parameter to each grammar
rule:

.. code:: ocaml

    addition::AddOperator = left:mulexpre '+' right:addition ;

``ModelBuilderSemantics`` will synthesize a ``class AddOperator(Node):``
class and use it to construct the node. The synthesized class will have
one attribute for each named element in the rule, with the same name as
that element.
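
As a rough picture of what synthesizing a node class can mean, the sketch
below creates a class at runtime with Python's built-in ``type()``. It is
only an illustration and not how |TatSu| is implemented: the ``Node``
stand-in and the ``synthesize()`` helper are hypothetical names that exist
only in this example.

.. code:: python

    class Node:
        """Hypothetical stand-in for a model base class (not tatsu's Node)."""

        def __init__(self, **attributes):
            # Store each named element of the rule as an attribute.
            for name, value in attributes.items():
                setattr(self, name, value)


    def synthesize(typename, base=Node):
        """Create a new class called ``typename`` that derives from ``base``."""
        return type(typename, (base,), {})


    # Roughly what a rule declared as ``addition::AddOperator`` asks for.
    AddOperator = synthesize('AddOperator')
    node = AddOperator(left=1, right=2)
    print(type(node).__name__, node.left, node.right)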

You can also use `Python`_'s built-in types as node types, and
``ModelBuilderSemantics`` will do the right thing:

.. code:: ocaml

    integer::int = /[0-9]+/ ;

``ModelBuilderSemantics`` acts as any other semantics class, so its
default behavior can be overridden by defining a method to handle the
result of any particular grammar rule.



Viewing Models as JSON
~~~~~~~~~~~~~~~~~~~~~~


Models generated by |TatSu| can be viewed by converting them to a
JSON-compatible structure with the help of ``tatsu.util.asjson()``. The
protocol tries to provide the best representation for common types, and
falls back to ``repr()`` for any other type. There are provisions for
structures with back-references, so there's no infinite recursion.

.. code:: python

    import json

    from tatsu.util import asjson

    print(json.dumps(asjson(model), indent=2))

The ``model``, with richer semantics, remains unaltered.

Conversion to a JSON-compatible structure relies on the protocol defined by
``tatsu.util.AsJSONMixin``. The mixin defines a ``__json__(seen=None)``
method that allows classes to define their best translation. You can use ``AsJSONMixin``
as a base class in your own models to take advantage of ``asjson()``, and you can
specialize the conversion by overriding ``AsJSONMixin.__json__()``.

You can also write your own version of ``asjson()`` to handle special cases that are recurrent in your context.
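
A custom converter along those lines might look like the following minimal
sketch. It is not |TatSu|'s implementation: ``my_asjson()`` is a hypothetical
helper, and the only part taken from the protocol described above is that
objects may provide a ``__json__(seen=None)`` method. Passing ``seen`` as a
set of ``id()`` values, and the marker string used for back-references, are
assumptions made for this example. The special case handled here is
``datetime.date``.

.. code:: python

    import json
    from datetime import date


    def my_asjson(obj, seen=None):
        """Hypothetical variant of asjson() with a special case for dates."""
        if obj is None or isinstance(obj, (bool, int, float, str)):
            return obj
        if isinstance(obj, date):
            # The special case: dates become ISO 8601 strings.
            return obj.isoformat()

        seen = set() if seen is None else seen
        if id(obj) in seen:
            # Back-reference: do not expand an object that is already
            # being converted further up the call chain.
            return f'<back-reference to {type(obj).__name__}>'

        seen.add(id(obj))
        try:
            if hasattr(obj, '__json__'):
                return obj.__json__(seen=seen)
            if isinstance(obj, dict):
                return {str(k): my_asjson(v, seen) for k, v in obj.items()}
            if isinstance(obj, (list, tuple)):
                return [my_asjson(v, seen) for v in obj]
            # Anything else is represented by its repr().
            return repr(obj)
        finally:
            seen.discard(id(obj))


    release = {'name': 'demo', 'when': date(2024, 1, 2), 'tags': ('a', 'b')}
    release['self'] = release  # a back-reference to the containing dict
    print(json.dumps(my_asjson(release), indent=2))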

Walking Models
~~~~~~~~~~~~~~

The class ``tatsu.model.NodeWalker`` allows for the easy traversal
(*walk*) of a model constructed with a ``ModelBuilderSemantics`` instance:

.. code:: python

    from tatsu.model import ModelBuilderSemantics, NodeWalker

    class MyNodeWalker(NodeWalker):

        def walk_AddOperator(self, node):
            left = self.walk(node.left)
            right = self.walk(node.right)

            print('ADDED', left, right)

    model = MyParser(semantics=ModelBuilderSemantics()).parse(input)

    walker = MyNodeWalker()
    walker.walk(model)

When a method with a name like ``walk_AddOperator()`` is defined, it
will be called when a node of that type is *walked*. The *pythonic*
version of the class name may also be used for the *walk* method:
``walk__add_operator()`` (note the double underscore).

If a *walk* method for a node class is not found, then a method for the
class's bases is searched for, so it is possible to write *catch-all*
methods such as:

.. code:: python

    def walk_Node(self, node):
        print('Reached Node', node)

    def walk_str(self, s):
        return s

    def walk_object(self, o):
        raise Exception(f'Unexpected type {type(o).__name__} walked')

Which nodes get *walked* is up to the ``NodeWalker`` implementation. Some
strategies for walking *all* or *most* nodes are implemented as classes
in ``tatsu.walkers``, such as ``PreOrderWalker`` and ``DepthFirstWalker``.
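
The method lookup described earlier (class name or its pythonic form first,
then the base classes) can be pictured with the simplified sketch below. It is
an assumption-laden illustration and not the code of ``NodeWalker``:
``MiniWalker``, ``Evaluator`` and the small ``Node`` classes are hypothetical
and use only the standard library.

.. code:: python

    import re


    class MiniWalker:
        """Hypothetical, simplified walker; not tatsu's NodeWalker."""

        def walk(self, node):
            # Try the node's own class first, then each base class in MRO
            # order, so a method such as walk_object() acts as a catch-all.
            for cls in type(node).__mro__:
                walker = self._find_walker(cls.__name__)
                if walker is not None:
                    return walker(node)
            return None

        def _find_walker(self, classname):
            # "AddOperator" -> "add_operator"
            pythonic = re.sub(r'(?<!^)(?=[A-Z])', '_', classname).lower()
            for name in (f'walk_{classname}', f'walk__{pythonic}'):
                method = getattr(self, name, None)
                if callable(method):
                    return method
            return None


    class Node:
        pass


    class AddOperator(Node):
        def __init__(self, left, right):
            self.left = left
            self.right = right


    class Evaluator(MiniWalker):
        def walk__add_operator(self, node):  # found through the pythonic name
            return self.walk(node.left) + self.walk(node.right)

        def walk_int(self, value):
            return value

        def walk_object(self, o):  # catch-all reached through the MRO
            raise TypeError(f'Unexpected type {type(o).__name__} walked')


    result = Evaluator().walk(AddOperator(1, AddOperator(2, 3)))
    print(result)

In this sketch, walking a value with no dedicated method (a string, for
example) ends up in ``walk_object()``, because ``object`` is the last class in
every MRO.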

Sometimes nodes must be walked more than once for the purpose at hand, and it's
up to the walker how and when to do that.

Take a look at ``tatsu.ngcodegen.PythonCodeGenerator`` for the walker that generates
a parser in Python from the model of a parsed grammar.


Model Class Hierarchies
~~~~~~~~~~~~~~~~~~~~~~~

It is possible to specify a base class for generated model nodes:

.. code:: ocaml

    additive
        =
        | addition
        | substraction
        ;

    addition::AddOperator::Operator
        =
        left:mulexpre op:'+' right:additive
        ;

    substraction::SubstractOperator::Operator
        =
        left:mulexpre op:'-' right:additive
        ;

|TatSu| will generate the base class if it's not already known.

Base classes can be used as the target class in *walkers*, and in *code
generators*:

.. code:: python

    class MyNodeWalker(NodeWalker):
        def walk_Operator(self, node):
            left = self.walk(node.left)
            right = self.walk(node.right)
            op = self.walk(node.op)

            print(type(node).__name__, op, left, right)


    class Operator(ModelRenderer):
        template = '{left} {op} {right}'