.. parser:

.. raw:: html

   <span id="fold-sources" />

Netlink parser data flow
========================

NetlinkSocketBase: receive the data
-----------------------------------

`NetlinkSocketBase` can receive data from a netlink socket in two ways:

1. get the data directly with `socket.recv()` or `socket.recv_into()`
2. run a buffer thread that receives the data as soon as it arrives and
   leaves it in the `buffer_queue` to be consumed later by `recv()` or
   `recv_into()`

`NetlinkSocketBase` implements both receive methods. Each of them chooses
the data source, either the socket itself or `buffer_queue`, depending on
the `buffer_thread` property:

**pyroute2.netlink.nlsocket.NetlinkSocketBase**

.. code-include:: :func:`pyroute2.netlink.nlsocket.NetlinkSocketBase.recv`
    :language: python

.. code-include:: :func:`pyroute2.netlink.nlsocket.NetlinkSocketBase.recv_into`
    :language: python

.. code-include:: :func:`pyroute2.netlink.nlsocket.NetlinkSocketBase.buffer_thread_routine`
    :language: python
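
The choice between the two data sources can be illustrated with a small
self-contained sketch. `ToyReceiver` and `demo()` are hypothetical names used
only for this illustration; the sketch uses a plain `socket.socketpair()`
instead of a netlink socket and is not the pyroute2 implementation:

.. code-block:: python

    import queue
    import socket
    import threading


    class ToyReceiver:
        """Hypothetical stand-in, not the pyroute2 class."""

        def __init__(self, sock, use_buffer_thread=False):
            self.sock = sock
            self.buffer_queue = queue.Queue()
            self.buffer_thread = None
            if use_buffer_thread:
                self.buffer_thread = threading.Thread(
                    target=self.buffer_thread_routine, daemon=True
                )
                self.buffer_thread.start()

        def buffer_thread_routine(self):
            # Receive the data as soon as it arrives; stop on EOF.
            while True:
                data = self.sock.recv(65536)
                if not data:
                    break
                self.buffer_queue.put(data)

        def recv(self, bufsize=65536):
            # Choose the data source depending on `buffer_thread`.
            if self.buffer_thread is not None:
                return self.buffer_queue.get()
            return self.sock.recv(bufsize)

        def close(self):
            if self.buffer_thread is not None:
                self.buffer_thread.join(timeout=1)
            self.sock.close()


    def demo():
        results = []
        for use_buffer_thread in (False, True):
            sender, receiver_sock = socket.socketpair()
            receiver = ToyReceiver(receiver_sock, use_buffer_thread)
            sender.send(b'ping')
            results.append(receiver.recv())
            sender.close()  # EOF lets the buffer thread exit
            receiver.close()
        return results

In both modes the caller sees the same `recv()` interface; only the source of
the bytes differs.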

.. aafig::
    :scale: 80
    :textual:

    `                                                                  `
    data flow     struct        marshal
    +--------+    +--------+    +------------+
    |        |--->| bits   |    |            |
    |        |    |     32 |--->| length     | 4 bytes, offset 0
    |        |    +--------+    +------------+
    |        |    |     16 |--->| type-> key | 2 bytes, offset 4
    |        |    +--------+    +------------+
    |        |    |     16 |    | flags      | 2 bytes, offset 6
    |        |    +--------+    +------------+
    |        |    |        |    | `sequence` |
    |        |    |     32 |--->| `number`   | 4 bytes, offset 8
    |        |    +--------+    +------------+
    |        |    |        |
    |        |    |     32 |      pid (ignored by marshal)
    |        |    +--------+
    |        |    |        |

    |        |    |        |      payload (ignored by marshal)

    |        |    |        |    \                        /
    +--------+    +--------+     ---+--------------------
                                    |
                                    |
                                    |         / `marshal.msg_map = {`
                                    |        |
                                    |        |    key-> parser,
                                    |        |
                                    +--------+    key-> parser,
                                    |        |
                                    |        |    key-> parser,
                                    |        |
                                    |         \  `}`
                                    |
                                    v
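
The header layout on the diagram can be read with `struct` from the standard
library. The following is a minimal sketch, assuming native byte order and the
usual 4-byte netlink message alignment; `iter_headers()` and `sample` are
hypothetical names, and the sketch is not the code pyroute2 runs:

.. code-block:: python

    import struct

    # length (32 bits), type (16), flags (16), sequence number (32), pid (32)
    NLMSG_HDR = struct.Struct('=IHHII')


    def iter_headers(data):
        """Yield (offset, header) for every message found in the buffer."""
        offset = 0
        while offset + NLMSG_HDR.size <= len(data):
            length, msg_type, flags, seq, pid = NLMSG_HDR.unpack_from(data, offset)
            if length < NLMSG_HDR.size:
                break  # malformed header: stop instead of looping forever
            yield offset, {
                'length': length,
                'type': msg_type,
                'flags': flags,
                'sequence_number': seq,
                'pid': pid,
            }
            # move to the next message, rounded up to 4 bytes
            offset += (length + 3) & ~3


    # two messages in one buffer, each with a 4-byte payload
    sample = (
        NLMSG_HDR.pack(20, 16, 2, 1, 0) + b'\x00' * 4
        + NLMSG_HDR.pack(20, 3, 2, 1, 0) + b'\x00' * 4
    )
    headers = list(iter_headers(sample))

The `type` field read at offset 4 is what the marshal uses as the default key.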

Marshal: get and run parsers
----------------------------

The marshal chooses a parser depending on the `key`, `flags` and
`sequence_number`. By default it uses `nlmsg->type` as the `key`, together
with `nlmsg->flags`. There are two ways to customize how parsers are chosen:

1. Use a custom `key_format`, `key_offset` and `key_mask`. `key_format` and
   `key_offset` are used to `struct.unpack()` the key from the raw netlink
   data, and `key_mask` is used to match the key partially.
2. Overload `Marshal.get_parser()` and implement your own way to get
   parsers. A parser should be a simple function that takes only `data`,
   `offset` and `length` as arguments and returns one dict-compatible
   message.
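
Both options can be sketched with a toy marshal. `ToyMarshal`, `parse_one()`
and the sample message are hypothetical and exist only for this illustration;
in particular, the exact way the real `Marshal` applies `key_mask` is an
assumption here:

.. code-block:: python

    import struct


    class ToyMarshal:
        """Hypothetical, simplified marshal; not the pyroute2 class."""

        key_format = None  # struct format of a custom key, e.g. 'B'
        key_offset = None  # position of the custom key inside the message
        key_mask = None    # optional bit mask applied to the custom key

        def __init__(self, msg_map):
            self.msg_map = msg_map

        def get_parser(self, key, flags, sequence_number):
            # Overload point: this version looks at the key only.
            return self.msg_map.get(key)

        def parse_one(self, data, offset=0):
            length, key, flags, seq = struct.unpack_from('=IHHI', data, offset)
            if self.key_format is not None:
                (key,) = struct.unpack_from(
                    self.key_format, data, offset + self.key_offset
                )
                if self.key_mask is not None:
                    key &= self.key_mask
            parser = self.get_parser(key, flags, seq)
            if parser is None:
                return None
            return parser(data, offset, length)


    class SubtypeMarshal(ToyMarshal):
        key_format = 'B'   # one byte ...
        key_offset = 16    # ... right after the 16-byte header
        key_mask = 0x0F    # match on the low 4 bits only


    # header: length 20, type 16, flags 0, sequence number 7, pid 0
    message = struct.pack('=IHHII', 20, 16, 0, 7, 0) + b'\xa5\x00\x00\x00'

    by_type = ToyMarshal({16: lambda data, offset, length: {'kind': 'type-16'}})
    by_subtype = SubtypeMarshal(
        {0x05: lambda data, offset, length: {'kind': 'subtype-5'}}
    )

The same message is routed to different parsers: `by_type` uses the default
key (`type`), while `by_subtype` reads one payload byte and masks it.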


.. aafig::
    :scale: 80
    :textual:

    `                                                                  `
                                    |
                                    |
                                    |
                                    |
                                    |
                                    v
              `if marshal.key_format is not None:`

                      `marshal.key_format`\
                                           |
                      `marshal.key_offset` +-- custom key
                                           |
                      `marshal.key_mask`  /

              `parser = marshal.get_parser(key, flags, sequence_number)`

              `msg = parser(data, offset, length)`

                                    |
                                    |
                                    |
                                    |
                                    |
                                    v

**pyroute2.netlink.nlsocket.Marshal**

.. code-include:: :func:`pyroute2.netlink.nlsocket.Marshal.parse`
    :language: python

The message parser routine must accept `data, offset, length` as its
arguments and must return a valid `nlmsg` or `dict` with the mandatory
fields listed in the spec below. The parser can also return `None`, which
tells the marshal to skip the message. Each call must parse the data for
exactly one message.

Mandatory message fields, expected by NetlinkSocketBase methods:

.. code-block:: python

    {
        'header': {
            'type': int,
            'flags': int,
            'error': None or NetlinkError(),
            'sequence_number': int,
        }
    }
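
A parser that satisfies this contract can be very small. The sketch below is
an illustration only: `minimal_parser()` and the `payload` key are hypothetical,
and skipping `NLMSG_NOOP` is just an example of a reason to return `None`:

.. code-block:: python

    import struct

    NLMSG_NOOP = 1  # netlink control message type: no operation


    def minimal_parser(data, offset, length):
        """Parse exactly one message starting at `offset`."""
        msg_length, msg_type, flags, seq = struct.unpack_from('=IHHI', data, offset)
        if msg_type == NLMSG_NOOP:
            return None  # tells the marshal to skip this message
        return {
            'header': {
                'length': msg_length,
                'type': msg_type,
                'flags': flags,
                'error': None,
                'sequence_number': seq,
            },
            # the payload is left undecoded in this sketch
            'payload': bytes(data[offset + 16:offset + length]),
        }


    raw = struct.pack('=IHHII', 20, 16, 0, 5, 0) + b'abcd'
    msg = minimal_parser(raw, 0, 20)
    noop = minimal_parser(struct.pack('=IHHII', 16, NLMSG_NOOP, 0, 5, 0), 0, 16)

A plain `dict` is enough here because the consumers only look the mandatory
header fields up by key.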

.. aafig::
    :scale: 80
    :textual:

    `                                                                  `
                                    |
                                     
                                    |
                                     
                                    |
                                    v
              parsed msg
              +-------------------------------------------+
              | header                                    |
              |        `{`                                |
              |             `uint32 length,`              |
              |             `uint16 type,`                |
              |             `uint16 flags,`               |
              |             `uint32 sequence_number,`     |
              |             `uint32 pid,`                 |
              |        `}`                                |
              +- - - - - - - - - - - - - - - - - - - - - -+
              | data fields (optional)                    |
              |        `{`                                |
              |             `int field,`                  |
              |             `int field,`                  |
              |        `}`                                |
              | or                                        |
              |        `string field`                     |
              |                                           |
              +- - - - - - - - - - - - - - - - - - - - - -+
              | nla chain                                 |
              |                                           |
              |         +-------------------------------+ |
              |         | header                        | |
              |         |        `{`                    | |
              |         |             `uint16 length,`  | |
              |         |             `uint16 type,`    | |
              |         |        `}`                    | |
              |         +- - - - - - - - - - - - - - - -+ |
              |         | data fields (optional)        | |
              |         |                               | |
              |         |        ...                    | |
              |         |                               | |
              |         +- - - - - - - - - - - - - - - -+ |
              |         | nla chain                     | |
              |         |                               | |
              |         |        recursive              | |
              |         |                               | |
              |         +-------------------------------+ |
              |                                           |
              +-------------------------------------------+
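
The NLA chain at the bottom of the diagram is a sequence of attributes, each
starting with a `uint16 length` and a `uint16 type`. A minimal sketch of
walking such a chain follows, assuming native byte order and 4-byte attribute
alignment; `iter_nla()` is a hypothetical helper, not a pyroute2 function:

.. code-block:: python

    import struct

    NLA_HDR = struct.Struct('=HH')  # length, type


    def iter_nla(data, offset, end):
        """Yield (type, payload) for each attribute between offset and end."""
        while offset + NLA_HDR.size <= end:
            nla_length, nla_type = NLA_HDR.unpack_from(data, offset)
            if nla_length < NLA_HDR.size:
                break  # malformed attribute: stop instead of looping forever
            # the length covers the 4-byte header plus the payload
            yield nla_type, bytes(data[offset + NLA_HDR.size:offset + nla_length])
            # the next attribute starts at a 4-byte boundary
            offset += (nla_length + 3) & ~3


    # type 3 with a 3-byte payload (padded to 8 bytes), then type 4 with a uint32
    chain = (
        NLA_HDR.pack(7, 3) + b'lo\x00' + b'\x00'
        + NLA_HDR.pack(8, 4) + struct.pack('=I', 65536)
    )
    attrs = list(iter_nla(chain, 0, len(chain)))

Since the chain is recursive, the payload of a nested attribute can be walked
by calling the same function on it again.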

Per-request parsers
-------------------

Sometimes it is reasonable to handle a particular response with a
specific parser rather than a generic one. An example is
`IPRoute.get_default_routes()`, which could be slow on systems with
a huge number of routes.

Instead of parsing every route record as `rtmsg`, this method assigns
a specific parser to its request. The custom parser doesn't parse records
blindly, but looks only for default route records in the dump, and
then parses only the matched records with the standard routine:

**pyroute2.iproute.linux.IPRoute**

.. code-include:: :func:`pyroute2.iproute.linux.RTNL_API.get_default_routes`
    :language: python

**pyroute2.iproute.parsers**

.. code-include:: :func:`pyroute2.iproute.parsers.default_routes`
    :language: python
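
The idea of filtering before parsing can also be shown in isolation. The
sketch below is not the parser included above: it uses a simplified test,
assuming that a default route can be recognized by `rtm_dst_len == 0` in the
`rtmsg` structure that follows the 16-byte header, and
`default_routes_only()` is a hypothetical name:

.. code-block:: python

    import struct

    NLMSG_DONE = 3
    RTM_NEWROUTE = 24


    def default_routes_only(data, offset, length):
        """Return a message for default routes and NLMSG_DONE, else None."""
        msg_length, msg_type, flags, seq = struct.unpack_from('=IHHI', data, offset)
        header = {
            'length': msg_length,
            'type': msg_type,
            'flags': flags,
            'error': None,
            'sequence_number': seq,
        }
        if msg_type == NLMSG_DONE:
            return {'header': header}  # keep the terminator of the dump
        if msg_type != RTM_NEWROUTE:
            return None
        # rtmsg starts with two bytes: rtm_family, rtm_dst_len
        family, dst_len = struct.unpack_from('=BB', data, offset + 16)
        if dst_len != 0:
            return None  # not a default route: skip without further parsing
        return {'header': header, 'family': family, 'dst_len': dst_len}


    def route(dst_len):
        # 16-byte header + 12-byte rtmsg with AF_INET (2) as the family
        header = struct.pack('=IHHII', 28, RTM_NEWROUTE, 2, 9, 0)
        return header + struct.pack('=BBBBBBBBI', 2, dst_len, 0, 0, 0, 0, 0, 0, 0)


    default = default_routes_only(route(0), 0, 28)
    skipped = default_routes_only(route(24), 0, 28)
    done = default_routes_only(struct.pack('=IHHII', 16, NLMSG_DONE, 2, 9, 0), 0, 16)

Only two bytes of each non-matching record are inspected, which is where the
speedup on large routing tables would come from.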

To assign a custom parser to a request/response communication, you first
need to know its `sequence_number`, whether it is allocated dynamically with
`NetlinkSocketBase.addr_pool.alloc()` or assigned statically. Then you
can create a record in `NetlinkSocketBase.seq_map`:

.. code-block:: python

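    from pyroute2.netlink import NLM_F_ACK, NLM_F_REQUEST, nlmsg

    # `nlsocket`, `my_type`, `my_data` and `handle()` are placeholders
    def my_parser(data, offset, length):
        ...
        return parsed_message

    # allocate a sequence number and build the request
    msg_seq = nlsocket.addr_pool.alloc()
    msg = nlmsg()
    msg['header'] = {
        'type': my_type,
        'flags': NLM_F_REQUEST | NLM_F_ACK,
        'sequence_number': msg_seq,
    }
    msg['data'] = my_data
    msg.encode()
    # register the parser for this sequence number, then send the request
    nlsocket.seq_map[msg_seq] = my_parser
    nlsocket.sendto(msg.data, (0, 0))
    for response_message in nlsocket.get(msg_seq=msg_seq):
        handle(response_message)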


NetlinkSocketBase: pick correct messages
----------------------------------------

The netlink protocol is asynchronous, so responses to several requests may
arrive interleaved. The kernel may also send broadcast messages that are
not responses and have `sequence_number == 0`. As a response *may* contain
multiple messages, and *may* or *may not* be terminated by a specific type
of message, the task of returning the relevant messages from the flow is a bit
complicated.

Let's look at an example:

.. aafig::
    :scale: 80
    :textual:

            +-----------+    +-----------+
            |  program  |    |   kernel  |
            +-----+-----+    +-----+-----+
                  |                |
                  |                |
                  |                | random broadcast
                  |<---------------|
                  |                |
                  |                |
    request seq 1 X                |
                  X--------------->X
                  X                X
                  X                X
                  X                X random broadcast
                  X<---------------X
                  X                X
                  X                X
                  X                X `response seq 1`
                  X<---------------X `flags: NLM_F_MULTI`
                  X                X
                  X                X
                  X                X random broadcast
                  X<---------------X
                  X                X
                  X                X
                  X                X `response seq 1`
                  X<---------------X `type: NLMSG_DONE`
                  X                |
                  |                |
                  v                v

The message flow on the diagram features `sequence_number == 0` broadcasts and
`sequence_number == 1` request and response packets. To complicate it even
further, you can run a request with `sequence_number == 2` before the final
response with `sequence_number == 1` arrives.

To handle that, `NetlinkSocketBase.get()` buffers all irrelevant messages,
returns only those with the requested `sequence_number`, and uses locks to
wait on the resource.
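
The buffering part can be sketched without sockets or locks. `ToyDemux` is a
hypothetical class that replays the flow from the diagram out of a list; a
real implementation would block on the socket instead of returning when no
data is left:

.. code-block:: python

    from collections import defaultdict, deque

    NLMSG_DONE = 3
    NLM_F_MULTI = 2


    class ToyDemux:
        """Hypothetical demultiplexer keyed by sequence number."""

        def __init__(self, incoming):
            self.incoming = deque(incoming)   # stands in for the socket
            self.backlog = defaultdict(list)  # sequence_number -> messages

        def get(self, msg_seq):
            result = []
            while True:
                if self.backlog[msg_seq]:
                    msg = self.backlog[msg_seq].pop(0)
                elif self.incoming:
                    msg = self.incoming.popleft()
                else:
                    return result  # a real socket would block here
                header = msg['header']
                if header['sequence_number'] != msg_seq:
                    # irrelevant for this call: keep it for a later one
                    self.backlog[header['sequence_number']].append(msg)
                    continue
                if header['type'] == NLMSG_DONE:
                    return result
                result.append(msg)
                if not header['flags'] & NLM_F_MULTI:
                    return result  # single-message response


    def message(seq, msg_type=16, flags=0):
        return {'header': {'type': msg_type, 'flags': flags,
                           'sequence_number': seq, 'error': None}}


    flow = [
        message(0),                     # broadcast before the request
        message(0),                     # broadcast after the request
        message(1, flags=NLM_F_MULTI),  # response seq 1
        message(0),                     # broadcast
        message(1, msg_type=NLMSG_DONE, flags=NLM_F_MULTI),
    ]
    demux = ToyDemux(flow)
    response = demux.get(1)

After the call, the three broadcasts stay in `demux.backlog[0]` and can be
returned by a later `get(0)`.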

The current implementation is relatively complicated and will be changed in
the future.

**pyroute2.netlink.nlsocket.NetlinkSocketBase**

.. code-include:: :func:`pyroute2.netlink.nlsocket.NetlinkSocketBase.get`
    :language: python