===========
The Context
===========


Meta constructs are the key to the declarative power of Construct. They are constructs whose behavior depends on the context of the construction (during parsing and building). The context is a dictionary that Structs and Sequences create during parsing and building, and it is "propagated" down and up to all constructs along the way, so that members can access the parsing or building results of other members. It is essentially a mirror image of the construction tree, as altered by the different constructs. Nested structs create nested contexts, just as they create nested containers.

In order to see the context, let's try this snippet:

>>> st = Struct(
...     "a" / Byte,
...     Probe(),
...     "b" / Byte,
...     Probe(),
... )
>>> st.parse(b"\x01\x02")
--------------------------------------------------
Container: 
    a = 1
--------------------------------------------------
Container: 
    a = 1
    b = 2
--------------------------------------------------
Container(a=1, b=2)

As you can see, the context looks different at different points of the construction.
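
The way the context grows field by field can be mimicked with a plain dictionary. This is only a rough sketch under stated assumptions: `parse_with_snapshots` is a hypothetical helper, not part of Construct, and it records a copy of the context after each one-byte field instead of printing it the way Probe does.

```python
def parse_with_snapshots(data, names):
    """Parse one byte per field name, recording the context after each field."""
    ctx = {}
    snapshots = []
    for name, value in zip(names, data):
        ctx[name] = value              # each parsed field is added to the context
        snapshots.append(dict(ctx))    # copy, so later fields do not alter it
    return ctx, snapshots


final, snapshots = parse_with_snapshots(b"\x01\x02", ["a", "b"])
for snapshot in snapshots:
    print(snapshot)
# {'a': 1}
# {'a': 1, 'b': 2}
```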

You may wonder what the little underscore ('_') found in the context means. It represents the parent node, like the '..' in Unix pathnames ('../foo.txt'). We only use it when referring to the context of upper layers.

Using the context is easy. All meta constructs take a function as a parameter, which is usually passed as a lambda function, although "big" named functions are just as good. This function, unless otherwise stated, takes a single parameter called ctx (short for context), and returns a result calculated from that context. In the example below, `this.count` is shorthand for `lambda ctx: ctx.count`.

>>> st = Struct(
...     "count" / Byte,
...     "data" / Bytes(this.count),
... )
>>> st.parse(b"\x05abcde")
Container(count=5, data=b'abcde')
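
To make the role of the context function concrete, here is a minimal plain-Python sketch of the same length-prefixed layout. It does not use Construct; `parse_length_prefixed` and `length_func` are hypothetical names, and the context is an ordinary dict rather than a Container.

```python
def parse_length_prefixed(data, length_func):
    """Parse a count byte, then as many payload bytes as length_func(ctx) returns."""
    ctx = {}
    ctx["count"] = data[0]             # first field lands in the context
    size = length_func(ctx)            # the function sees everything parsed so far
    ctx["data"] = data[1:1 + size]
    return ctx


result = parse_length_prefixed(b"\x05abcde", lambda ctx: ctx["count"])
print(result)  # {'count': 5, 'data': b'abcde'}
```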

Of course a function can return anything (it does not need to depend on the context):

>>> import os
>>> Computed(lambda ctx: 7)
>>> Computed(lambda ctx: os.urandom(16))



Nesting
============================

And here's how we use the special "_" name to reach the upper container when containers are nested (which happens when parsing nested Structs). Notice that `length1` is on a different (upper) level than `length2`, so it lives in a different, upper-level container.

>>> st = Struct(
...     "length1" / Byte,
...     "inner" / Struct(
...         "length2" / Byte,
...         "sum" / Computed(this._.length1 + this.length2),
...     ),
... )
>>> st.parse(b"12")
Container(length1=49, inner=Container(length2=50, sum=99))
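
The parent link can be pictured with nested dictionaries. The following is a sketch only, assuming a hypothetical `parse_nested` helper and plain dicts in place of Containers; it shows the child context holding a "_" entry that points back at its parent.

```python
def parse_nested(data):
    """Parse two bytes into an outer and an inner context linked through "_"."""
    outer = {"length1": data[0]}
    inner = {"_": outer}               # child context links back to its parent
    inner["length2"] = data[1]
    compute_sum = lambda ctx: ctx["_"]["length1"] + ctx["length2"]
    inner["sum"] = compute_sum(inner)  # evaluated against the inner context
    outer["inner"] = inner
    return outer


parsed = parse_nested(b"12")
print(parsed["length1"], parsed["inner"]["length2"], parsed["inner"]["sum"])
# 49 50 99
```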

Context entries can also be passed directly through the `parse` and `build` methods. Keep in mind, however, that some classes nest the context (Struct, Sequence, Union, FocusedSeq, LazyStruct), so entries passed to them end up one level up. Compare these examples:

>>> d = Bytes(this.n)
>>> d.parse(bytes(100), n=4)
b'\x00\x00\x00\x00'

>>> d = Struct(
...     "data" / Bytes(this._.n),
... )
>>> d.parse(bytes(100), n=4)
Container(data=b'\x00\x00\x00\x00')


Referring to inlined constructs
===============================

If you need to refer to a subcon such as an Enum that was inlined in the struct (and therefore wasn't assigned to any variable in the namespace), you can access it as an attribute of the Struct under the same name. This feature is particularly handy when using Enum and FlagsEnum.

>>> d = Struct(
...     "animal" / Enum(Byte, giraffe=1),
... )
>>> d.animal.giraffe
'giraffe'


If you need to refer to the size of a field that was inlined in the same struct (and therefore wasn't assigned to any variable in the namespace), you can use the special "_subcons" context entry, which contains all Struct members. Note that you need to use a lambda (because a `this` expression is not supported here).

>>> d = Struct(
...     "count" / Byte,
...     "data" / Bytes(lambda this: this.count - this._subcons.count.sizeof()),
... )
>>> d.parse(b"\x05four")
Container(count=5, data=b'four')

>>> d = Union(None,
...     "chars" / Byte[4],
...     "data" / Bytes(lambda this: this._subcons.chars.sizeof()),
... )
>>> d.parse(b"\x01\x02\x03\x04")
Container(chars=[1, 2, 3, 4], data=b'\x01\x02\x03\x04')

This feature is supported in the same constructs as embedding: Struct, Sequence, FocusedSeq, Union, LazyStruct.


Using `this` expression
============================

Certain classes take a count of elements, or a similar value, and allow a callable to be provided instead. This callable is invoked during parsing and building and is passed the current context object. The context is always a Container, not a plain dict, so it supports attribute access as well as key access. This can get even fancier: Tomer Filiba provided a better syntax, in which the `this` singleton object is used to build a lambda expression. All four examples below are equivalent, but the first is recommended:

>>> this._.field
>>> lambda this: this._.field
>>> this["_"]["field"]
>>> lambda this: this["_"]["field"]

Of course, a `this` expression can be mixed with other calculations. When evaluated, each instance of `this` is replaced by the context Container, which supports attribute access to keys.

>>> this.width * this.height - this.offset
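
One way to picture how such an expression turns into a function is operator overloading. The sketch below is an illustration under stated assumptions, not Construct's actual implementation: `Expr` is a hypothetical class, the `this` defined here is a stand-in for the real singleton, and only three arithmetic operators are covered.

```python
import operator


class Expr:
    """Hypothetical deferred expression; calling it with a context evaluates it."""

    def __init__(self, func):
        self._func = func

    def __call__(self, ctx):
        return self._func(ctx)

    def __getattr__(self, name):
        # Only reached for names that are not real attributes, e.g. "width" or "_".
        if name.startswith("__"):
            raise AttributeError(name)
        return Expr(lambda ctx: self(ctx)[name])

    def __getitem__(self, key):
        return Expr(lambda ctx: self(ctx)[key])

    def _combine(self, other, op):
        def evaluate(ctx):
            # Constants are used as-is, nested expressions are evaluated first.
            right = other(ctx) if isinstance(other, Expr) else other
            return op(self(ctx), right)
        return Expr(evaluate)

    def __add__(self, other):
        return self._combine(other, operator.add)

    def __sub__(self, other):
        return self._combine(other, operator.sub)

    def __mul__(self, other):
        return self._combine(other, operator.mul)


this = Expr(lambda ctx: ctx)           # stand-in for the real singleton

area = this.width * this.height - this.offset
print(area({"width": 4, "height": 3, "offset": 2}))   # 10
print(this._.field({"_": {"field": 7}}))              # 7
```

Nothing is computed when `area` is defined; the arithmetic only runs once a context is supplied.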

When creating an Array ("items" field), rather than specifying a constant count, you can use a previous field value as count.

>>> st = Struct(
...     "count" / Rebuild(Byte, len_(this.items)),
...     "items" / Byte[this.count],
... )
>>> st.build(dict(items=[1,2,3,4,5]))
b'\x05\x01\x02\x03\x04\x05'

Switch can branch the construction path based on a previously parsed value.

>>> st = Struct(
...     "type" / Enum(Byte, INT1=1, INT2=2, INT4=3, STRING=4),
...     "data" / Switch(this.type,
...     {
...         "INT1" : Int8ub,
...         "INT2" : Int16ub,
...         "INT4" : Int32ub,
...         "STRING" : PaddedString(10, "ascii"),
...     }),
... )
>>> st.parse(b"\x02\x00\xff")
Container(type='INT2', data=255)
>>> st.parse(b"\x04\abcdef\x00\x00\x00\x00")
Container(type='STRING', data='\x07bcdef')
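
The branching idea itself can be sketched with the standard `struct` module. This is a simplified illustration, assuming hypothetical names (`TYPE_NAMES`, `FORMATS`, `parse_tagged`) and covering only the three integer cases; it is not how Switch is implemented.

```python
import struct

# Big-endian struct formats standing in for Int8ub, Int16ub and Int32ub.
TYPE_NAMES = {1: "INT1", 2: "INT2", 3: "INT4"}
FORMATS = {"INT1": ">B", "INT2": ">H", "INT4": ">I"}


def parse_tagged(data):
    """Read a one-byte type tag, then parse the payload that the tag selects."""
    ctx = {"type": TYPE_NAMES[data[0]]}
    fmt = FORMATS[ctx["type"]]         # branch on the previously parsed value
    (ctx["data"],) = struct.unpack_from(fmt, data, 1)
    return ctx


print(parse_tagged(b"\x02\x00\xff"))  # {'type': 'INT2', 'data': 255}
```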



Using `len_` expression
============================

Using built-in functions like `len`, `sum`, `min`, `max`, and `abs` on context items used to be a bit of a hassle. The built-in `len` takes a list and returns an integer, whereas its `len_` analog takes a lambda and returns a lambda. This allows the following shorthand, where both lines are equivalent:

>>> len_(this.items)
>>> lambda this: len(this.items)

These can be used in Rebuild wrappers that compute count/length fields from another list-like field:

>>> st = Struct(
...     "count" / Rebuild(Byte, len_(this.items)),
...     "items" / Byte[this.count],
... )
>>> st.build(dict(items=[1,2,3,4,5]))
b'\x05\x01\x02\x03\x04\x05'
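
The "takes a lambda and returns a lambda" idea can be sketched in a few lines of plain Python. Both `len_sketch` and `build_counted` are hypothetical names used for illustration; they only approximate what `len_` and Rebuild do, using an ordinary dict as the context.

```python
def len_sketch(expr):
    """Wrap a context function so that len() is applied to its result later."""
    return lambda ctx: len(expr(ctx))


def build_counted(items, count_func):
    """Build a count byte computed from the context, followed by the items."""
    ctx = {"items": items}
    ctx["count"] = count_func(ctx)     # computed at build time, not supplied
    return bytes([ctx["count"]]) + bytes(ctx["items"])


count_of_items = len_sketch(lambda ctx: ctx["items"])
print(build_counted([1, 2, 3, 4, 5], count_of_items))
# b'\x05\x01\x02\x03\x04\x05'
```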



Using `obj_` expression
============================

There is also an analog that takes (obj, context) or (obj, list, context), unlike the `this` singleton, which takes only a context (a single parameter). Both lines below are equivalent:

>>> obj_ > 0
>>> lambda obj,ctx: obj > 0

These can be used in at least one construct:

>>> RepeatUntil(obj_ == 0, Byte).parse(b"aioweqnjkscs\x00")
[97, 105, 111, 119, 101, 113, 110, 106, 107, 115, 99, 115, 0]



Using `list_` expression
============================

.. warning:: The `list_` expression is implemented but buggy; using it is not recommended at the present time.

There is also a third expression that takes (obj, list, context) and computes on the second parameter (the list). Constructs that use lambdas with all three parameters usually process lists of elements, and the second parameter is the list of elements processed so far.

These can be used in at least one construct: 

>>> RepeatUntil(list_[-1] == 0, Byte).parse(b"aioweqnjkscs\x00")
[97, 105, 111, 119, 101, 113, 110, 106, 107, 115, 99, 115, 0]

In that example, `list_` is substituted with the following at each iteration. Index -1 means the last element:

::

    list_ <- [97]
    list_ <- [97, 105]
    list_ <- [97, 105, 111]
    list_ <- [97, 105, 111, 119]
    ...
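
The three-parameter form can be sketched in plain Python as follows. This is an illustration only: `repeat_until` is a hypothetical function, not Construct's RepeatUntil, and it additionally returns a trace of the list as seen at each iteration.

```python
def repeat_until(predicate, data, ctx=None):
    """Collect bytes until predicate(obj, lst, ctx) returns a true value."""
    ctx = {} if ctx is None else ctx
    items = []
    trace = []
    for obj in data:
        items.append(obj)
        trace.append(list(items))      # what the list looks like at this iteration
        if predicate(obj, items, ctx):
            return items, trace
    raise ValueError("data ended before the predicate was satisfied")


# Test the current element, in the spirit of obj_ == 0 ...
by_obj, _ = repeat_until(lambda obj, lst, ctx: obj == 0, b"ab\x00cd")
# ... or the last element of the list so far, in the spirit of list_[-1] == 0.
by_list, trace = repeat_until(lambda obj, lst, ctx: lst[-1] == 0, b"ab\x00cd")

print(by_obj)   # [97, 98, 0]
print(trace)    # [[97], [97, 98], [97, 98, 0]]
```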

Known deficiencies
============================

The logical ``and``, ``or``, and ``not`` operators cannot be used in `this` expressions. You have to use either a lambda or the equivalent bitwise operators:

>>> ~this.flag1 | this.flag2 & this.flag3
>>> lambda this: not this.flag1 or this.flag2 and this.flag3
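
The reason is a Python limitation: ``and``, ``or``, and ``not`` cannot be overloaded, so Python evaluates them immediately on the expression objects themselves, whereas ``~``, ``&``, and ``|`` can be overloaded to build a new deferred expression. The sketch below illustrates this with a hypothetical `Deferred` class; it is not Construct's implementation.

```python
class Deferred:
    """Hypothetical deferred boolean expression, evaluated by calling it."""

    def __init__(self, func):
        self.func = func

    def __call__(self, ctx):
        return self.func(ctx)

    # Bitwise operators can be overloaded, so they can build new expressions.
    def __invert__(self):
        return Deferred(lambda ctx: not self(ctx))

    def __and__(self, other):
        return Deferred(lambda ctx: bool(self(ctx) and other(ctx)))

    def __or__(self, other):
        return Deferred(lambda ctx: bool(self(ctx) or other(ctx)))


flag1 = Deferred(lambda ctx: ctx["flag1"])
flag2 = Deferred(lambda ctx: ctx["flag2"])
flag3 = Deferred(lambda ctx: ctx["flag3"])

deferred = ~flag1 | flag2 & flag3
eager = not flag1 or flag2 and flag3   # evaluated right here by Python

ctx = {"flag1": False, "flag2": False, "flag3": False}
print(deferred(ctx))    # True
print(eager is flag3)   # True
print(eager(ctx))       # False
```

With the logical operators, the whole expression silently collapses to `flag3`, which is why the result differs from the intended one.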

The containment operator ``in`` cannot be used in `this` expressions. You have to use a lambda:

>>> lambda this: this.value in (1, 2, 3)

Indexing (square brackets) does not work in `this` expressions. Use a lambda:

>>> lambda this: this.list[this.index]

The `sizeof` method does not work in `this` expressions. Use a lambda:

>>> lambda this: this._subcons.<member>.sizeof()

Lambdas (unlike `this` expressions) are not compilable.