File: preconf.md

# Preconfigured Converters

The {mod}`cattrs.preconf` package contains factories for preconfigured converters, specifically adjusted for particular serialization libraries.

For example, to get a converter configured for _orjson_:

```{doctest}

>>> from cattrs.preconf.orjson import make_converter

>>> converter = make_converter() # Takes the same parameters as the `cattrs.Converter`
```

Converters obtained this way can be customized further, just like any other converter.

For compatibility and performance reasons, these converters are usually configured to unstructure differently than ordinary `Converters`.
A couple of examples:
* the {class}`_orjson_ converter <cattrs.preconf.orjson.OrjsonConverter>` is configured to pass `datetime` instances through without unstructuring them, since _orjson_ can handle them faster.
* the {class}`_msgspec_ JSON converter <cattrs.preconf.msgspec.MsgspecJsonConverter>` is configured to pass through some dataclasses and _attrs_ classes,
if the output is identical to what normal unstructuring would have produced, since _msgspec_ can handle them faster.

The intended usage is to pass the unstructured output directly to the underlying library,
or use `converter.dumps` which will do it for you.

These converters support all [default hooks](defaulthooks.md)
and the following additional classes and type annotations,
both for structuring and unstructuring:

- `datetime.datetime`, `datetime.date`

```{versionadded} 22.1.0
All preconf converters now have `loads` and `dumps` methods, which combine un/structuring and the de/serialization logic from their underlying libraries.
```

```{doctest}

>>> from attrs import define
>>> from cattrs.preconf.json import make_converter

>>> converter = make_converter()

>>> @define
... class Test:
...     a: int

>>> converter.dumps(Test(1))
'{"a": 1}'
```

Particular libraries may have additional constraints documented below.

Third-party libraries can be specified as optional (extra) dependencies on _cattrs_ during installation.
Optional install targets should match the name of the {mod}`cattrs.preconf` modules.

```console
# Using pip
$ pip install cattrs[ujson]

# Using pdm
$ pdm add cattrs[orjson]

# Using poetry
$ poetry add --extras tomlkit cattrs
```


## Standard Library _json_

Found at {mod}`cattrs.preconf.json`.

Bytes are serialized as base 85 strings. Counters are serialized as dictionaries. Sets are serialized as lists, and deserialized back into sets. `datetime` s and `date` s are serialized as ISO 8601 strings.


## _orjson_

Found at {mod}`cattrs.preconf.orjson`.

Bytes are un/structured as base 85 strings.
Sets are unstructured into lists, and structured back into sets.
`datetime` s and `date` s are passed through to be unstructured into RFC 3339 by _orjson_ itself.
Typed named tuples are unstructured into ordinary tuples, and then into JSON arrays by _orjson_.

_orjson_ doesn't support integers less than -9223372036854775808 or greater than 9223372036854775807.
_orjson_ only supports mappings with string keys so mappings will have their keys stringified before serialization, and destringified during deserialization.
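
The key handling can be pictured with a small stdlib-only sketch. The helper names `stringify_keys` and `destringify_keys` are hypothetical and are not part of _cattrs_ or _orjson_. The converter works from the mapping's type annotation, whereas here the key type is passed in by hand.

```python
import json


def stringify_keys(mapping: dict) -> dict:
    """Turn every key into a string so a JSON object can hold it."""
    return {str(key): value for key, value in mapping.items()}


def destringify_keys(mapping: dict, key_type: type) -> dict:
    """Convert string keys back using the expected key type."""
    return {key_type(key): value for key, value in mapping.items()}


scores = {1: "low", 2: "high"}
wire = json.dumps(stringify_keys(scores))           # keys become "1" and "2"
restored = destringify_keys(json.loads(wire), int)  # keys are ints again
```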


## _msgspec_

Found at {mod}`cattrs.preconf.msgspec`.
Only JSON functionality is currently available; other formats supported by _msgspec_ will follow in the future.

[_msgspec_ structs](https://jcristharif.com/msgspec/structs.html) are supported, but not composable: a struct will be handed over to _msgspec_ directly, and _msgspec_ will handle it and all of its fields, recursively.
_cattrs_ may get more sophisticated handling of structs in the future.

[_msgspec_ strict mode](https://jcristharif.com/msgspec/usage.html#strict-vs-lax-mode) is used by default.
This can be customized by changing the {meth}`encoder <cattrs.preconf.msgspec.MsgspecJsonConverter.encoder>` attribute on the converter.

What _cattrs_ calls _unstructuring_ and _structuring_, _msgspec_ calls [`to_builtins` and `convert`](https://jcristharif.com/msgspec/converters.html).
What _cattrs_ refers to as _dumping_ and _loading_, _msgspec_ refers to as [`encoding` and `decoding`](https://jcristharif.com/msgspec/usage.html).

Compatibility notes:
- Bytes are un/structured as base 64 strings directly by _msgspec_ itself.
- _msgspec_ [encodes special float values](https://jcristharif.com/msgspec/supported-types.html#float) (`NaN, Inf, -Inf`) as `null`.
- `datetime` s and `date` s are passed through to be unstructured into RFC 3339 by _msgspec_ itself.
- _attrs_ classes, dataclasses and sequences are handled directly by _msgspec_ if possible, otherwise by the normal _cattrs_ machinery.
This means it's possible the validation errors produced may be _msgspec_ validation errors instead of _cattrs_ validation errors.

This converter supports {meth}`get_loads_hook() <cattrs.preconf.msgspec.MsgspecJsonConverter.get_loads_hook>` and {meth}`get_dumps_hook() <cattrs.preconf.msgspec.MsgspecJsonConverter.get_dumps_hook>`.
These are factories for dumping and loading functions (as opposed to unstructuring and structuring); the hooks they return may be further optimized to offload as much work as possible to _msgspec_.

```python
>>> from attrs import define
>>> from cattrs.preconf.msgspec import make_converter

>>> @define
... class Test:
...     a: int

>>> converter = make_converter()
>>> dumps = converter.get_dumps_hook(Test)

>>> dumps(Test(1))  # Will use msgspec directly.
b'{"a":1}'
```

Due to its complexity, this converter is currently _provisional_ and may slightly change as the best integration patterns are discovered.

_msgspec_ doesn't support PyPy.

```{versionadded} 24.1.0

```

## _ujson_

Found at {mod}`cattrs.preconf.ujson`.

Bytes are serialized as base 85 strings. Sets are serialized as lists, and deserialized back into sets. `datetime` s and `date` s are serialized as ISO 8601 strings.

_ujson_ doesn't support integers less than -9223372036854775808 or greater than 9223372036854775807, nor does it support `float('inf')`.
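
The two integer bounds are the limits of a signed 64-bit integer. If values may come from untrusted or computed sources, it can help to check them before they reach the serialization library. The following is a minimal sketch; `fits_int64` is a hypothetical helper, not something _cattrs_ or _ujson_ provides.

```python
INT64_MIN = -(2**63)   # -9223372036854775808
INT64_MAX = 2**63 - 1  # 9223372036854775807


def fits_int64(value: int) -> bool:
    """Return True if the value fits in a signed 64-bit integer."""
    return INT64_MIN <= value <= INT64_MAX
```

The same check applies to the other libraries on this page that share the signed 64-bit range.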


## _msgpack_

Found at {mod}`cattrs.preconf.msgpack`.

Sets are serialized as lists, and deserialized back into sets. `datetime` s are serialized as UNIX timestamp float values. `date` s are serialized as midnight-aligned UNIX timestamp float values.
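
The timestamp representation can be illustrated with the standard library alone. This is a sketch of the idea rather than the converter's actual hooks, and it assumes that "midnight-aligned" means midnight UTC. The function names are made up for this example.

```python
from datetime import date, datetime, time, timezone


def datetime_to_timestamp(value: datetime) -> float:
    """Encode an aware datetime as a UNIX timestamp float."""
    return value.timestamp()


def date_to_timestamp(value: date) -> float:
    """Encode a date as the timestamp of its midnight (assumed to be UTC)."""
    return datetime.combine(value, time(tzinfo=timezone.utc)).timestamp()


def timestamp_to_date(value: float) -> date:
    """Decode a midnight-aligned timestamp back into a date."""
    return datetime.fromtimestamp(value, timezone.utc).date()


moment = datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
day = date(2024, 1, 2)

moment_ts = datetime_to_timestamp(moment)
day_ts = date_to_timestamp(day)
```

Decoding `day_ts` with `timestamp_to_date` yields the original `date`. Note that a decoded `datetime` carries no record of its original UTC offset, only the instant.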

_msgpack_ doesn't support integers less than -9223372036854775808 or greater than 18446744073709551615.

When parsing msgpack data from bytes, the library needs to be passed `strict_map_key=False` to get the full range of compatibility.


## _cbor2_

Found at {mod}`cattrs.preconf.cbor2`.

_cbor2_ implements a fully featured CBOR encoder with several extensions for handling shared references, big integers, rational numbers and so on.

Sets are serialized and deserialized to sets.
Tuples are serialized as lists.

`datetime` s are serialized as a text string by default (CBOR Tag 0).
Use keyword argument `datetime_as_timestamp=True` to encode as UNIX timestamp integer/float (CBOR Tag 1).
**Note:** this replaces the timezone information with UTC.

`date` s are serialized as ISO 8601 strings.

Use keyword argument `canonical=True` for efficient encoding to the smallest binary output.

Floats can be forced to smaller output by casting them to lower-precision `numpy` floats (and back to Python floats).
Example: `float(np.float32(value))` or `float(np.float16(value))`
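
If pulling in `numpy` only for this is undesirable, a similar rounding can be sketched with the standard library's `struct` module. This is a swapped-in alternative to the `numpy` casts above, not something _cbor2_ or _cattrs_ prescribes, and the helper names are hypothetical.

```python
import struct


def round_to_float32(value: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def round_to_float16(value: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


exact = round_to_float32(0.5)  # 0.5 is representable, so it is unchanged
lossy = round_to_float32(0.1)  # 0.1 is not, so a nearby value is returned
```

Unlike the `numpy` casts, `struct.pack` raises `OverflowError` for finite values too large for the target format instead of producing infinity.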

```{versionadded} 23.1.0

```

## _bson_

Found at {mod}`cattrs.preconf.bson`. Tested against the _bson_ module bundled with the _pymongo_ library, not the standalone PyPI _bson_ package.

Sets are serialized as lists, and deserialized back into sets.

_bson_ doesn't support integers less than -9223372036854775808 or greater than 9223372036854775807 (64-bit signed).
_bson_ does not support null bytes in mapping keys.
_bson_ only supports mappings with string keys so mappings will have their keys stringified before serialization, and destringified during deserialization.
The _bson_ datetime representation doesn't support microsecond accuracy.
`date` s are serialized as ISO 8601 strings.
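
BSON stores datetimes as whole milliseconds since the UNIX epoch, so sub-millisecond digits do not survive a round trip. When comparing values in tests, one option is to truncate the expected value first. `truncate_to_milliseconds` below is a hypothetical helper, and truncation (rather than rounding) is an assumption about how the encoder behaves.

```python
from datetime import datetime, timezone


def truncate_to_milliseconds(value: datetime) -> datetime:
    """Drop the sub-millisecond part of a datetime."""
    return value.replace(microsecond=value.microsecond // 1000 * 1000)


original = datetime(2024, 1, 2, 3, 4, 5, 123456, tzinfo=timezone.utc)
expected = truncate_to_milliseconds(original)  # microsecond becomes 123000
```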

When encoding and decoding, the library needs to be passed `codec_options=bson.CodecOptions(tz_aware=True)` to get the full range of compatibility.


## _pyyaml_

Found at {mod}`cattrs.preconf.pyyaml`.

Frozensets are serialized as lists, and deserialized back into frozensets.
`date` s are serialized as ISO 8601 strings.
Typed named tuples are unstructured into ordinary tuples, and then into YAML arrays by _pyyaml_.

## _tomlkit_

Found at {mod}`cattrs.preconf.tomlkit`.

Bytes are serialized as base 85 strings. Sets are serialized as lists, and deserialized back into sets.
Tuples are serialized as lists, and deserialized back into tuples.
_tomlkit_ only supports mappings with string keys so mappings will have their keys stringified before serialization, and destringified during deserialization. `date` s are serialized as ISO 8601 strings.
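
The base 85 text form of `bytes` can be reproduced with the standard library. The sketch assumes the `base64.b85encode` variant of base 85 (the standard library also offers Ascii85 via `a85encode`), which is an assumption about the converter's internals rather than something stated above.

```python
from base64 import b85decode, b85encode

payload = b"\x00\xffbinary"

# Encode to ASCII text so the value fits into a text-only format.
text = b85encode(payload).decode("ascii")

# Decode the text back into the original bytes.
restored = b85decode(text)
```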