File: converters.rst

package info (click to toggle)
python-msgspec 0.19.0-2
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid, trixie
  • size: 6,356 kB
  • sloc: javascript: 23,944; ansic: 20,540; python: 20,465; makefile: 29; sh: 19
file content (138 lines) | stat: -rw-r--r-- 4,904 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
Converters
==========

.. currentmodule:: msgspec

``msgspec`` provides builtin support for several common protocols (``json``,
``msgpack``, ``yaml``, and ``toml``). Support for additional protocols may be
added by combining a serialization library with msgspec's *converter
functions*: `msgspec.to_builtins` and `msgspec.convert`.

- `msgspec.to_builtins`: takes an object composed of any :doc:`supported type
  <supported-types>` and converts it into one composed of only simple builtin
  types typically supported by Python serialization libraries.

- `msgspec.convert`: takes an object composed of any :doc:`supported type
  <supported-types>`, and converts it to match a specified schema (validating
  along the way). If the conversion fails due to a schema mismatch, a nice
  error message is raised.

These functions are designed to be paired with a Python serialization library as
pre/post processors for typical ``dumps`` and ``loads`` functions.

.. image:: _static/converters-light.svg
   :align: center
   :class: only-light

.. image:: _static/converters-dark.svg
   :align: center
   :class: only-dark

For example, if ``msgspec`` didn't already provide support for ``json``, you
could add support by wrapping the standard library's `json` module as follows:

.. code-block:: ipython

    In [1]: import json
       ...: from typing import Any
       ...:
       ...: import msgspec

    In [2]: def encode(obj):
       ...:     return json.dumps(msgspec.to_builtins(obj))

    In [3]: def decode(msg, type=Any):
       ...:     return msgspec.convert(json.loads(msg), type=type)

    In [4]: class Point(msgspec.Struct):
       ...:     x: int
       ...:     y: int

    In [5]: x = Point(1, 2)

    In [6]: msg = encode(x)  # Encoding a high-level type works

    In [7]: msg
    '{"x": 1, "y": 2}'

    In [8]: decode(msg, type=Point)  # Decoding a high-level type works
    Point(x=1, y=2)

    In [9]: decode('{"x": "oops", "y": 2}', type=Point)  # Schema mismatches error
    ---------------------------------------------------------------------------
    ValidationError                           Traceback (most recent call last)
    Cell In[9], line 1
    ----> 1 decode('{"x": "oops", "y": 2}', type=Point)  # Schema mismatches error

    Cell In[3], line 2, in decode(msg, type)
         1 def decode(msg, type=Any):
    ---> 2     return msgspec.convert(json.loads(msg), type=type)

    ValidationError: Expected `int`, got `str` - at `$.x`


Since all protocols are different, `to_builtins` and `convert` have a few
configuration options:

- ``builtin_types``: an iterable of additional types to treat as builtin types,
  beyond the standard `dict`, `list`, `tuple`, `set`, `frozenset`, `str`,
  `int`, `float`, `bool`, and `None`.

- ``str_keys``: whether the wrapped protocol only supports strings for object
  keys, rather than any hashable type.

- ``strict``: `convert` only. Whether type coercion rules should be strict.
  Defaults is True, setting to False enables a wider set of coercion rules from
  string to non-string types for all values. Among other uses, this may be used
  to handle completely untyped protocols like URL querystrings, where only
  string values exist. See :ref:`strict-vs-lax` for more information.

- ``from_attributes``: `convert` only. If True, input objects may be coerced
  to ``Struct``/``dataclass``/``attrs`` types by extracting attributes from the
  input matching fields in the output type. One use case is converting database
  query results (ORM or otherwise) to msgspec structured types. The default is
  False.

- ``enc_hook``/``dec_hook``: the standard keyword arguments used for
  :doc:`extending` msgspec to support additional types.

-----

Taking a look at another protocol - TOML_. This protocol

- Includes native support for `datetime.datetime`, `datetime.date`, and
  `datetime.time` types.
- Only supports strings for object keys.

If ``msgspec`` didn't already provide support for ``toml``, you could add
support by wrapping the standard library's `tomllib` module as follows:

.. code-block:: python

    import datetime
    import tomllib
    from typing import Any

    import msgspec

    def decode(msg, *, type=Any, dec_hook=None):
        return msgspec.convert(
            toml.loads(msg),
            type,
            builtin_types=(datetime.datetime, datetime.date, datetime.time),
            str_keys=True,
            dec_hook=dec_hook,
        )

``msgspec`` uses these APIs to implement ``toml`` and ``yaml`` support,
wrapping external serialization libraries:

- ``msgspec.toml`` (`code <https://github.com/jcrist/msgspec/blob/main/msgspec/toml.py>`__)

- ``msgspec.yaml`` (`code <https://github.com/jcrist/msgspec/blob/main/msgspec/yaml.py>`__)

The implementation in ``msgspec.toml`` is *almost* identical to the one above,
with some additional code for error handling.


.. _TOML: https://toml.io