.. -*- coding: utf-8 -*-
.. :Project:   python-rapidjson -- loads function documentation
.. :Author:    Lele Gaifax <lele@metapensiero.it>
.. :License:   MIT License
.. :Copyright: © 2016, 2017, 2018, 2019, 2020 Lele Gaifax
..

==================
 loads() function
==================

.. currentmodule:: rapidjson

.. testsetup::

   from rapidjson import (dumps, loads, DM_NONE, DM_ISO8601, DM_UNIX_TIME,
                          DM_ONLY_SECONDS, DM_IGNORE_TZ, DM_NAIVE_IS_UTC, DM_SHIFT_TO_UTC,
                          UM_NONE, UM_CANONICAL, UM_HEX, NM_NATIVE, NM_DECIMAL, NM_NAN,
                          PM_NONE, PM_COMMENTS, PM_TRAILING_COMMAS)

.. function:: loads(string, *, object_hook=None, number_mode=None, datetime_mode=None, \
                    uuid_mode=None, parse_mode=None, allow_nan=True)

   Decode the given ``JSON`` formatted value into a Python object.

   :param string: The JSON string to parse, either a Unicode :class:`str` instance or a
                  :class:`bytes` or a :class:`bytearray` instance containing a ``UTF-8``
                  encoded value
   :param callable object_hook: an optional function that will be called with the result
                                of any object literal decoded (a :class:`dict`) and should
                                return the value to use instead of the :class:`dict`
   :param int number_mode: enable particular behaviors in handling numbers
   :param int datetime_mode: how :class:`datetime` and :class:`date` instances should be
                             handled
   :param int uuid_mode: how :class:`UUID` instances should be handled
   :param int parse_mode: whether the parser should allow non-standard JSON extensions
   :param bool allow_nan: *compatibility* flag equivalent to ``number_mode=NM_NAN``
   :returns: An equivalent Python object.
   :raises ValueError: if an invalid argument is given
   :raises JSONDecodeError: if `string` is not a valid ``JSON`` value
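
   As a minimal sketch of the accepted input types and of error handling, the snippet
   below parses the same document from a :class:`str`, a :class:`bytes` and a
   :class:`bytearray` instance. Two things are assumptions made for illustration only: the
   fallback to the standard library's ``json.loads`` (which accepts the same three input
   types), so that the sketch also runs where ``rapidjson`` is not installed, and the
   ``parse_or_default()`` helper, which is hypothetical and not part of the library:

   .. code-block:: python

      try:
          from rapidjson import loads
      except ImportError:
          # Assumption: the stdlib decoder is close enough for this sketch
          from json import loads

      text = '{"name": "caffè", "sizes": [1, 2]}'
      as_bytes = text.encode('utf-8')

      # The same document, given as str, bytes and bytearray
      from_str = loads(text)
      from_bytes = loads(as_bytes)
      from_bytearray = loads(bytearray(as_bytes))

      def parse_or_default(document, default=None):
          """Hypothetical helper: return `default` when `document` is not valid JSON."""
          try:
              return loads(document)
          except ValueError:
              # JSONDecodeError is a ValueError subclass
              return default

   Catching :class:`ValueError` covers both documented failure modes, at the price of not
   distinguishing an invalid argument from an invalid document.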

   .. rubric:: `object_hook`

   `object_hook` may be used to inject a custom deserializer that can replace any
   :class:`dict` instance found in the JSON structure with a *derived* object instance:

   .. doctest::

      >>> class Point(object):
      ...   def __init__(self, x, y):
      ...     self.x = x
      ...     self.y = y
      ...   def __repr__(self):
      ...     return 'Point(%s, %s)' % (self.x, self.y)
      ...
      >>> def point_dejsonifier(d):
      ...   if 'x' in d and 'y' in d:
      ...     return Point(d['x'], d['y'])
      ...   else:
      ...     return d
      ...
      >>> loads('{"x":1,"y":2}', object_hook=point_dejsonifier)
      Point(1, 2)
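
   Since the hook is called for *any* object literal, it also sees nested ones. The
   following sketch records the keys of every :class:`dict` handed to the hook and replaces
   the ``x``/``y`` pairs with plain tuples; ``recording_hook()`` and the sample document are
   hypothetical, and the fallback to the standard library's ``json.loads`` (whose
   `object_hook` is called in the same way) is only there to keep the sketch runnable where
   ``rapidjson`` is not installed:

   .. code-block:: python

      try:
          from rapidjson import loads
      except ImportError:
          # Assumption: stdlib fallback, same object_hook calling convention
          from json import loads

      seen = []

      def recording_hook(d):
          # Remember which object literal was decoded, then optionally replace it
          seen.append(sorted(d))
          if set(d) == {'x', 'y'}:
              return (d['x'], d['y'])
          return d

      document = '{"origin": {"x": 0, "y": 0}, "path": [{"x": 1, "y": 2}]}'
      result = loads(document, object_hook=recording_hook)

   Inner objects are expected to reach the hook before the object that contains them, so
   the outer :class:`dict` already holds the replaced values when the hook receives it.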


   .. _loads-number-mode:
   .. rubric:: `number_mode`

   The `number_mode` argument selects different behaviors in handling numeric values.

   By default *non-numbers* (``nan``, ``inf``, ``-inf``) are recognized, because
   ``NM_NAN`` is *on* by default:

   .. doctest::

      >>> loads('[NaN, Infinity]')
      [nan, inf]
      >>> loads('[NaN, Infinity]', number_mode=NM_NAN)
      [nan, inf]

   By explicitly setting `number_mode`, or by using the compatibility option `allow_nan`,
   you can disable that and obtain a ``JSONDecodeError`` exception (a ``ValueError``
   subclass) instead:

   .. doctest::

      >>> loads('[NaN, Infinity]', number_mode=NM_NATIVE)
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
      rapidjson.JSONDecodeError: … Out of range float values are not JSON compliant
      >>> loads('[NaN, Infinity]', allow_nan=False)
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
      rapidjson.JSONDecodeError: … Out of range float values are not JSON compliant

   Normally all floating point literals present in the JSON structure will be loaded as
   Python :class:`float` instances; with :data:`NM_DECIMAL` they will be returned as
   :class:`Decimal` instances instead:

   .. doctest::

      >>> loads('1.2345')
      1.2345
      >>> loads('1.2345', number_mode=NM_DECIMAL)
      Decimal('1.2345')

   When you can be sure that all the numeric values are constrained within the
   architecture's hardware limits you can get a noticeable speed gain with the
   :data:`NM_NATIVE` flag. While this is considerably faster, integer literals that do not
   fit into the underlying C library ``long long`` limits will be converted (*truncated*)
   to ``double`` numbers:

   .. doctest::

      >>> loads('123456789012345678901234567890')
      123456789012345678901234567890
      >>> loads('123456789012345678901234567890', number_mode=NM_NATIVE)
      1.2345678901234566e+29

   These flags can be combined together:

   .. doctest::

      >>> loads('[-1, NaN, 3.1415926535897932384626433832795028841971]',
      ...       number_mode=NM_DECIMAL | NM_NAN)
      [-1, Decimal('NaN'), Decimal('3.1415926535897932384626433832795028841971')]

   with the exception of :data:`NM_NATIVE` and :data:`NM_DECIMAL`: that combination does
   not make sense, since there's little point in creating :class:`Decimal` instances out of
   possibly truncated float literals:

   .. doctest::

      >>> loads('3.1415926535897932384626433832795028841971')
      3.141592653589793
      >>> loads('3.1415926535897932384626433832795028841971',
      ...       number_mode=NM_NATIVE)
      3.141592653589793
      >>> loads('3.1415926535897932384626433832795028841971',
      ...       number_mode=NM_NATIVE | NM_DECIMAL)
      Traceback (most recent call last):
        ...
      ValueError: ... Combining NM_NATIVE with NM_DECIMAL is not supported


   .. _loads-datetime-mode:
   .. rubric:: `datetime_mode`

   With `datetime_mode` you can enable recognition of string literals containing an `ISO
   8601`_ representation as either :class:`date`, :class:`datetime` or :class:`time`
   instances:

   .. doctest::

      >>> loads('"2016-01-02T01:02:03+01:00"')
      '2016-01-02T01:02:03+01:00'
      >>> loads('"2016-01-02T01:02:03+01:00"', datetime_mode=DM_ISO8601)
      datetime.datetime(2016, 1, 2, 1, 2, 3, tzinfo=...delta(...3600)))
      >>> loads('"2016-01-02T01:02:03-01:00"', datetime_mode=DM_ISO8601)
      datetime.datetime(2016, 1, 2, 1, 2, 3, tzinfo=...delta(...82800)))
      >>> loads('"2016-01-02"', datetime_mode=DM_ISO8601)
      datetime.date(2016, 1, 2)
      >>> loads('"01:02:03+01:00"', datetime_mode=DM_ISO8601)
      datetime.time(1, 2, 3, tzinfo=...delta(...3600)))

   It can be combined with :data:`DM_SHIFT_TO_UTC` to *always* obtain values in the UTC_
   timezone:

   .. doctest::

      >>> mode = DM_ISO8601 | DM_SHIFT_TO_UTC
      >>> loads('"2016-01-02T01:02:03+01:00"', datetime_mode=mode)
      datetime.datetime(2016, 1, 2, 0, 2, 3, tzinfo=...utc)

   .. note::

      This option is somewhat limited when the value is a non-naïve time literal: the
      underlying Python :class:`time` type cannot represent negative values, so a literal
      whose shifted value would be negative cannot be adapted and is rejected:

      .. doctest::

         >>> mode = DM_ISO8601 | DM_SHIFT_TO_UTC
         >>> loads('"00:01:02+00:00"', datetime_mode=mode)
         datetime.time(0, 1, 2, tzinfo=...utc)
         >>> loads('"00:01:02+01:00"', datetime_mode=mode)
         Traceback (most recent call last):
           ...
         ValueError: ... Time literal cannot be shifted to UTC: 00:01:02+01:00
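
      One way to read this limitation, sketched with the standard library only (it does
      not call ``rapidjson``, and the variable names are illustrative): a full
      :class:`datetime` absorbs the shift by moving to the previous day, while a bare
      :class:`time` has no date to borrow from, so the shifted value would be a negative
      number of seconds since midnight:

      .. code-block:: python

         from datetime import datetime, time, timedelta, timezone

         plus_one = timezone(timedelta(hours=1))

         # A datetime crosses midnight backwards without trouble
         stamp = datetime(2016, 1, 2, 0, 1, 2, tzinfo=plus_one)
         shifted = stamp.astimezone(timezone.utc)

         # A time carries no date and offers no astimezone() method
         literal = time(0, 1, 2, tzinfo=plus_one)
         can_shift = hasattr(literal, 'astimezone')

         # Doing the arithmetic by hand yields a negative offset from midnight
         since_midnight = timedelta(hours=literal.hour, minutes=literal.minute,
                                    seconds=literal.second)
         shifted_seconds = (since_midnight - literal.utcoffset()).total_seconds()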

   If you combine it with :data:`DM_NAIVE_IS_UTC` then all values without a timezone will
   be assumed to be relative to UTC_:

   .. doctest::

      >>> mode = DM_ISO8601 | DM_NAIVE_IS_UTC
      >>> loads('"2016-01-02T01:02:03"', datetime_mode=mode)
      datetime.datetime(2016, 1, 2, 1, 2, 3, tzinfo=...utc)
      >>> loads('"2016-01-02T01:02:03+01:00"', datetime_mode=mode)
      datetime.datetime(2016, 1, 2, 1, 2, 3, tzinfo=...delta(...3600)))
      >>> loads('"01:02:03"', datetime_mode=mode)
      datetime.time(1, 2, 3, tzinfo=...utc)

   Yet another combination is with :data:`DM_IGNORE_TZ` to ignore the timezone and obtain
   naïve values:

   .. doctest::

      >>> mode = DM_ISO8601 | DM_IGNORE_TZ
      >>> loads('"2016-01-02T01:02:03+01:00"', datetime_mode=mode)
      datetime.datetime(2016, 1, 2, 1, 2, 3)
      >>> loads('"01:02:03+01:00"', datetime_mode=mode)
      datetime.time(1, 2, 3)

   .. _no-unix-time-loads:

   The :data:`DM_UNIX_TIME` mode cannot be used here, because there isn't a reasonable
   heuristic to disambiguate between plain numbers and timestamps:

   .. doctest::

      >>> loads('[1,2,3]', datetime_mode=DM_UNIX_TIME)
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
      ValueError: Invalid datetime_mode, can deserialize only from ISO8601
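
   When the application itself knows which fields hold timestamps, the conversion can be
   done explicitly with an `object_hook`. This is only a possible workaround, not a
   feature of the library: the field names in ``TIMESTAMP_KEYS`` and the
   ``revive_timestamps()`` function are hypothetical, the values are assumed to be seconds
   since the Unix epoch, and the fallback to the standard library's ``json.loads`` is only
   there to keep the sketch runnable where ``rapidjson`` is not installed:

   .. code-block:: python

      from datetime import datetime, timezone

      try:
          from rapidjson import loads
      except ImportError:
          # Assumption: stdlib fallback, same object_hook calling convention
          from json import loads

      # Hypothetical names of the fields known to contain Unix timestamps
      TIMESTAMP_KEYS = {'created', 'updated'}

      def revive_timestamps(d):
          # Convert only the known fields, and only when they hold a real number
          for key in d.keys() & TIMESTAMP_KEYS:
              value = d[key]
              if isinstance(value, (int, float)) and not isinstance(value, bool):
                  d[key] = datetime.fromtimestamp(value, tz=timezone.utc)
          return d

      event = loads('{"id": 7, "created": 1451696523}',
                    object_hook=revive_timestamps)

   The ``id`` field stays a plain integer because the decision is driven by the key name,
   which is exactly the information the parser alone does not have.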


   .. _loads-uuid-mode:
   .. rubric:: `uuid_mode`

   With `uuid_mode` you can enable recognition of string literals containing two different
   representations of :class:`UUID` values:

   .. doctest::

      >>> loads('"aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"')
      'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa'
      >>> loads('"aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"',
      ...       uuid_mode=UM_CANONICAL)
      UUID('aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa')
      >>> loads('"aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"',
      ...       uuid_mode=UM_HEX)
      UUID('aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa')
      >>> loads('"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"',
      ...       uuid_mode=UM_CANONICAL)
      'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'
      >>> loads('"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"',
      ...       uuid_mode=UM_HEX)
      UUID('aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa')


   .. _loads-parse-mode:
   .. rubric:: `parse_mode`

   With `parse_mode` you can tell the parser to be *relaxed*, allowing either
   ``C++``/``JavaScript``-like comments (:data:`PM_COMMENTS`):

   .. doctest::

      >>> loads('"foo" // one line of explanation')
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
      rapidjson.JSONDecodeError: Parse error at offset 6: The document root must not be followed by other values.
      >>> loads('"bar" /* detailed explanation */')
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
      rapidjson.JSONDecodeError: Parse error at offset 6: The document root must not be followed by other values.
      >>> loads('"foo" // one line of explanation', parse_mode=PM_COMMENTS)
      'foo'
      >>> loads('"bar" /* detailed explanation */', parse_mode=PM_COMMENTS)
      'bar'

   or *trailing commas* at the end of arrays and objects (:data:`PM_TRAILING_COMMAS`):

   .. doctest::

      >>> loads('[1,]')
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
      rapidjson.JSONDecodeError: Parse error at offset 3: Invalid value.
      >>> loads('[1,]', parse_mode=PM_TRAILING_COMMAS)
      [1]
      >>> loads('{"one": 1,}', parse_mode=PM_TRAILING_COMMAS)
      {'one': 1}

   or both:

   .. doctest::

      >>> loads('[1, /* 2, */ 3,]')
      Traceback (most recent call last):
        ...
      rapidjson.JSONDecodeError: Parse error at offset 4: Invalid value.
      >>> loads('[1, /* 2, */ 3,]', parse_mode=PM_COMMENTS | PM_TRAILING_COMMAS)
      [1, 3]

.. _ISO 8601: https://en.wikipedia.org/wiki/ISO_8601
.. _RapidJSON: http://rapidjson.org/
.. _UTC: https://en.wikipedia.org/wiki/Coordinated_Universal_Time
.. _Unix time: https://en.wikipedia.org/wiki/Unix_time