# Advanced Usage

## Filter Variables

Arbitrary variables can be made available to [filter selectors](syntax.md#filter-selector) using the `filter_context` argument to [`findall()`](quickstart.md#findallpath-data) and [`finditer()`](quickstart.md#finditerpath-data). `filter_context` should be a [mapping](https://docs.python.org/3/library/typing.html#typing.Mapping) of strings to JSON-like objects, like lists, dictionaries, strings and integers.

Filter context variables are selected using a filter query starting with the _filter context identifier_, which defaults to `_` and has usage similar to `$` and `@`.

```python
import jsonpath

data = {
    "users": [
        {
            "name": "Sue",
            "score": 100,
        },
        {
            "name": "John",
            "score": 86,
        },
        {
            "name": "Sally",
            "score": 84,
        },
        {
            "name": "Jane",
            "score": 55,
        },
    ]
}

user_names = jsonpath.findall(
    "$.users[?@.score < _.limit].name",
    data,
    filter_context={"limit": 100},
)
```
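For intuition, the filter above selects roughly what the following plain-Python comprehension selects. This is only a sketch of the filter's logic, not how the library evaluates queries, and it deliberately avoids importing `jsonpath`. The name `filter_context` here is just a local variable standing in for the keyword argument.

```python
data = {
    "users": [
        {"name": "Sue", "score": 100},
        {"name": "John", "score": 86},
        {"name": "Sally", "score": 84},
        {"name": "Jane", "score": 55},
    ]
}

filter_context = {"limit": 100}

# Mirrors `$.users[?@.score < _.limit].name`:
# `@` is the current user, `_` is the filter context mapping.
user_names = [
    user["name"]
    for user in data["users"]
    if user["score"] < filter_context["limit"]
]

print(user_names)  # ['John', 'Sally', 'Jane']
```

Sue is excluded because the comparison is strictly less than the limit.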

## Function Extensions

Add, remove or replace [filter functions](functions.md) by updating the [`function_extensions`](api.md#jsonpath.JSONPathEnvironment.function_extensions) attribute of a [`JSONPathEnvironment`](api.md#jsonpath.JSONPathEnvironment). It is a regular Python dictionary mapping filter function names to any [callable](https://docs.python.org/3/library/typing.html#typing.Callable), like a function or class with a `__call__` method.

### Type System for Function Expressions

[Section 2.4.1](https://datatracker.ietf.org/doc/html/rfc9535#name-type-system-for-function-ex) of RFC 9535 defines a type system for function expressions and requires that we check that filter expressions are well-typed. With that in mind, you are encouraged to implement custom filter functions by extending [`jsonpath.function_extensions.FilterFunction`](api.md#jsonpath.function_extensions.FilterFunction), which forces you to be explicit about the [types](api.md#jsonpath.function_extensions.ExpressionType) of arguments the function extension accepts and the type of its return value.

!!! info

    [`FilterFunction`](api.md#jsonpath.function_extensions.FilterFunction) was new in Python JSONPath version 0.10.0. Prior to that we did not enforce function expression well-typedness. To use any arbitrary [callable](https://docs.python.org/3/library/typing.html#typing.Callable) as a function extension - or if you don't want built-in filter functions to raise a `JSONPathTypeError` for function expressions that are not well-typed - set [`well_typed`](api.md#jsonpath.JSONPathEnvironment.well_typed) to `False` when constructing a [`JSONPathEnvironment`](api.md#jsonpath.JSONPathEnvironment).

### Example

As an example, we'll add a `min()` filter function, which will return the minimum of a sequence of values. If any of the values are not comparable, we'll return the special `undefined` value instead.

```python
from typing import Iterable

import jsonpath
from jsonpath.function_extensions import ExpressionType
from jsonpath.function_extensions import FilterFunction


class MinFilterFunction(FilterFunction):
    """A JSONPath function extension returning the minimum of a sequence."""

    arg_types = [ExpressionType.VALUE]
    return_type = ExpressionType.VALUE

    def __call__(self, value: object) -> object:
        if not isinstance(value, Iterable):
            return jsonpath.UNDEFINED

        try:
            return min(value)
        except TypeError:
            return jsonpath.UNDEFINED


env = jsonpath.JSONPathEnvironment()
env.function_extensions["min"] = MinFilterFunction()

example_data = {"foo": [{"bar": [4, 5]}, {"bar": [1, 5]}]}
print(env.findall("$.foo[?min(@.bar) > 1]", example_data))
```

Now, when we use `env.findall()`, `env.finditer()` or `env.compile()`, our `min` function will be available for use in filter expressions.

```text
$..products[?@.price == min($..products.price)]
```
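The guard logic in `__call__` can be explored on its own. The sketch below is a standalone variant that uses a hypothetical `UNDEFINED` sentinel in place of `jsonpath.UNDEFINED`, so it runs without the library. It also catches `ValueError`, which Python's built-in `min()` raises for an empty iterable; the example above does not handle that case, so treat the extra `except` clause as a suggestion rather than the library's behavior.

```python
from collections.abc import Iterable

# Hypothetical stand-in for jsonpath.UNDEFINED, used only in this sketch.
UNDEFINED = object()


def safe_min(value: object) -> object:
    """Return the minimum of an iterable, or UNDEFINED if there isn't one."""
    if not isinstance(value, Iterable):
        return UNDEFINED

    try:
        return min(value)
    except (TypeError, ValueError):
        # TypeError: the values are not mutually comparable.
        # ValueError: the iterable is empty.
        return UNDEFINED


print(safe_min([4, 5]))                 # 4
print(safe_min([1, "a"]) is UNDEFINED)  # True
print(safe_min([]) is UNDEFINED)        # True
print(safe_min(42) is UNDEFINED)        # True
```

Note that strings and dictionaries are iterable too, so `safe_min("abc")` returns `"a"`. Add an explicit type check if that is not what you want.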

### Built-in Functions

The [built-in functions](functions.md) can be removed from a `JSONPathEnvironment` by deleting the entry from `function_extensions`.

```python
import jsonpath

env = jsonpath.JSONPathEnvironment()
del env.function_extensions["keys"]
```

Or aliased with an additional entry.

```python
import jsonpath

env = jsonpath.JSONPathEnvironment()
env.function_extensions["properties"] = env.function_extensions["keys"]
```

Alternatively, you could subclass `JSONPathEnvironment` and override the `setup_function_extensions` method.

```python
from typing import Iterable
import jsonpath

class MyEnv(jsonpath.JSONPathEnvironment):
    def setup_function_extensions(self) -> None:
        super().setup_function_extensions()
        self.function_extensions["properties"] = self.function_extensions["keys"]
        self.function_extensions["min"] = min_filter


def min_filter(obj: object) -> object:
    if not isinstance(obj, Iterable):
        return jsonpath.UNDEFINED

    try:
        return min(obj)
    except TypeError:
        return jsonpath.UNDEFINED

env = MyEnv()
```

### Compile Time Validation

Calls to [type-aware](#type-system-for-function-expressions) function extensions are validated automatically at JSONPath compile time. If [`well_typed`](api.md#jsonpath.JSONPathEnvironment.well_typed) is set to `False`, or a custom function extension does not inherit from [`FilterFunction`](api.md#jsonpath.function_extensions.FilterFunction), its arguments can be validated by implementing the function as a class with a `__call__` method and a `validate` method. `validate` will be called after parsing the function, giving you the opportunity to inspect its arguments and raise a `JSONPathTypeError` should any arguments be unacceptable. If defined, `validate` must take a reference to the current environment, an argument list and the token pointing to the start of the function call.

```python
def validate(
    self,
    env: JSONPathEnvironment,
    args: List[FilterExpression],
    token: Token,
) -> List[FilterExpression]:
    ...  # Inspect `args`, raise JSONPathTypeError if needed, then return them.
```

It should return an argument list, either the same as the input argument list, or a modified version of it. See the implementation of the built-in [`match` function](https://github.com/jg-rp/python-jsonpath/blob/main/jsonpath/function_extensions/match.py) for an example.

## Custom Environments

Python JSONPath can be customized by subclassing [`JSONPathEnvironment`](api.md#jsonpath.JSONPathEnvironment) and overriding class attributes and/or methods, then using the `findall()`, `finditer()` and `compile()` methods of an instance of that subclass.

### Identifier Tokens

The default identifier tokens, like `$` and `@`, can be changed by setting attributes on a `JSONPathEnvironment`. This example sets the root token (default `$`) to be `^`.

```python
from jsonpath import JSONPathEnvironment

class MyJSONPathEnvironment(JSONPathEnvironment):
    root_token = "^"


data = {
    "users": [
        {"name": "Sue", "score": 100},
        {"name": "John", "score": 86},
        {"name": "Sally", "score": 84},
        {"name": "Jane", "score": 55},
    ],
    "limit": 100,
}

env = MyJSONPathEnvironment()
user_names = env.findall(
    "^.users[?@.score < ^.limit].name",
    data,
)
```

This table shows all available identifier token attributes.

| attribute            | default |
| -------------------- | ------- |
| filter_context_token | `_`     |
| keys_token           | `#`     |
| root_token           | `$`     |
| self_token           | `@`     |

### Logical Operator Tokens

By default, we accept both Python and C-style logical operators in filter expressions. That is, `not` and `!` are equivalent, `and` and `&&` are equivalent, and `or` and `||` are equivalent. You can change this by setting class attributes on a [`Lexer`](custom_api.md#jsonpath.lex.Lexer) subclass and assigning that subclass to the `lexer_class` attribute of a `JSONPathEnvironment`.

This example changes all three logical operators to strictly match the JSONPath spec.

```python
from jsonpath import JSONPathEnvironment
from jsonpath import Lexer

class MyLexer(Lexer):
    logical_not_pattern = r"!"
    logical_and_pattern = r"&&"
    logical_or_pattern = r"\|\|"

class MyJSONPathEnvironment(JSONPathEnvironment):
    lexer_class = MyLexer

env = MyJSONPathEnvironment()
env.compile("$.foo[?@.a > 0 && @.b < 100]")  # OK
env.compile("$.foo[?@.a > 0 and @.b < 100]")  # JSONPathSyntaxError
```
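The `*_pattern` values in the example are written as raw strings with `|` escaped, which suggests they are treated as regular expression source. Assuming that is the case, the following standard-library check shows why the escaping matters: an unescaped `||` is an alternation of empty patterns and never matches the literal operator.

```python
import re

escaped = re.compile(r"\|\|")   # matches the two literal characters "||"
unescaped = re.compile(r"||")   # alternation of three empty patterns

print(bool(escaped.fullmatch("||")))    # True
print(bool(unescaped.fullmatch("||")))  # False
print(bool(unescaped.fullmatch("")))    # True
```

The same caution applies to any other token containing regex metacharacters.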

### Keys Selector

The non-standard keys selector is used to retrieve the keys/properties from a JSON Object or Python mapping. It defaults to `~` and can be changed using the `keys_selector_token` attribute on a [`JSONPathEnvironment`](./api.md#jsonpath.JSONPathEnvironment) subclass.

This example changes the keys selector to `*~`.

```python
from jsonpath import JSONPathEnvironment

class MyJSONPathEnvironment(JSONPathEnvironment):
    keys_selector_token = "*~"

data = {
    "users": [
        {"name": "Sue", "score": 100},
        {"name": "John", "score": 86},
        {"name": "Sally", "score": 84},
        {"name": "Jane", "score": 55},
    ],
    "limit": 100,
}

env = MyJSONPathEnvironment()
print(env.findall("$.users[0].*~", data))  # ['name', 'score']
```

### Array Index Limits

Python JSONPath limits the minimum and maximum JSON array or Python sequence indices (including slice steps) allowed in a JSONPath query. The default minimum allowed index is set to `-(2**53) + 1`, and the maximum to `(2**53) - 1`. When a limit is reached, a `JSONPathIndexError` is raised.

You can change the minimum and maximum allowed indices using the `min_int_index` and `max_int_index` attributes on a [`JSONPathEnvironment`](./api.md#jsonpath.JSONPathEnvironment) subclass.

```python
from jsonpath import JSONPathEnvironment

class MyJSONPathEnvironment(JSONPathEnvironment):
    min_int_index = -100
    max_int_index = 100

env = MyJSONPathEnvironment()
query = env.compile("$.users[999]")
# jsonpath.exceptions.JSONPathIndexError: index out of range, line 1, column 8
```
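The default limits correspond to the range of integers that an IEEE 754 double can represent exactly, which is the interoperable integer range RFC 9535 requires for indices. The arithmetic can be checked with plain Python; the constant names below are local to this sketch and are not library attributes.

```python
# Local names for this sketch only.
DEFAULT_MAX_INDEX = (2**53) - 1
DEFAULT_MIN_INDEX = -(2**53) + 1

print(DEFAULT_MAX_INDEX)  # 9007199254740991
print(DEFAULT_MIN_INDEX)  # -9007199254740991

# Beyond 2**53, neighbouring integers collapse to the same float.
print(float(2**53) == float(2**53 + 1))  # True

# Inside the range, neighbouring integers remain distinct as floats.
print(float(DEFAULT_MAX_INDEX) == float(DEFAULT_MAX_INDEX - 1))  # False
```

Tightening the limits, as in the subclass above, can be a cheap way to reject implausibly large indices in untrusted queries.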