!!! note
    This section is part of the *internals* documentation, and is partly targeted to contributors.

Pydantic heavily relies on [type hints][type hint] at runtime to build schemas for validation, serialization, etc.
While type hints were primarily introduced for static type checkers (such as [Mypy] or [Pyright]), they are
accessible (and sometimes evaluated) at runtime. This means that the following would fail at runtime,
because `Node` has yet to be defined in the current module:
```python {test="skip" lint="skip"}
class Node:
    """Binary tree node."""

    # NameError: name 'Node' is not defined:
    def __init__(self, l: Node, r: Node) -> None:
        self.left = l
        self.right = r
```
To circumvent this issue, forward references can be used (by wrapping the annotation in quotes).
In Python 3.7, [PEP 563] introduced the concept of *postponed evaluation of annotations*, meaning
with the `from __future__ import annotations` [future statement], type hints are stringified by default:
```python {requires="3.12" lint="skip"}
from __future__ import annotations

from pydantic import BaseModel


class Foo(BaseModel):
    f: MyType
    # Given the future import above, this is equivalent to:
    # f: 'MyType'


type MyType = int

print(Foo.__annotations__)
#> {'f': 'MyType'}
```
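For comparison, here is a minimal sketch of the explicit quoted form applied to the `Node` class from the first example. It only uses the standard library; the `hints` variable is ours, and `typing.get_type_hints()` is shown merely as one way of resolving the strings once `Node` exists. The namespace is passed explicitly so that the sketch does not depend on where it is run:

```python
import typing


class Node:
    """Binary tree node."""

    # The quoted annotations are stored as plain strings and are not
    # evaluated when the function is defined, so no NameError is raised:
    def __init__(self, l: 'Node', r: 'Node') -> None:
        self.left = l
        self.right = r


print(Node.__init__.__annotations__['l'])
#> Node

# Once the class exists, the strings can be resolved on demand:
hints = typing.get_type_hints(Node.__init__, localns={'Node': Node})
print(hints['l'] is Node)
#> True
```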
## The challenges of runtime evaluation
Static type checkers make use of the <abbr title="Abstract Syntax Tree">AST</abbr> to analyze the defined annotations.
Regarding the previous example, this has the benefit of being able to understand what `MyType` refers to when analyzing
the class definition of `Foo`, even if `MyType` isn't yet defined at runtime.
However, for runtime tools such as Pydantic, it is more challenging to correctly resolve these forward annotations.
The Python standard library provides some tools to do so ([`typing.get_type_hints()`][typing.get_type_hints],
[`inspect.get_annotations()`][inspect.get_annotations]), but they come with some limitations. Thus, they are
being re-implemented in Pydantic with improved support for edge cases.
As Pydantic has grown, it has adapted to support many edge cases requiring irregular patterns for annotation evaluation.
Some of these use cases aren't necessarily sound from a static type checking perspective. In v2.10, the internal
logic was refactored in an attempt to simplify and standardize annotation evaluation. Admittedly, backwards compatibility
posed some challenges, and there is still some noticeable scar tissue in the codebase because of this. There's a hope that
[PEP 649] (introduced in Python 3.14) will greatly simplify the process, especially when it comes to dealing with locals
of a function.
To evaluate forward references, Pydantic roughly follows the same logic as described in the documentation of the
[`typing.get_type_hints()`][typing.get_type_hints] function. That is, the built-in [`eval()`][eval] function is used
by passing the forward reference, a global, and a local namespace. The namespace fetching logic is defined in the
sections below.
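The basic mechanism can be sketched with plain `eval()` calls. The two dictionaries below are hypothetical stand-ins for a module's globals and for a local namespace; they are not how Pydantic builds its namespaces:

```python
# Hypothetical namespaces standing in for a module's globals and for the
# locals of the scope where a class is defined.
global_ns = {'MyType': int, 'Other': float}
local_ns = {'MyType': str}

# Names are looked up in the locals first, then in the globals:
print(eval('MyType', global_ns, local_ns))
#> <class 'str'>
print(eval('Other', global_ns, local_ns))
#> <class 'float'>

# Any expression is accepted, not only bare names:
print(eval('list[MyType]', global_ns, local_ns))
#> list[str]

# A name found in neither namespace raises a NameError:
try:
    eval('UnknownType', global_ns, local_ns)
except NameError as exc:
    print(type(exc).__name__)
    #> NameError
```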
## Resolving annotations at class definition
The following example will be used as a reference throughout this section:
```python {test="skip" lint="skip"}
# module1.py:
type MyType = int


class Base:
    f1: 'MyType'


# module2.py:
from pydantic import BaseModel

from module1 import Base

type MyType = str


def inner() -> None:
    type InnerType = bool

    class Model(BaseModel, Base):
        type LocalType = bytes

        f2: 'MyType'
        f3: 'InnerType'
        f4: 'LocalType'
        f5: 'UnknownType'

    type InnerType2 = complex
```
When the `Model` class is being built, different [namespaces][namespace] are at play. For each base class
of the `Model`'s [MRO][method resolution order] (in reverse order — that is, starting with `Base`), the
following logic is applied:
1. Fetch the `__annotations__` key from the current base class' `__dict__`, if present. For `Base`, this will be
   `{'f1': 'MyType'}`.
2. Iterate over the `__annotations__` items and try to evaluate the annotation [^1] using a custom wrapper around
   the built-in [`eval()`][eval] function. This function takes two arguments, `globals` and `locals`:
    * The `__dict__` of the module in which the current base class is defined is naturally used as `globals`.
      For `Base`, this will be `sys.modules['module1'].__dict__`.
    * For the `locals` argument, Pydantic will try to resolve symbols in the following namespaces, sorted by highest priority:
        * A namespace created on the fly, containing the current class name (`{cls.__name__: cls}`). This is done
          in order to support recursive references.
        * The locals of the current class (i.e. `cls.__dict__`). For `Model`, this will include `LocalType`.
        * The parent namespace of the class, if different from the globals described above. This is the
          [locals][frame.f_locals] of the frame where the class is being defined. For `Base`, because the class is being
          defined in the module directly, this namespace won't be used as it will result in the globals being used again.
          For `Model`, the parent namespace is the locals of the frame of `inner()`.
3. If the annotation failed to evaluate, it is kept as is, so that the model can be rebuilt at a later stage. This will
   be the case for `f5`.
The following table lists the resolved type annotations for every field, once the `Model` class has been created:
| Field name | Resolved annotation |
|------------|---------------------|
| `f1` | [`int`][] |
| `f2` | [`str`][] |
| `f3` | [`bool`][] |
| `f4` | [`bytes`][] |
| `f5` | `'UnknownType'` |
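The lookup order described above can be approximated with a `collections.ChainMap` passed to `eval()` as the locals. This is only a sketch under simplifying assumptions: `evaluate_annotation` is a hypothetical helper (not Pydantic's wrapper), the namespaces are hand-written dictionaries, and plain assignments replace the `type` statements of the reference example:

```python
from collections import ChainMap


def evaluate_annotation(annotation, cls, module_globals, parent_namespace=None):
    """Hypothetical helper mimicking the lookup order described above."""
    local_ns = ChainMap(
        {cls.__name__: cls},     # highest priority: recursive references
        dict(vars(cls)),         # locals of the class body
        parent_namespace or {},  # locals of the enclosing function, if any
    )
    try:
        return eval(annotation, module_globals, local_ns)
    except NameError:
        # Kept as is, so that evaluation can be retried later.
        return annotation


class Base:
    pass


class Model(Base):
    LocalType = bytes


module1_globals = {'MyType': int}
module2_globals = {'MyType': str}
inner_locals = {'InnerType': bool}

# `f1` is declared on `Base`, so it is evaluated against the first module:
resolved = {'f1': evaluate_annotation('MyType', Base, module1_globals)}
for name, annotation in [
    ('f2', 'MyType'),
    ('f3', 'InnerType'),
    ('f4', 'LocalType'),
    ('f5', 'UnknownType'),
]:
    resolved[name] = evaluate_annotation(annotation, Model, module2_globals, inner_locals)

print(resolved)
#> {'f1': <class 'int'>, 'f2': <class 'str'>, 'f3': <class 'bool'>, 'f4': <class 'bytes'>, 'f5': 'UnknownType'}
```

The result matches the table: each field is resolved against the globals of the module that declares it, and the unresolved `f5` is left as a string.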
### Limitations and backwards compatibility concerns
While the namespace fetching logic is trying to be as accurate as possible, we still face some limitations:
<div class="annotate" markdown>
* The locals of the current class (`cls.__dict__`) may include irrelevant entries, most of them being dunder attributes.
  This means that the following annotation: `f: '__doc__'` will successfully (and unexpectedly) be resolved.
* When the `Model` class is being created inside a function, we keep a copy of the [locals][frame.f_locals] of the frame.
  This copy only includes the symbols defined in the locals when `Model` is being defined, meaning `InnerType2` won't be included
  (and will **not** be picked up by a later model rebuild either!).
* To avoid memory leaks, we use [weak references][weakref] to the locals of the function, meaning some forward references might
  not resolve outside the function (1).
* Locals of the function are only taken into account for Pydantic models; this pattern does not apply to dataclasses, typed
  dictionaries or named tuples.
</div>
1. Here is an example:

    ```python {test="skip" lint="skip"}
    def func():
        A = int

        class Model(BaseModel):
            f: 'A | Forward'

        return Model


    Model = func()

    Model.model_rebuild(_types_namespace={'Forward': str})
    # pydantic.errors.PydanticUndefinedAnnotation: name 'A' is not defined
    ```
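The first limitation listed above (dunder entries leaking from the class locals) can be reproduced with the standard library alone. This is a sketch using a plain class and a direct `eval()` call, not Pydantic's own evaluation code:

```python
class Model:
    """A hypothetical model."""


# `vars(Model)` holds dunder entries such as `__doc__` and `__module__`:
class_locals = dict(vars(Model))

# Using the class locals as the `locals` argument makes them resolvable:
resolved = eval('__doc__', {}, class_locals)
print(resolved)
#> A hypothetical model.
```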
For backwards compatibility reasons, and to be able to support valid use cases without having to rebuild models,
the namespace logic described above is a bit different when it comes to core schema generation.
Taking the following example:
{#backwards-compatibility-logic}
```python
from dataclasses import dataclass

from pydantic import BaseModel


@dataclass
class Foo:
    a: 'Bar | None' = None


class Bar(BaseModel):
    b: Foo
```
Once the fields for `Bar` have been collected (meaning annotations resolved), the `GenerateSchema` class converts
every field into a core schema. When it encounters another class-like field type (such as a dataclass), it will
try to evaluate annotations, following roughly the same logic as [described above](#resolving-annotations-at-class-definition).
However, to evaluate the `'Bar | None'` annotation, `Bar` needs to be present in the globals or locals, which is normally
*not* the case: `Bar` is being created, so it is not "assigned" to the current module's `__dict__` at that point.
To avoid having to call [`model_rebuild()`][pydantic.BaseModel.model_rebuild] on `Bar`, both the parent namespace
(if `Bar` were defined inside a function, together with [the namespace provided during a model rebuild](#model-rebuild-semantics))
and the `{Bar.__name__: Bar}` namespace are included in the locals when evaluating the annotations of `Foo`
(with the lowest priority) (1).
{ .annotate }
1. This backwards compatibility logic can introduce some inconsistencies, such as the following:

    ```python {lint="skip"}
    from dataclasses import dataclass

    from pydantic import BaseModel


    @dataclass
    class Foo:
        # `a` and `b` shouldn't resolve:
        a: 'Model'
        b: 'Inner'


    def func():
        Inner = int

        class Model(BaseModel):
            foo: Foo

        Model.__pydantic_complete__
        #> True, should be False.
    ```
## Resolving annotations when rebuilding a model
When a forward reference fails to evaluate, Pydantic will silently fail and stop the core schema
generation process. This can be seen by inspecting the `__pydantic_core_schema__` of a model class:
```python {lint="skip"}
from pydantic import BaseModel


class Foo(BaseModel):
    f: 'MyType'


Foo.__pydantic_core_schema__
#> <pydantic._internal._mock_val_ser.MockCoreSchema object at 0x73cd0d9e6d00>
```
If you then properly define `MyType`, you can rebuild the model:
```python {test="skip" lint="skip"}
type MyType = int
Foo.model_rebuild()
Foo.__pydantic_core_schema__
#> {'type': 'model', 'schema': {...}, ...}
```
The [`model_rebuild()`][pydantic.BaseModel.model_rebuild] method uses a *rebuild namespace*, with the following semantics:
{#model-rebuild-semantics}
* If an explicit `_types_namespace` argument is provided, it is used as the rebuild namespace.
* If no namespace is provided, the namespace where the method is called will be used as the rebuild namespace.
This *rebuild namespace* will be merged with the model's parent namespace (if it was defined in a function) and used as is
(see the [backwards compatibility logic](#backwards-compatibility-logic) described above).
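The semantics above can be sketched as follows. `get_rebuild_namespace` is a hypothetical helper, not part of Pydantic's API; it relies on the CPython-specific `sys._getframe()`, and the precedence chosen when both namespaces define the same name is an assumption made for this sketch:

```python
import sys


def get_rebuild_namespace(types_namespace=None, parent_namespace=None):
    """Hypothetical helper; not part of Pydantic's API."""
    if types_namespace is not None:
        # An explicit namespace is used as the rebuild namespace.
        rebuild_ns = dict(types_namespace)
    else:
        # Otherwise, use the locals of the frame that called this helper.
        rebuild_ns = dict(sys._getframe(1).f_locals)
    # Assumption: on name clashes, the parent namespace wins.
    return {**rebuild_ns, **(parent_namespace or {})}


def caller():
    Forward = str
    return get_rebuild_namespace(parent_namespace={'A': int})


implicit = caller()
print(implicit)
#> {'Forward': <class 'str'>, 'A': <class 'int'>}

explicit = get_rebuild_namespace({'Forward': bytes}, parent_namespace={'A': int})
print(explicit)
#> {'Forward': <class 'bytes'>, 'A': <class 'int'>}
```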
[Mypy]: https://www.mypy-lang.org/
[Pyright]: https://github.com/microsoft/pyright/
[PEP 563]: https://peps.python.org/pep-0563/
[PEP 649]: https://peps.python.org/pep-0649/
[future statement]: https://docs.python.org/3/reference/simple_stmts.html#future
[^1]: This is done unconditionally, as forward annotations may appear only *as part* of a type hint (e.g. `Optional['int']`), as dictated by
    the [typing specification](https://typing.readthedocs.io/en/latest/spec/annotations.html#string-annotations).