File: KeyPaths.md

# Key Path Memory Layout

**Key path objects** are laid out at runtime as a heap object with a
variable-sized payload containing a sequence of encoded components describing
how the key path traverses a value. When the compiler sees a key path literal,
it generates a **key path pattern** that can be efficiently interpreted by
the runtime to instantiate a key path object when needed. This document
describes the layout of both. The key path pattern layout is designed in such a
way that it can be transformed in-place into a key path object with a one-time
initialization in the common case where the entire path is fully specialized
and crosses no resilience boundaries.
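
For orientation, here is a minimal source-level sketch of the key path literals whose runtime representation is described in this document. The types `Leaf`, `Node`, and `Root` are hypothetical and exist only for illustration; the sketch says nothing about how the pattern or object is encoded.

```swift
// Hypothetical types, used only for illustration.
struct Leaf {
  var value: Int
  var doubled: Int { value * 2 }   // get-only computed property
}

final class Node {
  var leaf = Leaf(value: 1)
}

struct Root {
  var node = Node()
}

// Each literal is backed by a key path object that the runtime
// instantiates when needed.
let readOnly = \Leaf.doubled               // KeyPath<Leaf, Int>
let writable = \Leaf.value                 // WritableKeyPath<Leaf, Int>
let throughClass = \Root.node.leaf.value   // ReferenceWritableKeyPath<Root, Int>

var leaf = Leaf(value: 3)
leaf[keyPath: writable] = 4
print(leaf[keyPath: readOnly])             // prints "8"

// Writing through a class reference does not require a mutable root.
let root = Root()
root[keyPath: throughClass] = 5
print(root.node.leaf.value)                // prints "5"
```

The third literal passes through a class instance, which is the situation in which a reference prefix can arise.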

## ABI Concerns For Key Paths

For completeness, this document describes the layout of both key path objects
and patterns; note however that the instantiated runtime layout of key path
objects is an implementation detail of the Swift runtime, and *only key path
patterns* are strictly ABI, since they are emitted by the compiler. The
runtime has the freedom to change the runtime layout of key path objects, but
will have to maintain the ability to instantiate from key path patterns emitted
by previous ABI-stable versions of the Swift compiler.

## Key Path Objects

### Buffer Header

Key path objects begin with the standard Swift heap object header, followed by a
key path object header. Relative to the end of the heap object header:

Offset  | Description
------- | ----------------------------------------------
`0`       | Pointer to KVC compatibility C string, or null
`1*sizeof(Int)` | Key path buffer header (32 bits)

If the key path is Cocoa KVC-compatible, the first word will be a pointer to
the equivalent KVC string as a null-terminated UTF-8 C string. It will be null
otherwise.  The **key path buffer header** in the second word contains the
following bit fields:

Bits (LSB zero) | Description
--------------- | -----------
0...23          | **Buffer size** in bytes
24...29         | Reserved. Must be zero in Swift 4...5 runtime
30              | 1 = Has **reference prefix**, 0 = No reference prefix
31              | 1 = Is **trivial**, 0 = Has destructor

The *buffer size* indicates the total size in bytes of the components following
the key path buffer header. A `ReferenceWritableKeyPath` may have a *reference
prefix* of read-only components that can be projected before initiating
mutation; bit 30 is set if one is present. A key path may capture values that
require cleanup when the key path object is deallocated, but a key path that
does not capture any values with cleanups will have the *trivial* bit 31 set to
fast-path deallocation.

Components are always pointer-aligned, so the first component always starts at
offset `2*sizeof(Int)`. On 64-bit platforms, this leaves four bytes of padding.
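
The bit fields above can be decoded with a few masks. The following is a minimal sketch, not runtime code: the `BufferHeader` type and its member names are assumptions made for illustration, while the bit positions are taken from the table above.

```swift
// Hypothetical wrapper around the 32-bit key path buffer header.
struct BufferHeader {
  var rawValue: UInt32

  /// Bits 0...23: size in bytes of the components following the header.
  var bufferSize: Int { Int(rawValue & 0x00FF_FFFF) }

  /// Bits 24...29 are reserved and expected to be zero.
  var reservedBitsAreZero: Bool { rawValue & 0x3F00_0000 == 0 }

  /// Bit 30: set when the key path has a reference prefix.
  var hasReferencePrefix: Bool { rawValue & (1 << 30) != 0 }

  /// Bit 31: set when no captured value needs cleanup on deallocation.
  var isTrivial: Bool { rawValue & (1 << 31) != 0 }
}

let header = BufferHeader(rawValue: 0xC000_0028)
print(header.bufferSize, header.hasReferencePrefix, header.isTrivial)
// prints "40 true true"

// The first component starts two words past the end of the heap object header.
let firstComponentOffset = 2 * MemoryLayout<Int>.size
```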

### Components

After the buffer header, one or more **key path components** appear in memory
in sequence. Each component begins with a 32-bit **key path component header**
describing the following component.

Bits (LSB zero) | Description
--------------- | -----------
0...23          | **Payload** (meaning is dependent on component kind)
24...30         | **Component kind**
31              | 1 = **End of reference prefix**, 0 = Not end of reference prefix

If the key path has a *reference prefix*, then exactly one component must have
the *end of reference prefix* bit set in its component header. This indicates
that the component after the end of the reference prefix will initiate mutation.
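
As a sketch of how the component header splits into its three fields, the hypothetical `ComponentHeader` type below applies the masks implied by the table above. The type, its member names, and the helper `referencePrefixMarkers` are illustrative assumptions, not part of the runtime.

```swift
// Hypothetical wrapper around the 32-bit key path component header.
struct ComponentHeader {
  var rawValue: UInt32

  /// Bits 0...23: payload, interpreted according to the component kind.
  var payload: UInt32 { rawValue & 0x00FF_FFFF }

  /// Bits 24...30: component kind.
  var kind: UInt32 { (rawValue >> 24) & 0x7F }

  /// Bit 31: set on the component that ends the reference prefix.
  var endOfReferencePrefix: Bool { rawValue & 0x8000_0000 != 0 }
}

/// Counts the components marked as the end of the reference prefix.
/// A key path with a reference prefix is expected to have exactly one.
func referencePrefixMarkers(in headers: [ComponentHeader]) -> Int {
  headers.filter(\.endOfReferencePrefix).count
}

let first = ComponentHeader(rawValue: 0x8000_0080)
let last = ComponentHeader(rawValue: 0x0000_0180)
print(first.payload, first.kind, first.endOfReferencePrefix)
// prints "128 0 true"
print(referencePrefixMarkers(in: [first, last]))
// prints "1"
```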

The following *component kinds* are recognized:

Value in bits 24...30 | Description
--------------------- | -----------
0                     | Struct/tuple/self stored property
1                     | Computed
2                     | Class stored property
3                     | Optional chaining/forcing/wrapping

- A **struct stored property** component, when given
  a value of the base type in memory, can project the component value in-place
  at a fixed offset within the base value. This applies for struct stored
  properties, tuple fields, and the `.self` identity component (which trivially
  projects at offset zero). The
  *payload* contains the offset in bytes of the projected field in the
  aggregate, or the special value `0xFF_FFFF`, which indicates that the
  offset is too large to pack into the payload and is stored in the next 32 bits
  after the header.
- A **class stored property** component, when given a reference to a class
  instance, can project the component value inside the class instance at
  a fixed offset. The
  *payload* contains the offset in bytes of the projected field from the
  address point of the object, or the special value `0xFF_FFFF`, which
  indicates that the offset is too large to pack into the payload and is stored
  in the next 32 bits after the header.
- An **optional** component performs an operation involving `Optional` values.
  The *payload* contains one of the following values:

    Value in payload | Description
    ---------------- | -----------
    0                | **Optional chaining**
    1                | **Optional wrapping**
    2                | **Optional force-unwrapping**

    A *chaining* component behaves like the postfix `?` operator, immediately
    ending the key path application and returning nil when the base value is nil,
    or unwrapping the base value and continuing projection on the non-optional
    payload when non-nil. If an optional chain ends in a non-optional value,
    an implicit *wrapping* component is inserted to wrap it up in an
    optional value. A *force-unwrapping* component behaves like the postfix
    `!` operator, trapping if the base value is nil, or unwrapping the value
    inside the optional if not.

- A **computed** component uses the conservative access pattern of
  `get`/`set`/`materializeForSet` to project from the base value. This is
  used as a general fallback component for any key path component without a
  more specialized representation, including not only computed properties but
  also subscripts, stored properties that require reabstraction, properties
  with behaviors or custom key path components (when we get those), and weak or
  unowned properties. The payload contains additional bitfields describing the
  component:

    Bits (LSB zero) | Description
    --------------- | -----------
    24              | 1 = **Has captured arguments**, 0 = no captures
    25...26         | **Identifier kind**
    27              | 1 = **Settable**, 0 = **Get-Only**
    28              | 1 = **Mutating** (implies settable), 0 = Nonmutating

    The component can *capture* context which is stored after the component in
    the key path object, such as generic arguments from its original context,
    subscript index arguments, and so on. Bit 24 is set if there are any such
    captures. Bits 25 and 26 discriminate the *identifier* which is used to
    determine equality of key paths referring to the same components. If
    bit 27 is set, then the component is **settable** and can be written through,
    and bit 28 indicates whether the set operation **is mutating** to the base
    value, that is, whether setting through the component changes the base value
    like a value-semantics property or modifies state indirectly like a class
    property or `UnsafePointer.pointee`.

    After the header, the component contains the following word-aligned fields:

    Offset from header | Description
    ------------------ | -----------
    `1*sizeof(Int)`    | The **identifier** of the component.
    `2*sizeof(Int)`    | The **getter function** for the component.
    `3*sizeof(Int)`    | (if settable) The **setter function** for the component

    The combination of the identifier kind bits and the identifier word are
    compared by the `==` operation on two key paths to determine whether they
    are equivalent. Neither the kind bits nor the identifier word
    have any stable semantic meaning other than as unique identifiers.
    In practice, the compiler picks a stable unique artifact of the
    underlying declaration, such as the naturally-abstracted getter entry point
    for a computed property, the offset of a reabstracted stored property, or
    an Objective-C selector for an imported ObjC property, to identify the
    component. The identifier kind bits are used to discriminate
    possibly-overlapping domains.

    The getter function is a pointer to a Swift function with the signature
    `@convention(thin) (@in Base, UnsafeRawPointer) -> @out Value`. When
    the component is applied, the getter is invoked with a copy of the base
    value and is passed a pointer to the captured arguments of the
    component. If the component has no captures, the second argument is
    undefined.

    The setter function is also a pointer to a Swift function. This field is
    only present if the *settable* bit of the header is set. If the
    component is nonmutating, then the function has signature
    `@convention(thin) (@in Base, @in Value, UnsafeRawPointer) -> ()`,
    or if it is mutating, then the function has signature
    `@convention(thin) (@inout Base, @in Value, UnsafeRawPointer) -> ()`.
    When a mutating application of the key path is completed, the setter is
    invoked with a copy of the base value (if nonmutating) or a reference to
    the base value (if mutating), along with a copy of the updated component
    value, and a pointer to the captured arguments of the component. If
    the component has no captures, the third argument is undefined.

    TODO: Make getter/nonmutating setter take base borrowed,
    yield borrowed result (materializeForGet); use materializeForSet

    If the component has captures, the capture area appears after the other
    fields, at offset `3*sizeof(Int)` for a get-only component or
    `4*sizeof(Int)` for a settable component. The area begins with a two-word
    header:

    Offset from start | Description
    ----------------- | -----------
    `0`               | Size of captures in bytes
    `1*sizeof(Int)`   | Pointer to **argument witness table**

    followed by the captures themselves. The *argument witness table* contains
    pointers to functions needed for maintaining the captures:

    Offset           | Description
    ---------------- | -----------
    `0`              | **Destroy**, or null if trivial
    `1*sizeof(Int)`  | **Copy**
    `2*sizeof(Int)`  | **Is Equal**
    `3*sizeof(Int)`  | **Hash**

    The *destroy* function, if not null, has signature
    `@convention(thin) (UnsafeMutableRawPointer) -> ()` and is invoked to
    destroy the captures when the key path object is deallocated.

    The *copy* function has signature
    `@convention(thin) (_ src: UnsafeRawPointer,
                        _ dest: UnsafeMutableRawPointer) -> ()`
    and is invoked when the captures need to be copied into a new key path
    object, for example when two key paths are appended.

    The *is equal* function has signature
    `@convention(thin) (UnsafeRawPointer, UnsafeRawPointer) -> Bool`
    and is invoked when the component is compared for equality with another
    computed component with the same identifier.

    The *hash* function has signature
    `@convention(thin) (UnsafeRawPointer) -> Int`
    and is invoked when the key path containing the component is hashed.
    The implementation understands a return value of zero to mean that the
    captures should have no effect on the hash value of the key path.
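
The computed-component flags and field offsets described above can be sketched as follows. `ComputedHeader` and its members are hypothetical names; the bit positions and word offsets are the ones listed in the tables for computed components, and the two sample header values are the computed component headers used in the Examples section.

```swift
// Hypothetical view of the flag bits of a computed component header.
struct ComputedHeader {
  var rawValue: UInt32

  var hasCaptures: Bool { rawValue & (1 << 24) != 0 }        // bit 24
  var identifierKind: UInt32 { (rawValue >> 25) & 0x3 }      // bits 25...26
  var isSettable: Bool { rawValue & (1 << 27) != 0 }         // bit 27
  var isMutating: Bool { rawValue & (1 << 28) != 0 }         // bit 28

  /// Byte offset of the capture area from the component header,
  /// or nil when the component has no captures.
  func captureAreaOffset(wordSize: Int = MemoryLayout<Int>.size) -> Int? {
    guard hasCaptures else { return nil }
    // Header word, identifier, getter, and the setter only if settable.
    return (isSettable ? 4 : 3) * wordSize
  }

  /// Byte offset of the first captured value; the capture area begins
  /// with a two-word header (size, argument witness table pointer).
  func firstCaptureOffset(wordSize: Int = MemoryLayout<Int>.size) -> Int? {
    captureAreaOffset(wordSize: wordSize).map { $0 + 2 * wordSize }
  }
}

let settable = ComputedHeader(rawValue: 0x3800_0000)
let subscripted = ComputedHeader(rawValue: 0x2100_0000)

print(settable.isSettable, settable.isMutating, settable.hasCaptures)
// prints "true true false"
print(subscripted.captureAreaOffset(wordSize: 8) ?? -1)
// prints "24"
print(subscripted.firstCaptureOffset(wordSize: 8) ?? -1)
// prints "40"
```

With 8-byte words, the capture area of the get-only component begins three words after its header and the captured value five words after it.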

After every component except for the final component, a pointer-aligned
pointer to the metadata for the type of the projected component is stored.
(The type of the final component can be found from the `Value` generic
argument of the `KeyPath<Root, Value>` type.)
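
At the source level, the same information is visible through the static `rootType` and `valueType` properties of the key path classes. The sketch below uses a hypothetical `Person` type; it only shows that the value type is recoverable from the key path's class, and does not establish how the runtime itself looks it up.

```swift
// Hypothetical type, used only for illustration.
struct Person {
  var name: String
  var age: Int
}

// Erase the static type to show that the information survives at runtime.
let path: AnyKeyPath = \Person.age

// `rootType` and `valueType` recover both generic arguments of the
// underlying `KeyPath<Root, Value>` subclass.
print(type(of: path).rootType == Person.self)   // prints "true"
print(type(of: path).valueType == Int.self)     // prints "true"
```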

### Examples

Given:

```swift
// `(N x UInt8)` is shorthand for N bytes of padding, not valid Swift syntax.
struct A {
  var padding: (128 x UInt8)
  var b: B
}

class B {
  var padding: (240 x UInt8)
  var c: C
}

struct C {
  var padding: (384 x UInt8)
  var d: D
}
```

On a 64-bit platform, a key path object representing `\A.b.c.d` might look like
this in memory:

Word | Contents
---- | --------
0    | isa pointer to `ReferenceWritableKeyPath<A, D>`
1    | reference counts
`-`  | `-`
2    | buffer header 0xC000_0028 - trivial, reference prefix, buffer size 40
`-`  | `-`
3    | component header 0x8000_0080 - struct component, offset 128, end of prefix
4    | type metadata pointer for `B`
`-`  | `-`
5    | component header 0x4000_0100 - class component, offset 256
6    | type metadata pointer for `C`
`-`  | `-`
7    | component header 0x0000_0180 - struct component, offset 384
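
The offsets of the two struct components in the table above can be read back from their headers as sketched below. The function name `storedOffset` and the array-based buffer are assumptions for illustration, and the third call uses a made-up header to exercise the `0xFF_FFFF` overflow case, which does not occur in this example.

```swift
/// Resolves the byte offset of a stored-property component.
/// `words` holds the 32-bit component header followed by any trailing
/// 32-bit words that belong to the component.
func storedOffset(in words: [UInt32]) -> UInt32 {
  precondition(!words.isEmpty, "expected a component header")
  let payload = words[0] & 0x00FF_FFFF
  // Common case: the offset fits in the 24-bit payload.
  guard payload == 0x00FF_FFFF else { return payload }
  // Sentinel: the offset is stored in the next 32 bits after the header.
  precondition(words.count > 1, "overflowed offset is missing")
  return words[1]
}

print(storedOffset(in: [0x8000_0080]))               // prints "128"
print(storedOffset(in: [0x0000_0180]))               // prints "384"
print(storedOffset(in: [0x00FF_FFFF, 0x0100_0000]))  // prints "16777216"
```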

If we add:

```swift
// Declarations only; accessor bodies are elided.
struct D {
  var computed: E { get set }
}

struct E {
  subscript(b: B) -> F { get }
}
```

then `\D.computed[B()]` would look like:

Word | Contents
---- | --------
0    | isa pointer to `KeyPath<D, F>`
1    | reference counts
`-`  | `-`
2    | buffer header 0x0000_0058 - buffer size 88
`-`  | `-`
3    | component header 0x3800_0000 - computed, settable, mutating
4    | identifier pointer
5    | getter
6    | setter
7    | type metadata pointer for `E`
`-`  | `-`
8    | component header 0x2100_0000 - computed, has captures
9    | identifier pointer
10   | getter
11   | argument size 8
12   | pointer to argument witnesses for releasing/retaining/equating/hashing `B`
13   | value of `B()`
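
The buffer sizes in both examples can be reproduced by adding up the component sizes and the metadata pointers stored between components. The sketch below is a simplified model under stated assumptions: the `Component` enum is hypothetical, words are 8 bytes, stored components fit their offset in the payload and so occupy one word, and captures are rounded up to a whole word.

```swift
// Hypothetical size model for key path components on a 64-bit platform.
enum Component {
  case stored                                        // header plus padding
  case computed(settable: Bool, captureBytes: Int?)  // nil means no captures

  func byteSize(wordSize: Int) -> Int {
    switch self {
    case .stored:
      return wordSize
    case let .computed(settable, captureBytes):
      // Header word, identifier, getter, and the setter only if settable.
      var size = (settable ? 4 : 3) * wordSize
      if let captureBytes {
        // Two-word capture header, then the captures rounded up to a word.
        let rounded = (captureBytes + wordSize - 1) / wordSize * wordSize
        size += 2 * wordSize + rounded
      }
      return size
    }
  }
}

func bufferSize(of components: [Component], wordSize: Int = 8) -> Int {
  let componentBytes = components.reduce(0) { $0 + $1.byteSize(wordSize: wordSize) }
  // One metadata pointer follows every component except the last.
  let metadataBytes = max(components.count - 1, 0) * wordSize
  return componentBytes + metadataBytes
}

// \A.b.c.d: three stored components.
print(bufferSize(of: [.stored, .stored, .stored]))
// prints "40"

// A settable computed property followed by a get-only subscript capturing 8 bytes.
print(bufferSize(of: [
  .computed(settable: true, captureBytes: nil),
  .computed(settable: false, captureBytes: 8),
]))
// prints "88"
```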

## Key Path Patterns

(to be written)