File: rfl_ref.md

# `rfl::Box` and `rfl::Ref`

In previous sections, we have defined the `Person` class recursively:

```cpp
struct Person {
    rfl::Rename<"firstName", std::string> first_name;
    rfl::Rename<"lastName", std::string> last_name;
    std::vector<Person> children;
};
```

This works because `std::vector` stores its elements behind a pointer under the hood. But what wouldn't work is something like this:

```cpp
// WILL NOT COMPILE
struct Person {
    rfl::Rename<"firstName", std::string> first_name;
    rfl::Rename<"lastName", std::string> last_name;
    Person child;
};
```

This is because the compiler cannot determine the size of the struct: a `Person` that contains a `Person` by value
would be infinitely large. But recursively defined structures are important. For instance, if you deal with
machine learning, you might be familiar with decision trees.

A decision tree is either a `Leaf` containing the prediction or a `Node` which splits the decision tree into
two subtrees.

A naive implementation might look like this:

```cpp
// WILL NOT COMPILE
struct DecisionTree {
    struct Leaf {
        using Tag = rfl::Literal<"Leaf">;
        double value;
    };

    struct Node {
        using Tag = rfl::Literal<"Node">;
        rfl::Rename<"criticalValue", double> critical_value;
        DecisionTree lesser;
        DecisionTree greater;
    };

    using LeafOrNode = rfl::TaggedUnion<"type", Leaf, Node>;

    rfl::Field<"leafOrNode", LeafOrNode> leaf_or_node;
};
```

Again, this will not compile, because the compiler cannot figure out the intended size of the struct.
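
The size problem can be seen with the standard library alone. The following is a minimal sketch (the names `TreeByPointer`, `TreeByVector` and `depth` are illustrative and not part of reflect-cpp): a member that holds the enclosing type by value would make the struct infinitely large, whereas a pointer or a `std::vector` only stores a fixed-size handle to heap memory.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical illustration types, not part of reflect-cpp.

// A pointer member has a fixed size no matter how large the pointee is,
// so the compiler can lay out the struct even though it refers to itself.
struct TreeByPointer {
    double value;
    std::unique_ptr<TreeByPointer> child;
};

// std::vector keeps its elements on the heap, so the same reasoning applies
// (vectors of incomplete types are permitted since C++17).
struct TreeByVector {
    double value;
    std::vector<TreeByVector> children;
};

// By contrast, `struct Bad { Bad child; };` is rejected: sizeof(Bad) would
// have to include sizeof(Bad), which has no finite solution.

// Counts the nodes along the chain starting at `root`.
inline std::size_t depth(const TreeByPointer& root) {
    std::size_t n = 1;
    for (const TreeByPointer* p = root.child.get(); p != nullptr;
         p = p->child.get()) {
        ++n;
    }
    return n;
}
```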

A possible solution might be to use `std::unique_ptr`:

```cpp
// Will compile, but not an ideal design.
struct DecisionTree {
    struct Leaf {
        using Tag = rfl::Literal<"Leaf">;
        double value;
    };

    struct Node {
        using Tag = rfl::Literal<"Node">;
        rfl::Rename<"criticalValue", double> critical_value;
        std::unique_ptr<DecisionTree> lesser;
        std::unique_ptr<DecisionTree> greater;
    };

    using LeafOrNode = rfl::TaggedUnion<"type", Leaf, Node>;

    rfl::Field<"leafOrNode", LeafOrNode> leaf_or_node;
};
```

This will compile, but the design is less than ideal. We know for a fact that a `Node` must have
exactly two subtrees. But this is not reflected in the type system. In this encoding, the fields
"lesser" and "greater" are marked optional and you will have to check at runtime that they are indeed set.

But this violates the principles of reflection. Reflection is all about validating as many of our assumptions
up front as we possibly can. For a great theoretical discussion of this topic, check out
[Parse, don't validate](https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/)
by Alexis King.
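
To make the cost of nullable pointers concrete, here is a small sketch using only the standard library. The type `PtrTree` and the function `predict` are hypothetical and deliberately simplified (a `std::variant` stands in for the tagged union): every traversal has to handle the case that a subtree is missing, even though a well-formed tree never has one.

```cpp
#include <memory>
#include <optional>
#include <variant>

// Hypothetical, simplified tree; std::variant stands in for rfl::TaggedUnion.
struct PtrTree {
    struct Leaf {
        double value;
    };
    struct Node {
        double critical_value;
        std::unique_ptr<PtrTree> lesser;   // may be nullptr
        std::unique_ptr<PtrTree> greater;  // may be nullptr
    };
    std::variant<Leaf, Node> leaf_or_node;
};

// Returns the prediction for `x`, or std::nullopt if a subtree is missing.
// The null check exists only because the type permits an invalid state.
inline std::optional<double> predict(const PtrTree& tree, double x) {
    if (const auto* leaf = std::get_if<PtrTree::Leaf>(&tree.leaf_or_node)) {
        return leaf->value;
    }
    const auto& node = std::get<PtrTree::Node>(tree.leaf_or_node);
    const auto& next = x < node.critical_value ? node.lesser : node.greater;
    if (!next) {
        return std::nullopt;  // malformed tree, detected only at runtime
    }
    return predict(*next, x);
}
```

With a pointer type that cannot be null, `predict` could return a plain `double` and the error branch would disappear.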

So how would we encode our assumptions that the fields "lesser" and "greater" must exist in the type system and
still have code that compiles? By using `rfl::Box` instead of `std::unique_ptr`:

```cpp
struct DecisionTree {
    struct Leaf {
        using Tag = rfl::Literal<"Leaf">;
        double value;
    };

    struct Node {
        using Tag = rfl::Literal<"Node">;
        rfl::Rename<"criticalValue", double> critical_value;
        rfl::Box<DecisionTree> lesser;
        rfl::Box<DecisionTree> greater;
    };

    using LeafOrNode = rfl::TaggedUnion<"type", Leaf, Node>;

    rfl::Field<"leafOrNode", LeafOrNode> leaf_or_node;
};
```

`rfl::Box` is a thin wrapper around `std::unique_ptr`, but it is guaranteed to **never be null** (unless you do something egregious such as accessing it after moving from it with `std::move`). It is a `std::unique_ptr` without the `nullptr`.

If you want to learn more about the evils of null references, check out the talk
[Null References: The Billion Dollar Mistake](https://www.infoq.com/presentations/Null-References-The-Billion-Dollar-Mistake-Tony-Hoare/)
by Tony Hoare, who invented the concept in the first place.

You **must** initialize `rfl::Box` the moment you create it, and it keeps holding a valid object until it is destroyed.
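
How such a guarantee can be enforced by construction is sketched below. `NonNullBox` is a hypothetical name and this is not the actual `rfl::Box` implementation; it only illustrates the idea that a type without a default constructor and with a single factory function leaves no code path that yields an empty box.

```cpp
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical sketch of a never-null owning pointer; this is NOT the actual
// rfl::Box implementation, only an illustration of the idea.
template <class T>
class NonNullBox {
 public:
    // The only way to obtain a box is to construct the pointee in place,
    // so there is no code path that produces an empty box.
    template <class... Args>
    static NonNullBox make(Args&&... args) {
        return NonNullBox(std::make_unique<T>(std::forward<Args>(args)...));
    }

    NonNullBox(NonNullBox&&) noexcept = default;
    NonNullBox& operator=(NonNullBox&&) noexcept = default;

    // No null check is needed here (barring use after std::move).
    T& operator*() const { return *ptr_; }
    T* operator->() const { return ptr_.get(); }

 private:
    explicit NonNullBox(std::unique_ptr<T> ptr) : ptr_(std::move(ptr)) {}

    std::unique_ptr<T> ptr_;
};

// There is deliberately no default constructor, and copying is disabled
// just as it is for std::unique_ptr.
static_assert(!std::is_default_constructible_v<NonNullBox<int>>);
static_assert(!std::is_copy_constructible_v<NonNullBox<int>>);
```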

`rfl::Box` can be initialized using `rfl::make_box<...>(...)`, just like `std::make_unique<...>(...)`:

```cpp
auto leaf1 = DecisionTree::Leaf{.value = 3.0};

auto leaf2 = DecisionTree::Leaf{.value = 5.0};

auto node =
    DecisionTree::Node{.critical_value = 10.0,
                       .lesser = rfl::make_box<DecisionTree>(leaf1),
                       .greater = rfl::make_box<DecisionTree>(leaf2)};

const DecisionTree tree{.leaf_or_node = std::move(node)};

const auto json_string = rfl::json::write(tree);
```

This will result in the following JSON string:

```json
{"leafOrNode":{"type":"Node","criticalValue":10.0,"lesser":{"leafOrNode":{"type":"Leaf","value":3.0}},"greater":{"leafOrNode":{"type":"Leaf","value":5.0}}}}
```

You can also initialize `rfl::Box<T>` from a `std::unique_ptr<T>`:

```cpp
auto ptr = std::make_unique<std::string>("Hello World!");
const rfl::Result<rfl::Box<std::string>> box = rfl::make_box<std::string>(std::move(ptr));
```

Note that `box` is wrapped in a `Result`. That is because we cannot guarantee at compile time
that `ptr` is not `nullptr`, so the return type has to account for that possibility.
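
The reason for the wrapper can be illustrated without reflect-cpp. In the sketch below, `OwnedValue` and `from_unique_ptr` are hypothetical names, and `std::optional` stands in for `rfl::Result` as a simplification: since the null check can only happen at runtime, the return type has to be able to express failure.

```cpp
#include <memory>
#include <optional>
#include <string>
#include <utility>

// Hypothetical never-null wrapper, for illustration only.
template <class T>
class OwnedValue {
 public:
    // Returns std::nullopt when `ptr` is empty; otherwise takes ownership.
    static std::optional<OwnedValue> from_unique_ptr(std::unique_ptr<T> ptr) {
        if (!ptr) {
            return std::nullopt;
        }
        return OwnedValue(std::move(ptr));
    }

    T& operator*() const { return *ptr_; }

 private:
    explicit OwnedValue(std::unique_ptr<T> ptr) : ptr_(std::move(ptr)) {}

    std::unique_ptr<T> ptr_;
};
```

The caller is forced to inspect the result before it can reach the wrapped value, which moves the null check to the single place where the conversion happens.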

If you want to use reference-counted pointers instead of unique pointers, you can use `rfl::Ref`.
`rfl::Ref` is the same concept as `rfl::Box`, but uses `std::shared_ptr` under the hood.

```cpp
struct DecisionTree {
    struct Leaf {
        using Tag = rfl::Literal<"Leaf">;
        double value;
    };

    struct Node {
        using Tag = rfl::Literal<"Node">;
        rfl::Rename<"criticalValue", double> critical_value;
        rfl::Ref<DecisionTree> lesser;
        rfl::Ref<DecisionTree> greater;
    };

    using LeafOrNode = rfl::TaggedUnion<"type", Leaf, Node>;

    rfl::Field<"leafOrNode", LeafOrNode> leaf_or_node;
};

const auto leaf1 = DecisionTree::Leaf{.value = 3.0};

const auto leaf2 = DecisionTree::Leaf{.value = 5.0};

auto node =
    DecisionTree::Node{.critical_value = 10.0,
                       .lesser = rfl::make_ref<DecisionTree>(leaf1),
                       .greater = rfl::make_ref<DecisionTree>(leaf2)};

const DecisionTree tree{.leaf_or_node = std::move(node)};

const auto json_string = rfl::json::write(tree);
```

The resulting JSON string is identical:

```json
{"leafOrNode":{"type":"Node","criticalValue":10.0,"lesser":{"leafOrNode":{"type":"Leaf","value":3.0}},"greater":{"leafOrNode":{"type":"Leaf","value":5.0}}}}
```
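
The practical difference from a unique pointer is ownership: a reference-counted handle can be copied, and all copies refer to the same object. A minimal sketch with a hypothetical `SharedValue` wrapper around `std::shared_ptr` (not the actual `rfl::Ref` implementation):

```cpp
#include <memory>
#include <utility>

// Hypothetical never-null, reference-counted handle, for illustration only.
template <class T>
class SharedValue {
 public:
    template <class... Args>
    static SharedValue make(Args&&... args) {
        return SharedValue(std::make_shared<T>(std::forward<Args>(args)...));
    }

    // Copies are cheap and share the pointee; no deep copy takes place.
    SharedValue(const SharedValue&) = default;
    SharedValue& operator=(const SharedValue&) = default;

    T& operator*() const { return *ptr_; }
    long use_count() const { return ptr_.use_count(); }

 private:
    explicit SharedValue(std::shared_ptr<T> ptr) : ptr_(std::move(ptr)) {}

    std::shared_ptr<T> ptr_;
};
```

Writing through one copy is therefore visible through every other copy, which is the main thing to keep in mind when choosing between unique and shared ownership.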

## Deep Copying

The default `rfl::Box` implementation behaves the same as `std::unique_ptr` with respect to copying: the copy constructor and the copy assignment operator are disabled.

An opt-in box implementation, `rfl::CopyableBox`, bypasses the `std::unique_ptr` operators and allows copying by calling
the contained type's copy constructor and copy assignment operator directly, but otherwise
behaves the same as `rfl::Box`.

When using `rfl::CopyableBox`, `rfl::make_box<...>(...)` must be replaced with `rfl::make_copyable_box<...>(...)`.

This allows for deep copying of arbitrarily complex types that contain nested recursive elements:

```cpp
struct DecisionTree {
    struct Leaf {
        using Tag = rfl::Literal<"Leaf">;
        double value;
    };

    struct Node {
        using Tag = rfl::Literal<"Node">;
        rfl::Rename<"criticalValue", double> critical_value;
        rfl::CopyableBox<DecisionTree> lesser;
        rfl::CopyableBox<DecisionTree> greater;
    };

    using LeafOrNode = rfl::TaggedUnion<"type", Leaf, Node>;

    rfl::Field<"leafOrNode", LeafOrNode> leaf_or_node;
};


auto leaf1 = DecisionTree::Leaf{.value = 3.0};

auto leaf2 = DecisionTree::Leaf{.value = 5.0};

auto node =
    DecisionTree::Node{.critical_value = 10.0,
                       .lesser = rfl::make_copyable_box<DecisionTree>(leaf1),
                       .greater = rfl::make_copyable_box<DecisionTree>(leaf2)};

const DecisionTree tree{.leaf_or_node = std::move(node)};

auto different_leaf = DecisionTree::Leaf{.value = 1.0};
DecisionTree copy = tree;

// Only `copy` is modified here; `tree` was deep-copied and keeps its original leaf.
rfl::get<DecisionTree::Node>(copy.leaf_or_node.get().variant()).lesser =
    rfl::make_copyable_box<DecisionTree>(different_leaf);
```
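
The copy semantics described in this section can be sketched with the standard library. `DeepBox` is a hypothetical name and this is not the actual `rfl::CopyableBox` implementation; it only shows how copying can be delegated to the contained type's copy constructor and copy assignment operator instead of to `std::unique_ptr`.

```cpp
#include <memory>
#include <utility>

// Hypothetical sketch of a copyable never-null box; NOT the actual
// rfl::CopyableBox implementation.
template <class T>
class DeepBox {
 public:
    template <class... Args>
    static DeepBox make(Args&&... args) {
        return DeepBox(std::make_unique<T>(std::forward<Args>(args)...));
    }

    // Copying allocates a new object by invoking T's copy constructor.
    DeepBox(const DeepBox& other) : ptr_(std::make_unique<T>(*other.ptr_)) {}

    // Copy assignment overwrites the existing pointee via T's copy assignment
    // (this assumes neither box has been moved from).
    DeepBox& operator=(const DeepBox& other) {
        *ptr_ = *other.ptr_;
        return *this;
    }

    DeepBox(DeepBox&&) noexcept = default;
    DeepBox& operator=(DeepBox&&) noexcept = default;

    T& operator*() const { return *ptr_; }

 private:
    explicit DeepBox(std::unique_ptr<T> ptr) : ptr_(std::move(ptr)) {}

    std::unique_ptr<T> ptr_;
};
```

Because each copy owns its own pointee, a change made through the copy leaves the original untouched. If `T` itself contains such boxes, its copy constructor copies them in turn, which is what makes the copy deep.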