# `rfl::Box` and `rfl::Ref`
In previous sections, we have defined the `Person` class recursively:
```cpp
struct Person {
    rfl::Rename<"firstName", std::string> first_name;
    rfl::Rename<"lastName", std::string> last_name;
    std::vector<Person> children;
};
```
This works because `std::vector` stores its elements behind a pointer under the hood. What would not work is something like this:
```cpp
// WILL NOT COMPILE
struct Person {
    rfl::Rename<"firstName", std::string> first_name;
    rfl::Rename<"lastName", std::string> last_name;
    Person child;
};
```
This is because the compiler cannot determine the size of the struct: a `Person` would have to contain another complete `Person`, which contains yet another, and so on without end. But recursively defined structures
are important. For instance, if you deal with machine learning, you might be familiar with decision trees.
A decision tree consists of either a `Leaf` containing the prediction or a `Node` which splits the decision tree into
two subtrees.
A naive implementation might look like this:
```cpp
// WILL NOT COMPILE
struct DecisionTree {
    struct Leaf {
        using Tag = rfl::Literal<"Leaf">;
        double value;  // the prediction stored in this leaf
    };

    struct Node {
        using Tag = rfl::Literal<"Node">;
        rfl::Rename<"criticalValue", double> critical_value;
        DecisionTree lesser;
        DecisionTree greater;
    };

    using LeafOrNode = rfl::TaggedUnion<"type", Leaf, Node>;

    rfl::Field<"leafOrNode", LeafOrNode> leaf_or_node;
};
```
Again, this will not compile, because the compiler cannot figure out the intended size of the struct.
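The underlying issue is independent of reflect-cpp: a struct cannot contain itself by value, but it can contain a pointer to itself, because a pointer has a fixed size. The following is a minimal sketch in plain C++. The names `Tree`, `depth` and `example` are made up for this illustration, and the raw pointers are non-owning to keep the sketch short:

```cpp
#include <algorithm>
#include <cassert>

struct Tree;  // forward declaration: Tree is an incomplete type here

// This compiles even though Tree is incomplete, because a pointer
// only stores an address and never a whole Tree.
static_assert(sizeof(Tree*) > 0, "pointer size is known");

struct Tree {
    double value = 0.0;
    // Tree lesser;                 // would not compile: incomplete type
    const Tree* lesser = nullptr;   // fine: fixed-size indirection
    const Tree* greater = nullptr;  // fine: fixed-size indirection
};

// Counts the levels of the tree; a null pointer ends the recursion.
int depth(const Tree& tree) {
    const int l = tree.lesser ? depth(*tree.lesser) : 0;
    const int g = tree.greater ? depth(*tree.greater) : 0;
    return 1 + std::max(l, g);
}

// Usage example: two leaves below one root.
void example() {
    const Tree leaf_a{3.0};
    const Tree leaf_b{5.0};
    const Tree root{10.0, &leaf_a, &leaf_b};
    assert(depth(leaf_a) == 1);
    assert(depth(root) == 2);
}
```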
A possible solution might be to use `std::unique_ptr`:
```cpp
// Will compile, but not an ideal design.
struct DecisionTree {
    struct Leaf {
        using Tag = rfl::Literal<"Leaf">;
        double value;  // the prediction stored in this leaf
    };

    struct Node {
        using Tag = rfl::Literal<"Node">;
        rfl::Rename<"criticalValue", double> critical_value;
        std::unique_ptr<DecisionTree> lesser;   // may be nullptr
        std::unique_ptr<DecisionTree> greater;  // may be nullptr
    };

    using LeafOrNode = rfl::TaggedUnion<"type", Leaf, Node>;

    rfl::Field<"leafOrNode", LeafOrNode> leaf_or_node;
};
```
This will compile, but the design is less than ideal. We know for a fact that a `Node` must have
exactly two subtrees. But this is not reflected in the type system. In this encoding, the fields
"lesser" and "greater" are marked optional and you will have to check at runtime that they are indeed set.
But this violates the principles of reflection. Reflection is all about validating as many of our assumptions
up front as we possibly can. For a great theoretical discussion of this topic, check out
[Parse, don't validate](https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/)
by Alexis King.
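To make the cost of nullable pointers concrete, here is a minimal sketch in plain C++ that does not use reflect-cpp. `NullableTree`, `predict` and `example` are hypothetical names, and the tree is simplified to a single struct. Note that every traversal has to handle the "half-built node" case at runtime, because the type system does not rule it out:

```cpp
#include <cassert>
#include <memory>
#include <optional>

// Simplified, hypothetical tree: a node without children acts as a leaf.
struct NullableTree {
    double value = 0.0;           // prediction, used by leaves
    double critical_value = 0.0;  // split point, used by inner nodes
    std::unique_ptr<NullableTree> lesser;   // may be nullptr
    std::unique_ptr<NullableTree> greater;  // may be nullptr
};

// Returns std::nullopt if a node has exactly one child, a state the
// type system cannot exclude and that we therefore must check for.
std::optional<double> predict(const NullableTree& tree, double x) {
    if (!tree.lesser && !tree.greater) {
        return tree.value;  // leaf
    }
    if (!tree.lesser || !tree.greater) {
        return std::nullopt;  // malformed node: one subtree is missing
    }
    return x < tree.critical_value ? predict(*tree.lesser, x)
                                   : predict(*tree.greater, x);
}

// Usage example: the same tree before and after it is fully built.
void example() {
    NullableTree root;
    root.critical_value = 10.0;
    root.lesser = std::make_unique<NullableTree>();
    root.lesser->value = 3.0;
    assert(!predict(root, 1.0).has_value());  // "greater" is still missing

    root.greater = std::make_unique<NullableTree>();
    root.greater->value = 5.0;
    assert(predict(root, 1.0).value() == 3.0);
    assert(predict(root, 20.0).value() == 5.0);
}
```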
So how would we encode our assumptions that the fields "lesser" and "greater" must exist in the type system and
still have code that compiles? By using `rfl::Box` instead of `std::unique_ptr`:
```cpp
struct DecisionTree {
    struct Leaf {
        using Tag = rfl::Literal<"Leaf">;
        double value;  // the prediction stored in this leaf
    };

    struct Node {
        using Tag = rfl::Literal<"Node">;
        rfl::Rename<"criticalValue", double> critical_value;
        rfl::Box<DecisionTree> lesser;
        rfl::Box<DecisionTree> greater;
    };

    using LeafOrNode = rfl::TaggedUnion<"type", Leaf, Node>;

    rfl::Field<"leafOrNode", LeafOrNode> leaf_or_node;
};
```
`rfl::Box` is a thin wrapper around `std::unique_ptr`, but it is guaranteed to **never be null** (unless you do something egregious such as trying to access it after calling `std::move`). It is a `std::unique_ptr` without the `nullptr`.
If you want to learn more about the evils of null references, check out the talk
[Null References: The Billion Dollar Mistake](https://www.infoq.com/presentations/Null-References-The-Billion-Dollar-Mistake-Tony-Hoare/)
by Tony Hoare, who invented the concept in the first place.
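The idea of a "`std::unique_ptr` without the `nullptr`" can be sketched in a few lines of plain C++. The class `NonNullBox` below is a hypothetical illustration and is not the implementation of `rfl::Box`. It only shows the general technique: keep the constructor private and expose a factory that always allocates.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical illustration only; not the implementation of rfl::Box.
template <class T>
class NonNullBox {
 public:
    // The factory always allocates, so a freshly made box always holds a T.
    template <class... Args>
    static NonNullBox make(Args&&... args) {
        return NonNullBox(std::make_unique<T>(std::forward<Args>(args)...));
    }

    // Callers can dereference without a null check.
    T& operator*() const { return *ptr_; }
    T* operator->() const { return ptr_.get(); }

 private:
    // Private: there is no default constructor and no way to pass nullptr.
    explicit NonNullBox(std::unique_ptr<T> ptr) : ptr_(std::move(ptr)) {}

    std::unique_ptr<T> ptr_;
};

// Usage example.
void example() {
    auto box = NonNullBox<std::string>::make("Hello");
    *box += " World";
    assert(*box == "Hello World");
    assert(box->size() == 11);
}
```

As with `rfl::Box`, the guarantee in this sketch only holds as long as you do not access a box after moving from it.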
You **must** initialize `rfl::Box` the moment you create it, and it keeps holding a valid object until it is destroyed.
`rfl::Box` can be initialized using `rfl::make_box<...>(...)`, just like `std::make_unique<...>(...)`:
```cpp
auto leaf1 = DecisionTree::Leaf{.value = 3.0};
auto leaf2 = DecisionTree::Leaf{.value = 5.0};
auto node =
    DecisionTree::Node{.critical_value = 10.0,
                       .lesser = rfl::make_box<DecisionTree>(leaf1),
                       .greater = rfl::make_box<DecisionTree>(leaf2)};
const DecisionTree tree{.leaf_or_node = std::move(node)};
const auto json_string = rfl::json::write(tree);
```
This will result in the following JSON string:
```json
{"leafOrNode":{"type":"Node","criticalValue":10.0,"lesser":{"leafOrNode":{"type":"Leaf","value":3.0}},"greater":{"leafOrNode":{"type":"Leaf","value":5.0}}}}
```
You can also initialize `rfl::Box<T>` from a `std::unique_ptr<T>`:
```cpp
auto ptr = std::make_unique<std::string>("Hello World!");
const rfl::Result<rfl::Box<std::string>> box = rfl::make_box<std::string>(std::move(ptr));
```
Note that `box` is wrapped in a `Result`. That is because we cannot guarantee at compile time
that `ptr` is not `nullptr`, so the conversion has to be able to fail.
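The pattern behind this is to check for `nullptr` exactly once, at the boundary, and to return a type that forces the caller to handle the failure. The following sketch shows that pattern in plain C++. It uses `std::optional` in place of `rfl::Result` so that it only needs the standard library, and `NonNull`, `from` and `example` are hypothetical names that are not part of reflect-cpp:

```cpp
#include <cassert>
#include <memory>
#include <optional>
#include <string>
#include <utility>

// Hypothetical illustration only; not part of reflect-cpp.
template <class T>
class NonNull {
 public:
    // Fallible factory: a nullptr is rejected here, once, at the boundary.
    static std::optional<NonNull> from(std::unique_ptr<T> ptr) {
        if (!ptr) {
            return std::nullopt;
        }
        return NonNull(std::move(ptr));
    }

    // After a successful conversion no further null checks are needed.
    T& operator*() const { return *ptr_; }

 private:
    explicit NonNull(std::unique_ptr<T> ptr) : ptr_(std::move(ptr)) {}

    std::unique_ptr<T> ptr_;
};

// Usage example: one conversion that succeeds and one that fails.
void example() {
    auto ok = NonNull<std::string>::from(
        std::make_unique<std::string>("Hello World!"));
    assert(ok.has_value());
    assert(**ok == "Hello World!");

    auto bad = NonNull<std::string>::from(nullptr);
    assert(!bad.has_value());
}
```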
If you want to use reference-counted pointers instead of unique pointers, you can use `rfl::Ref`.
`rfl::Ref` is the same concept as `rfl::Box`, but it uses `std::shared_ptr` under the hood:
```cpp
struct DecisionTree {
    struct Leaf {
        using Tag = rfl::Literal<"Leaf">;
        double value;
    };

    struct Node {
        using Tag = rfl::Literal<"Node">;
        rfl::Rename<"criticalValue", double> critical_value;
        rfl::Ref<DecisionTree> lesser;
        rfl::Ref<DecisionTree> greater;
    };

    using LeafOrNode = rfl::TaggedUnion<"type", Leaf, Node>;

    rfl::Field<"leafOrNode", LeafOrNode> leaf_or_node;
};

const auto leaf1 = DecisionTree::Leaf{.value = 3.0};
const auto leaf2 = DecisionTree::Leaf{.value = 5.0};

auto node =
    DecisionTree::Node{.critical_value = 10.0,
                       .lesser = rfl::make_ref<DecisionTree>(leaf1),
                       .greater = rfl::make_ref<DecisionTree>(leaf2)};

const DecisionTree tree{.leaf_or_node = std::move(node)};

const auto json_string = rfl::json::write(tree);
```
The resulting JSON string is identical:
```json
{"leafOrNode":{"type":"Node","criticalValue":10.0,"lesser":{"leafOrNode":{"type":"Leaf","value":3.0}},"greater":{"leafOrNode":{"type":"Leaf","value":5.0}}}}
```
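What reference counting buys you is shared ownership: several parents can point to the same subtree, and the subtree lives as long as at least one of them does. The sketch below illustrates this with a plain `std::shared_ptr`, so unlike `rfl::Ref` the pointers here can still be null. `SharedTree` and `example` are hypothetical names used only for this illustration:

```cpp
#include <cassert>
#include <memory>

// Simplified, hypothetical tree using std::shared_ptr directly.
struct SharedTree {
    double value = 0.0;
    std::shared_ptr<SharedTree> lesser;
    std::shared_ptr<SharedTree> greater;
};

// Usage example: two parents share a single leaf.
void example() {
    auto leaf = std::make_shared<SharedTree>();
    leaf->value = 3.0;

    SharedTree parent_a;
    parent_a.lesser = leaf;  // copies the pointer, not the leaf
    SharedTree parent_b;
    parent_b.lesser = leaf;

    // Three owners: leaf, parent_a.lesser and parent_b.lesser.
    assert(leaf.use_count() == 3);

    // A change made through one owner is visible through all of them.
    leaf->value = 4.0;
    assert(parent_a.lesser->value == 4.0);
    assert(parent_b.lesser->value == 4.0);
}
```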
## Deep Copying
The default `rfl::Box` implementation behaves like `std::unique_ptr` with respect to copying: the copy constructor and the copy assignment operator are disabled.
An opt-in box implementation, `rfl::CopyableBox`, bypasses the `std::unique_ptr` operators and allows copying by calling
the contained type's copy constructor and copy assignment operator directly, but otherwise
behaves the same as `rfl::Box`.
When using `rfl::CopyableBox`, `rfl::make_box<...>(...)` must be replaced with `rfl::make_copyable_box<...>(...)`.
This allows for deep copying of arbitrarily complex types that contain nested recursive elements:
```cpp
struct DecisionTree {
    struct Leaf {
        using Tag = rfl::Literal<"Leaf">;
        double value;
    };

    struct Node {
        using Tag = rfl::Literal<"Node">;
        rfl::Rename<"criticalValue", double> critical_value;
        rfl::CopyableBox<DecisionTree> lesser;
        rfl::CopyableBox<DecisionTree> greater;
    };

    using LeafOrNode = rfl::TaggedUnion<"type", Leaf, Node>;

    rfl::Field<"leafOrNode", LeafOrNode> leaf_or_node;
};

auto leaf1 = DecisionTree::Leaf{.value = 3.0};
auto leaf2 = DecisionTree::Leaf{.value = 5.0};

auto node =
    DecisionTree::Node{.critical_value = 10.0,
                       .lesser = rfl::make_copyable_box<DecisionTree>(leaf1),
                       .greater = rfl::make_copyable_box<DecisionTree>(leaf2)};

const DecisionTree tree{.leaf_or_node = std::move(node)};

auto different_leaf = DecisionTree::Leaf{.value = 1.0};

// Copying the tree deep-copies both subtrees.
DecisionTree copy = tree;

// Replace a subtree in the copy; `tree` itself is left untouched.
rfl::get<DecisionTree::Node>(copy.leaf_or_node.get().variant()).lesser =
    rfl::make_copyable_box<DecisionTree>(different_leaf);
```
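The essence of a deep-copying box can be sketched in plain C++. The class `DeepBox` below is a hypothetical illustration and is not the implementation of `rfl::CopyableBox`. For brevity it uses the contained type's copy constructor for both copy construction and copy assignment. `Tree`, `make_leaf` and `example` are likewise made-up names:

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical illustration only; not the implementation of rfl::CopyableBox.
template <class T>
class DeepBox {
 public:
    explicit DeepBox(T value) : ptr_(std::make_unique<T>(std::move(value))) {}

    // Copying allocates a new T using T's copy constructor.
    DeepBox(const DeepBox& other) : ptr_(std::make_unique<T>(*other.ptr_)) {}

    // Copy assignment replaces the owned T with a fresh copy.
    DeepBox& operator=(const DeepBox& other) {
        if (this != &other) {
            ptr_ = std::make_unique<T>(*other.ptr_);
        }
        return *this;
    }

    // Moving transfers ownership; a moved-from box must not be used.
    DeepBox(DeepBox&&) noexcept = default;
    DeepBox& operator=(DeepBox&&) noexcept = default;

    T& operator*() const { return *ptr_; }

 private:
    std::unique_ptr<T> ptr_;
};

// A recursive type: copying a Tree copies all of its subtrees.
struct Tree {
    double value = 0.0;
    std::vector<DeepBox<Tree>> children;  // empty for a leaf
};

Tree make_leaf(double value) {
    Tree leaf;
    leaf.value = value;
    return leaf;
}

// Usage example: modifying the copy leaves the original untouched.
void example() {
    Tree original;
    original.value = 10.0;
    original.children.emplace_back(make_leaf(3.0));
    original.children.emplace_back(make_leaf(5.0));

    Tree copy = original;
    (*copy.children[0]).value = 1.0;

    assert((*original.children[0]).value == 3.0);
    assert((*copy.children[0]).value == 1.0);
    assert((*copy.children[1]).value == 5.0);
}
```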