# CDDL Compiler
This CDDL compiler takes a CDDL specification as input and produces a C++ header
and source file which contain structs, enums, encode functions, and decode
functions. This simplifies the process of taking a CDDL message definition from
the OSP control protocol spec and making it usable in C++. Additionally, it
simplifies adding new messages or changing existing messages during development.
This compiler is not intended to support all or even most of the CDDL spec.
CDDL allows many patterns that are not useful, practical, or efficient when
considering a C++ implementation of CDDL messages. Our specialization for enums
is a good example, but more details are given below.
## Usage Overview
This section gives some examples of CDDL syntax that is supported and what the
generated C++ looks like. For the complete set of messages currently supported
for OSP, see [//msgs/osp.cddl](../../msgs/osp.cddl).
### Maps
The following example shows a map in CDDL:
``` cddl
x = {
  alpha: uint,
  beta: text,
}
```
This translates into a normal C++ struct (i.e. not `std::map`):
``` c++
struct X {
  uint64_t alpha;
  std::string beta;
};
```
The string keys are handled only by the encoding and decoding functions.
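To illustrate where the string keys end up, here is a minimal sketch of what an encode function for `X` could look like. It hand-rolls just enough CBOR to emit a two-entry map. The helper names (`AppendHead`, `AppendText`, `EncodeX`) are hypothetical and are not the functions this compiler generates.

``` c++
#include <cstdint>
#include <string>
#include <vector>

struct X {
  uint64_t alpha;
  std::string beta;
};

// Appends a CBOR head: the major type in the top 3 bits, followed by the
// argument in the shortest form that fits.
void AppendHead(std::vector<uint8_t>& out, uint8_t major, uint64_t value) {
  const uint8_t type_bits = static_cast<uint8_t>(major << 5);
  if (value < 24) {
    out.push_back(static_cast<uint8_t>(type_bits | value));
    return;
  }
  const int size_log2 =
      value <= 0xFF ? 0 : value <= 0xFFFF ? 1 : value <= 0xFFFFFFFF ? 2 : 3;
  out.push_back(static_cast<uint8_t>(type_bits | (24 + size_log2)));
  for (int shift = 8 * ((1 << size_log2) - 1); shift >= 0; shift -= 8) {
    out.push_back(static_cast<uint8_t>(value >> shift));
  }
}

// Appends a CBOR text string (major type 3).
void AppendText(std::vector<uint8_t>& out, const std::string& text) {
  AppendHead(out, 3, text.size());
  out.insert(out.end(), text.begin(), text.end());
}

// Hypothetical encoder: the keys "alpha" and "beta" exist only here, not in
// the struct itself.
std::vector<uint8_t> EncodeX(const X& x) {
  std::vector<uint8_t> out;
  AppendHead(out, 5, 2);  // Major type 5: map with two key/value pairs.
  AppendText(out, "alpha");
  AppendHead(out, 0, x.alpha);  // Major type 0: unsigned integer.
  AppendText(out, "beta");
  AppendText(out, x.beta);
  return out;
}
```

A decoder would do the reverse: match each key string it reads against the known field names and store the value in the corresponding struct member.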
### Heterogeneous Arrays
An array of heterogeneous types (or indeed a fixed number of homogeneous types),
such as
``` cddl
x = [
  alpha: uint,
  beta: text,
]
```
also translates into a plain C++ struct.
``` c++
struct X {
  uint64_t alpha;
  std::string beta;
};
```
In the array case, the field keys are only used as variable names and no strings
are used in encoding.
Because these must be implemented as a C++ struct and we don't want to define an
automatic naming scheme, all array fields must have a key. For example, CDDL
would allow this definition:
``` cddl
x = [
  uint,
  text,
]
```
but this is not allowed by our compiler.
### Homogeneous Arrays
An array of unspecified length containing only one type:
``` cddl
x = [* uint]
```
is translated to a `std::vector`. In this case, a key for the single array
field isn't necessary. Length constraints on the array (e.g. `x = [2*5 uint]`)
are not currently supported.
### Group Inclusion
If common fields are placed in a separate CDDL group (which is not a map or
array), it can be included directly in another map, array, or group type. So
``` cddl
x = (alpha: uint)
y = {
  x,
  beta: text,
}
```
will translate to the following C++ struct:
``` c++
struct Y {
  uint64_t alpha;
  std::string beta;
};
```
If you prefer that a group is included explicitly as its own struct type, you
should make it a map or array. For example,
``` cddl
x = {alpha: uint}
y = {
  x: x,
  beta: text,
}
```
will translate to the following C++ struct:
``` c++
struct X {
  uint64_t alpha;
};

struct Y {
  X x;
  std::string beta;
};
```
### Optional Fields
Fields that are not required are prefixed with a '?' in CDDL.
``` cddl
x = { ? alpha: uint }
```
These are translated to a bool flag and value pair:
``` c++
struct X {
  bool has_alpha;
  uint64_t alpha;
};
```
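As a usage sketch, calling code would be expected to check the flag before reading the value. The helpers below (`MakeWithAlpha`, `MakeWithoutAlpha`, `AlphaOr`) are hypothetical and only illustrate the pattern; they are not generated by the compiler.

``` c++
#include <cstdint>

struct X {
  bool has_alpha;
  uint64_t alpha;
};

// Builds an X whose optional field is present.
X MakeWithAlpha(uint64_t value) {
  X x;
  x.has_alpha = true;
  x.alpha = value;
  return x;
}

// Builds an X whose optional field is absent. alpha is zeroed so that it is
// never left uninitialized, although readers should not rely on its value.
X MakeWithoutAlpha() {
  X x;
  x.has_alpha = false;
  x.alpha = 0;
  return x;
}

// Reads alpha only when the flag says it is present.
uint64_t AlphaOr(const X& x, uint64_t fallback) {
  return x.has_alpha ? x.alpha : fallback;
}
```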
### Choice from a Group as an Enum
CDDL allows specifying a type as one of any member of a group:
``` cddl
x = &(
  alpha: 0,
  beta: 1,
)
```
This is implemented as an enum in C++:
``` c++
enum X {
  kAlpha = 0,
  kBeta = 1,
};
```
Recursive group inclusion in choices is handled by the simple fact that plain
enum constants are global and not type-checked. This leads to a global
definition caveat that is explained below, but here is an example of such an
inclusion:
``` cddl
x = ( alpha: 0, beta: 1 )
y = &( x, gamma: 2 )
```
``` c++
enum X {
  kAlpha = 0,
  kBeta = 1,
};

enum Y {
  // union: enum X
  kGamma = 2,
};
```
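The following sketch shows why unscoped enums make this work: the constants of `X` are visible without qualification and convert to integers, so code handling a `Y` can refer to them directly. `IsValidY` and `ToY` are hypothetical helpers written for this illustration only.

``` c++
enum X {
  kAlpha = 0,
  kBeta = 1,
};

enum Y {
  // union: enum X
  kGamma = 2,
};

// Hypothetical check that a raw decoded integer is acceptable for Y, meaning
// it is either one of X's constants or one of Y's own.
bool IsValidY(int raw) {
  switch (raw) {
    case kAlpha:
    case kBeta:
    case kGamma:
      return true;
    default:
      return false;
  }
}

// Hypothetical conversion of an X constant into a Y. The cast is explicit
// because C++ does not convert between distinct enum types implicitly.
Y ToY(X value) {
  return static_cast<Y>(value);
}
```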
### Type Choice as a Discriminated Union
Specifying multiple possible types for a value in CDDL
``` cddl
x = { alpha: text / uint }
```
is translated to a discriminated union in C++:
``` c++
struct X {
  X();
  ~X();  // NOTE: This requires defining a ctor/dtor to deal with the union.

  enum class WhichAlpha {
    kString,
    kUint64,
  } which_alpha;

  union {
    std::string str;
    uint64_t uint;
  } alpha;
};
```
Currently, only `uint`, `text`, and `bytes` are allowed here. Additionally, as
an implementation note, a placeholder `bool` is also included in the union so it
can always be created as "uninitialized". This means that no destructor is
necessary before the first proper member assignment.
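The placeholder technique can be sketched as follows. This is an illustration under assumptions, not the generated code: the `kUninitialized` enumerator, the `placeholder_` member name, and the `SetString`, `SetUint`, and `Clear` helpers are all hypothetical.

``` c++
#include <cstdint>
#include <new>
#include <string>
#include <utility>

struct X {
  // kUninitialized is an assumption made for this sketch.
  enum class WhichAlpha { kString, kUint64, kUninitialized };

  X() : which_alpha(WhichAlpha::kUninitialized) {}
  ~X() { Clear(); }
  X(const X&) = delete;
  X& operator=(const X&) = delete;

  void SetString(std::string value) {
    Clear();
    // Placement new starts the lifetime of the std::string member.
    new (&alpha.str) std::string(std::move(value));
    which_alpha = WhichAlpha::kString;
  }

  void SetUint(uint64_t value) {
    Clear();
    alpha.uint = value;
    which_alpha = WhichAlpha::kUint64;
  }

  // Destroys the active member if it is non-trivial, then falls back to the
  // placeholder so the union is "uninitialized" again.
  void Clear() {
    using String = std::string;
    if (which_alpha == WhichAlpha::kString) {
      alpha.str.~String();
    }
    alpha.placeholder_ = false;
    which_alpha = WhichAlpha::kUninitialized;
  }

  WhichAlpha which_alpha;

  union Alpha {
    Alpha() : placeholder_(false) {}  // Only the trivial bool is constructed.
    ~Alpha() {}                       // X::Clear() destroys the active member.
    bool placeholder_;
    std::string str;
    uint64_t uint;
  } alpha;
};
```

Because the union starts out holding only the trivial `bool`, the first assignment has nothing to destroy; a destructor call is needed only when a `std::string` is already active.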
### Tagged Types
This example
``` cddl
x = #6.1234(uint)
```
translates to a single `uint64_t` variable. The 1234 tag is placed before it
during encoding and the same tag is checked during decoding.
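As a rough sketch of the tag handling, the code below writes CBOR major type 6 with argument 1234 ahead of the integer and rejects input whose tag differs on decode. The function names are hypothetical and the real generated code may be structured quite differently.

``` c++
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr uint64_t kXTag = 1234;

// Appends a CBOR head: major type in the top 3 bits, then the argument.
void AppendHead(std::vector<uint8_t>& out, uint8_t major, uint64_t value) {
  const uint8_t type_bits = static_cast<uint8_t>(major << 5);
  if (value < 24) {
    out.push_back(static_cast<uint8_t>(type_bits | value));
    return;
  }
  const int size_log2 =
      value <= 0xFF ? 0 : value <= 0xFFFF ? 1 : value <= 0xFFFFFFFF ? 2 : 3;
  out.push_back(static_cast<uint8_t>(type_bits | (24 + size_log2)));
  for (int shift = 8 * ((1 << size_log2) - 1); shift >= 0; shift -= 8) {
    out.push_back(static_cast<uint8_t>(value >> shift));
  }
}

// Reads one CBOR head at *pos. Returns false on truncated input or on
// additional-information values this sketch does not handle.
bool ReadHead(const std::vector<uint8_t>& in, size_t* pos, uint8_t* major,
              uint64_t* value) {
  if (*pos >= in.size()) return false;
  const uint8_t first = in[(*pos)++];
  *major = static_cast<uint8_t>(first >> 5);
  const uint8_t info = first & 0x1F;
  if (info < 24) {
    *value = info;
    return true;
  }
  if (info > 27) return false;
  const size_t count = size_t{1} << (info - 24);
  if (in.size() - *pos < count) return false;
  *value = 0;
  for (size_t i = 0; i < count; ++i) {
    *value = (*value << 8) | in[(*pos)++];
  }
  return true;
}

// Hypothetical encoder: the tag comes first, then the uint.
std::vector<uint8_t> EncodeX(uint64_t x) {
  std::vector<uint8_t> out;
  AppendHead(out, 6, kXTag);  // Major type 6: tag.
  AppendHead(out, 0, x);      // Major type 0: unsigned integer.
  return out;
}

// Hypothetical decoder: fails unless the tag is exactly 1234.
bool DecodeX(const std::vector<uint8_t>& in, uint64_t* x) {
  size_t pos = 0;
  uint8_t major = 0;
  uint64_t value = 0;
  if (!ReadHead(in, &pos, &major, &value)) return false;
  if (major != 6 || value != kXTag) return false;
  if (!ReadHead(in, &pos, &major, x)) return false;
  return major == 0 && pos == in.size();
}
```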
### Caveats
In addition to completely unsupported aspects of CDDL, there are some places
where there are additional constraints placed on accepted CDDL forms. The
following sections describe these additional constraints.
#### Naming
CDDL allows identifiers to use characters from the set `[a-zA-Z0-9_-@$.]`, but
these do not correspond to valid C++ identifiers or typenames. As a result, we
need to either restrict the CDDL identifier character set or define a mapping to
C++ identifiers and typenames. We chose the latter, since CDDL prefers '-' over
'\_'. The mapping to C++ identifiers is done by converting '-' to '\_' and the
mapping to C++ typenames is done by converting to camel case on words delimited
by '-'. As a result, `[@$.]` are still disallowed in CDDL identifiers.
Additionally, the names `dead_beef` and `dead-beef` would translate to the same
C++ identifier/typename.
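A minimal sketch of the two mappings, assuming upper camel case for typenames (as the `struct X` examples above suggest) and treating only '-' as a word delimiter. `ToIdentifier` and `ToTypeName` are hypothetical names, not necessarily the ones used in the implementation.

``` c++
#include <cctype>
#include <string>

// Maps a CDDL identifier to a C++ identifier by converting '-' to '_'.
std::string ToIdentifier(const std::string& cddl_name) {
  std::string result = cddl_name;
  for (char& c : result) {
    if (c == '-') c = '_';
  }
  return result;
}

// Maps a CDDL identifier to a C++ typename by capitalizing the first letter
// of each '-'-delimited word and dropping the delimiters.
std::string ToTypeName(const std::string& cddl_name) {
  std::string result;
  bool capitalize_next = true;
  for (char c : cddl_name) {
    if (c == '-') {
      capitalize_next = true;
      continue;
    }
    if (capitalize_next) {
      c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
      capitalize_next = false;
    }
    result.push_back(c);
  }
  return result;
}
```

With this sketch, `unknown-error` becomes the identifier `unknown_error` and the typename `UnknownError`, while `dead-beef` and `dead_beef` both map to the identifier `dead_beef`, which is the collision described above.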
#### Enums
In order to simplify the sharing of enumeration values across messages (see
example below), they are implemented in C++ as enums and not enum classes. As a
result, the enum constant names are global and cannot be defined more than once.
The example below illustrates how to handle a case where you have odd enum set
intersections.
``` cddl
result = (
  success: 0,
  timeout: 1,
  unknown-error: 2,
)

message1 = {
  result: &(
    result,
    invalid-input: 10,
    internal-error: 20,
  )
}

message2 = {
  result: &(
    result,
    invalid-input: 10,  ; ERROR - redefinition of enum constant in resulting C++
    cancelled: 30,
  )
}
```
``` cddl
result = (
  success: 0,
  timeout: 1,
  unknown-error: 2,
)

invalid-input = (
  invalid-input: 10,
)

message1 = {
  result: &(
    result,
    invalid-input,
    internal-error: 20,
  )
}

message2 = {
  result: &(
    result,
    invalid-input,  ; OK - reference existing enum in resulting C++
    cancelled: 30,
  )
}
```
As a corollary, care should be taken to not allow duplicate enum constant
_values_ in enums that are used together.
**TODO(btolsch): Make this a compiler check.**
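Until such a check exists, the idea can be sketched as a pass over the flattened list of constants that make up one choice. Everything here is hypothetical, including the `EnumConstant` representation and the `FindDuplicateValues` name; it only illustrates what the check would need to detect.

``` c++
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical representation of one enum constant: (name, value).
using EnumConstant = std::pair<std::string, uint64_t>;

// Returns the names of constants whose value was already claimed by an
// earlier constant with a different name. Repeating the same name is not
// reported here, since that corresponds to including a shared group.
std::vector<std::string> FindDuplicateValues(
    const std::vector<EnumConstant>& constants) {
  std::map<uint64_t, std::string> first_name_for_value;
  std::vector<std::string> duplicates;
  for (const EnumConstant& constant : constants) {
    const auto result =
        first_name_for_value.emplace(constant.second, constant.first);
    if (!result.second && result.first->second != constant.first) {
      duplicates.push_back(constant.first);
    }
  }
  return duplicates;
}
```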
## Implementation Overview
The implementation is broken up into the following files:
- [main.cc](main.cc): Compiler driver. Command line arguments are:
  - `--header <filename>`: Specify the filename of the output header file.
    This is also the name that will be used for the include guard and as the
    include path in the source file.
  - `--cc <filename>`: Specify the filename of the output source file.
  - `--gen-dir <filename>`: Specify the directory prefix that should be added
    to the output header and source file.
  - A filename (in any position) without a preceding flag specifies the input
    file which contains the CDDL spec.
- [cddl.py](cddl.py): Python adapter to allow the tool to be invoked as a GN
action.
- [parse.cc](parse.cc): Parser which produces a tree of `AstNode`s
corresponding to the input's derivation in the grammar.
- [sema.cc](sema.cc): "Semantic analysis" step (named for clang's semantic
analysis layer) which generates a table of `CppType`s. `CppType` represents
something that will become a C++ type in the final output.
- [codegen.cc](codegen.cc): C++ generation step which outputs struct, enum,
and function declarations to the specified header file and function
definitions to the specified source file.
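The command line handling described for `main.cc` could be sketched roughly as below. This is not the code in `main.cc`; the `Options` struct and `ParseArgs` function are hypothetical and only mirror the flags listed above, including the rule that the input file may appear in any position.

``` c++
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical container for the parsed command line.
struct Options {
  std::string header;
  std::string cc;
  std::string gen_dir;
  std::string input;
  bool ok = true;
};

// Parses the arguments that follow the program name.
Options ParseArgs(const std::vector<std::string>& args) {
  Options options;
  for (size_t i = 0; i < args.size(); ++i) {
    const std::string& arg = args[i];
    std::string* target = nullptr;
    if (arg == "--header") {
      target = &options.header;
    } else if (arg == "--cc") {
      target = &options.cc;
    } else if (arg == "--gen-dir") {
      target = &options.gen_dir;
    }
    if (target == nullptr) {
      // An argument without a preceding flag names the input CDDL file.
      options.input = arg;
      continue;
    }
    if (i + 1 >= args.size()) {
      options.ok = false;  // The flag is missing its value.
      break;
    }
    *target = args[++i];
  }
  if (options.input.empty()) options.ok = false;
  return options;
}
```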
### Grammar
Since CDDL is still an IETF draft spec and the grammar has changed at least a
few times, the grammar used for this implementation is duplicated in
[grammar.md](grammar.md).