# Security Considerations

Glaze is designed with security in mind, particularly for parsing untrusted data from network sources. This document describes the security measures in place and best practices for handling potentially malicious input.

## Binary Format Security (BEVE, MessagePack, CBOR)

### Memory Bomb (DoS) Protection

Binary formats like BEVE encode length headers that indicate how many elements or bytes follow. A malicious actor could craft a message with a huge length header (e.g., claiming 1 billion elements) but minimal actual data. Without proper protection, this could cause:

- **Memory exhaustion**: The parser calls `resize()` with the malicious size, consuming all available memory
- **Process termination**: The system's OOM killer terminates the process
- **Denial of Service**: The server becomes unresponsive

#### How Glaze Protects Against This

Glaze validates length headers against the remaining buffer size **before** any memory allocation. If a length header claims more data than exists in the buffer, parsing fails immediately with `error_code::invalid_length`.

```cpp
// Example: Malicious buffer claiming 1 billion strings
std::vector<std::string> result;
auto ec = glz::read_beve(result, malicious_buffer);
// ec.ec == glz::error_code::invalid_length
// No memory was allocated for the claimed 1 billion strings
```

This protection applies to:

- **Strings**: Length must not exceed remaining buffer bytes
- **Typed arrays** (numeric, boolean, string, complex): Element count validated against buffer size
- **Generic arrays**: Element count validated against buffer size
- **Maps/Objects**: Entry count validated against buffer size

#### Protection Guarantees

| Data Type | Validation |
|-----------|------------|
| `std::string` | Length ≤ remaining buffer bytes |
| `std::vector<T>` (numeric) | Count × sizeof(T) ≤ remaining buffer bytes |
| `std::vector<bool>` | (Count + 7) / 8 ≤ remaining buffer bytes |
| `std::vector<std::string>` | Count ≤ remaining buffer bytes (minimum 1 byte per string header) |
| Generic arrays | Count ≤ remaining buffer bytes (minimum 1 byte per element) |

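The size checks in the table can be sketched in isolation. The helpers below are hypothetical (`elements_fit` and `packed_bools_fit` are illustrative names, not Glaze functions) and only show one way to compare a claimed count against the remaining bytes without overflowing the arithmetic:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helpers for illustration; not part of Glaze's API.

// True if `count` elements of `elem_size` bytes each fit in `remaining` bytes.
// Dividing instead of multiplying avoids overflow in count * elem_size.
constexpr bool elements_fit(std::uint64_t count, std::size_t elem_size, std::size_t remaining)
{
   return elem_size != 0 && count <= remaining / elem_size;
}

// Packed booleans take one bit each, rounded up to whole bytes.
// Written as count / 8 plus a remainder byte so that (count + 7) cannot overflow.
constexpr bool packed_bools_fit(std::uint64_t count, std::size_t remaining)
{
   const std::uint64_t bytes = count / 8 + (count % 8 != 0 ? 1 : 0);
   return bytes <= remaining;
}
```

Dividing `remaining` by the element size, rather than multiplying the claimed count, keeps the comparison safe even when a header claims a count near the largest 64-bit value.
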
#### 32-bit System Considerations

On 32-bit systems, BEVE length headers that use the 8-byte encoding (needed for values of 2^30 and above) are rejected with `invalid_length`, since these values cannot be addressed in a 32-bit memory space.

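As a rough illustration of that rule, the sketch below decodes a variable-width length header and refuses the 8-byte form when `std::size_t` is narrower than 64 bits. The layout is an assumption modeled on BEVE's compressed integers (the low two bits of the first byte select a 1, 2, 4, or 8 byte little-endian value, and the remaining bits carry the length); consult the BEVE specification for the authoritative encoding. `decode_length` is a hypothetical name, not a Glaze function.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical decoder for illustration; not Glaze's implementation.
// Assumed layout: low two bits of the first byte give the width
// (1, 2, 4, or 8 bytes, little-endian); the remaining bits hold the length.
// Returns std::nullopt for truncated input or a length the platform cannot hold.
inline std::optional<std::size_t> decode_length(const std::vector<std::uint8_t>& buf, std::size_t& pos)
{
   if (pos >= buf.size()) return std::nullopt;

   const std::size_t width = std::size_t{1} << (buf[pos] & 0b11);
   if (buf.size() - pos < width) return std::nullopt; // header itself is truncated

   if constexpr (sizeof(std::size_t) < 8) {
      // On 32-bit targets, reject the 8-byte form outright.
      if (width == 8) return std::nullopt;
   }

   std::uint64_t raw = 0;
   for (std::size_t i = 0; i < width; ++i) {
      raw |= std::uint64_t{buf[pos + i]} << (8 * i);
   }
   pos += width;
   return static_cast<std::size_t>(raw >> 2); // drop the two width bits
}
```

On failure `pos` is left unchanged, so a caller can report the offset of the offending header.
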
### User-Configurable Allocation Limits

For applications that need stricter control over memory allocation, Glaze provides compile-time options to limit the maximum size of strings and arrays during parsing.

#### Global Limits via Custom Options

Apply limits to all strings/arrays in a parse operation:

```cpp
// Define custom options with allocation limits
struct secure_opts : glz::opts
{
   uint32_t format = glz::BEVE;
   size_t max_string_length = 1024;    // Max 1KB per string
   size_t max_array_size = 10000;      // Max 10,000 elements per array
};

std::vector<std::string> data;
auto ec = glz::read<secure_opts{}>(data, buffer);
if (ec.ec == glz::error_code::invalid_length) {
    // A string or array exceeded the configured limit
}
```

#### Per-Field Limits via Wrapper

Apply limits to specific fields using `glz::max_length`:

```cpp
struct UserInput
{
   std::string username;       // Should be limited
   std::string bio;            // Can be longer
   std::vector<int> scores;    // Should be limited
};

template <>
struct glz::meta<UserInput>
{
   using T = UserInput;
   static constexpr auto value = object(
      "username", glz::max_length<&T::username, 64>,   // Max 64 chars
      "bio", &T::bio,                                   // No limit
      "scores", glz::max_length<&T::scores, 100>       // Max 100 elements
   );
};
```

These options work together with buffer-based validation:

1. **Buffer validation**: Rejects claims that exceed buffer capacity (always enabled)
2. **Allocation limits**: Rejects allocations that exceed user-defined limits (optional)

Layering the two checks lets you accept legitimately large data while still bounding memory usage.

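The ordering of the two checks can be sketched as follows. This is a minimal illustration under stated assumptions, not Glaze's internals: `length_check` and `check_string_length` are hypothetical names, and the optional limit stands in for a setting such as `max_string_length`.

```cpp
#include <cstddef>
#include <optional>

// Hypothetical result type for illustration; not part of Glaze.
enum class length_check { ok, exceeds_buffer, exceeds_limit };

// Check a claimed string length in two stages:
//   1. against the bytes actually remaining in the buffer (always applied)
//   2. against a user-defined maximum, if one is configured
constexpr length_check check_string_length(std::size_t claimed, std::size_t remaining,
                                           std::optional<std::size_t> max_length)
{
   if (claimed > remaining) return length_check::exceeds_buffer;
   if (max_length && claimed > *max_length) return length_check::exceeds_limit;
   return length_check::ok;
}
```

In this sketch both failures would map to the same `invalid_length` error described above; they are kept separate here only to make the two stages visible.
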
### Best Practices for Network Applications

1. **Limit input buffer size**: Control the maximum message size your application accepts at the network layer, before passing data to Glaze.

```cpp
constexpr size_t MAX_MESSAGE_SIZE = 1024 * 1024; // 1 MB limit

void handle_message(const std::span<const std::byte> data) {
    if (data.size() > MAX_MESSAGE_SIZE) {
        // Reject oversized messages before parsing
        return;
    }

    MyStruct obj;
    auto ec = glz::read_beve(obj, data);
    if (ec) {
        // Handle parse error
    }
}
```

2. **Use appropriate container types**: Consider using fixed-size containers like `std::array` when the maximum size is known at compile time.

3. **Validate field values**: Use `glz::read_constraint` to enforce business logic rules on individual fields as they are read, so that a violated constraint fails the parse.

```cpp
template <>
struct glz::meta<UserInput> {
    using T = UserInput;
    static constexpr auto value = object(
        "username", glz::read_constraint<&T::username, [](const auto& s) {
            return s.size() <= 64;
        }>
    );
};
```

### Raw Pointer Allocation Safety

By default, Glaze refuses to allocate memory for null raw pointers during deserialization. This prevents a class of memory leaks where the parser would call `new` without any mechanism to ensure the memory is freed.

```cpp
struct example { int x, y, z; };

std::vector<example*> vec;
auto ec = glz::read_beve(vec, buffer);
// ec.ec == glz::error_code::invalid_nullable_read (safe default)
```

If your application requires raw pointer allocation, you can enable it with `allocate_raw_pointers = true`:

```cpp
struct alloc_opts : glz::opts {
   bool allocate_raw_pointers = true;
};

std::vector<example*> vec;
auto ec = glz::read<alloc_opts{}>(vec, buffer);
// Works, but you MUST manually delete allocated pointers
for (auto* p : vec) delete p;
```

> [!WARNING]
> When enabling `allocate_raw_pointers`, your application is responsible for tracking and freeing all allocated memory. Consider using smart pointers (`std::unique_ptr`, `std::shared_ptr`) instead, which Glaze handles automatically and safely.

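The leak that the warning describes is easiest to see when a parse fails partway through. The sketch below uses only the standard library: `tracked` and `fill_then_fail` are hypothetical stand-ins for a parsed type and for a parser that allocates some elements before reporting an error. It makes no claims about Glaze's internals.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical type that counts live instances so leaks are observable.
struct tracked
{
   static inline int live = 0;
   tracked() { ++live; }
   ~tracked() { --live; }
   tracked(const tracked&) = delete;
   tracked& operator=(const tracked&) = delete;
};

// Stand-in for a parser that allocates `n` elements and then reports failure.
inline bool fill_then_fail(std::vector<std::unique_ptr<tracked>>& out, std::size_t n)
{
   for (std::size_t i = 0; i < n; ++i) {
      out.push_back(std::make_unique<tracked>());
   }
   return false; // simulate a parse error after partial allocation
}

// Returns the number of live objects after a failed "parse" goes out of scope.
inline int live_after_failed_parse()
{
   {
      std::vector<std::unique_ptr<tracked>> vec;
      if (!fill_then_fail(vec, 3)) {
         // Error path: no manual cleanup is needed here.
      }
   } // vec is destroyed and each unique_ptr frees its element
   return tracked::live;
}
```

With `std::vector<tracked*>` the same error path would need an explicit `delete` loop on every exit, which is the bookkeeping the safe default avoids.
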
## JSON Format Security

### Safe Parsing Defaults

Glaze's JSON parser applies the following safeguards by default:

- **No buffer overruns**: All parsing operations validate bounds before accessing data
- **No null pointer dereferences**: Null checks are performed where applicable
- **Integer overflow protection**: Numeric parsing handles overflow conditions

### Untrusted Input

When parsing JSON from untrusted sources:

1. **Limit recursion depth**: Deeply nested structures can cause stack overflow. Consider flattening data structures or implementing depth limits at the application level.

2. **Limit string sizes**: Very large strings can consume excessive memory. Control this through input buffer size limits.

3. **Handle unknown keys**: Use `error_on_unknown_keys = false` if you want to ignore unexpected fields, or keep it `true` (default) to reject messages with unknown structure.

```cpp
// Reject messages with unknown keys (default behavior)
constexpr glz::opts strict_opts{.error_on_unknown_keys = true};

// Or allow unknown keys to be ignored
constexpr glz::opts lenient_opts{.error_on_unknown_keys = false};
```

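One way to apply an application-level depth limit is to scan the input before handing it to the parser. The function below is a minimal sketch, not part of Glaze: `within_depth` is a hypothetical name, and it only counts bracket nesting outside string literals without validating the JSON.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical pre-check for illustration; not a Glaze function.
// Returns true if object/array nesting never exceeds `max_depth`.
// Brackets inside string literals are ignored; the input is not validated.
constexpr bool within_depth(std::string_view json, std::size_t max_depth)
{
   std::size_t depth = 0;
   bool in_string = false;
   bool escaped = false;
   for (const char c : json) {
      if (in_string) {
         if (escaped) {
            escaped = false; // character after a backslash is consumed as-is
         }
         else if (c == '\\') {
            escaped = true;
         }
         else if (c == '"') {
            in_string = false;
         }
         continue;
      }
      switch (c) {
      case '"':
         in_string = true;
         break;
      case '[':
      case '{':
         if (++depth > max_depth) return false;
         break;
      case ']':
      case '}':
         if (depth > 0) --depth;
         break;
      default:
         break;
      }
   }
   return true;
}
```

A caller would reject the message when `within_depth` returns `false` and only then pass the buffer on for parsing. The scan is linear in the input size and allocates nothing.
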
## Error Handling

Always check return values when parsing untrusted data:

```cpp
auto ec = glz::read_beve(obj, buffer);
if (ec) {
    // Parsing failed - do not use obj
    std::cerr << glz::format_error(ec, buffer) << '\n';
    return;
}
// Safe to use obj
```

Error codes that may indicate malicious input:

| Error Code | Description |
|------------|-------------|
| `invalid_length` | Length exceeds allowed limit (buffer size or user-configured max) |
| `unexpected_end` | Buffer truncated during parsing |
| `syntax_error` | Invalid data structure or type mismatch |
| `parse_error` | Malformed data that doesn't match expected format |

## Reporting Security Issues

If you discover a security vulnerability in Glaze, please report it responsibly by opening an issue at [https://github.com/stephenberry/glaze/issues](https://github.com/stephenberry/glaze/issues).