# Security Considerations
Glaze is designed with security in mind, particularly for parsing untrusted data from network sources. This document describes the security measures in place and best practices for handling potentially malicious input.
## Binary Format Security (BEVE, MessagePack, CBOR)
### Memory Bomb (DoS) Protection
Binary formats like BEVE encode length headers that indicate how many elements or bytes follow. A malicious actor could craft a message with a huge length header (e.g., claiming 1 billion elements) but minimal actual data. Without proper protection, this could cause:
- **Memory exhaustion**: The parser calls `resize()` with the malicious size, consuming all available memory
- **Process termination**: The system's OOM killer terminates the process
- **Denial of Service**: The server becomes unresponsive
#### How Glaze Protects Against This
Glaze validates length headers against the remaining buffer size **before** any memory allocation. If a length header claims more data than exists in the buffer, parsing fails immediately with `error_code::invalid_length`.
```cpp
// Example: Malicious buffer claiming 1 billion strings
std::vector<std::string> result;
auto ec = glz::read_beve(result, malicious_buffer);
// ec.ec == glz::error_code::invalid_length
// No memory was allocated for the claimed 1 billion strings
```
This protection applies to:
- **Strings**: Length must not exceed remaining buffer bytes
- **Typed arrays** (numeric, boolean, string, complex): Element count validated against buffer size
- **Generic arrays**: Element count validated against buffer size
- **Maps/Objects**: Entry count validated against buffer size
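
The check-before-allocate idea can be sketched outside of Glaze. The snippet below is a minimal illustration, not Glaze's implementation: the wire format (a 4-byte length header in host byte order followed by the payload) and the function name `read_prefixed_string` are assumptions made for this example and are not BEVE.

```cpp
#include <cstdint>
#include <cstring>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical wire format for illustration only (NOT BEVE):
// a 4-byte length header in host byte order, followed by that many bytes.
inline std::optional<std::string> read_prefixed_string(std::string_view buf)
{
   std::uint32_t claimed = 0;
   if (buf.size() < sizeof(claimed)) {
      return std::nullopt; // truncated header
   }
   std::memcpy(&claimed, buf.data(), sizeof(claimed));
   buf.remove_prefix(sizeof(claimed));

   // Compare the claim with the bytes actually present BEFORE allocating.
   if (claimed > buf.size()) {
      return std::nullopt; // plays the role of invalid_length
   }

   // The allocation is now bounded by the size of the input itself.
   return std::string(buf.substr(0, claimed));
}
```

With this ordering, a header claiming one billion bytes in front of a few bytes of payload is rejected without any large allocation taking place.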
#### Protection Guarantees
| Data Type | Validation |
|-----------|------------|
| `std::string` | Length ≤ remaining buffer bytes |
| `std::vector<T>` (numeric) | Count × sizeof(T) ≤ remaining buffer bytes |
| `std::vector<bool>` | (Count + 7) / 8 ≤ remaining buffer bytes |
| `std::vector<std::string>` | Count ≤ remaining buffer bytes (minimum 1 byte per string header) |
| Generic arrays | Count ≤ remaining buffer bytes (minimum 1 byte per element) |
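
The inequalities in the table can be written as small predicates. This is a sketch under stated assumptions: the function names are hypothetical, and the checks are phrased with division rather than multiplication or `Count + 7` so that an attacker-chosen count cannot overflow the arithmetic. Whether Glaze phrases its checks this way is not stated in this document.

```cpp
#include <cstddef>
#include <cstdint>

// Count * sizeof(T) <= remaining, written as a division so that a huge
// count cannot overflow the multiplication.
template <class T>
constexpr bool numeric_array_fits(std::size_t count, std::size_t remaining)
{
   return count <= remaining / sizeof(T);
}

// Packed booleans: one bit per element, rounded up to whole bytes.
// Equivalent to (count + 7) / 8 <= remaining, without the addition overflowing.
constexpr bool bool_array_fits(std::size_t count, std::size_t remaining)
{
   return count / 8 + (count % 8 != 0 ? 1 : 0) <= remaining;
}

// Strings and generic elements: each needs at least one header byte.
constexpr bool element_count_fits(std::size_t count, std::size_t remaining)
{
   return count <= remaining;
}

static_assert(numeric_array_fits<std::uint64_t>(4, 32));
static_assert(!numeric_array_fits<std::uint64_t>(5, 32));
static_assert(bool_array_fits(9, 2));
static_assert(!bool_array_fits(17, 2));
```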
#### 32-bit System Considerations
On 32-bit systems, BEVE length headers using 8-byte encoding (for values > 2^30) are rejected with `invalid_length` since these values cannot be addressed in 32-bit memory space.
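
The underlying concern is that a 64-bit length read from the wire may not be representable in the platform's size type. A minimal sketch of that check follows. `length_representable` is a hypothetical helper, and `SizeT` stands in for `std::size_t` so the 32-bit case can be exercised on any machine. Note that the behavior described above is stricter than this check, since it rejects every 8-byte encoded header on 32-bit targets.

```cpp
#include <cstdint>
#include <limits>

// SizeT stands in for the platform's std::size_t so that the 32-bit case
// can be exercised on any machine.
template <class SizeT>
constexpr bool length_representable(std::uint64_t claimed)
{
   return claimed <= std::numeric_limits<SizeT>::max();
}

static_assert(length_representable<std::uint32_t>(1024));
static_assert(!length_representable<std::uint32_t>(std::uint64_t{1} << 40));
static_assert(length_representable<std::uint64_t>(std::uint64_t{1} << 40));
```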
### User-Configurable Allocation Limits
For applications that need stricter control over memory allocation, Glaze provides compile-time options to limit the maximum size of strings and arrays during parsing.
#### Global Limits via Custom Options
Apply limits to all strings/arrays in a parse operation:
```cpp
// Define custom options with allocation limits
struct secure_opts : glz::opts
{
   uint32_t format = glz::BEVE;
   size_t max_string_length = 1024; // Max 1KB per string
   size_t max_array_size = 10000;   // Max 10,000 elements per array
};

std::vector<std::string> data;
auto ec = glz::read<secure_opts{}>(data, buffer);
if (ec.ec == glz::error_code::invalid_length) {
   // A string or array exceeded the configured limit
}
```
#### Per-Field Limits via Wrapper
Apply limits to specific fields using `glz::max_length`:
```cpp
struct UserInput
{
   std::string username;    // Should be limited
   std::string bio;         // Can be longer
   std::vector<int> scores; // Should be limited
};

template <>
struct glz::meta<UserInput>
{
   using T = UserInput;
   static constexpr auto value = object(
      "username", glz::max_length<&T::username, 64>, // Max 64 chars
      "bio", &T::bio,                                // No limit
      "scores", glz::max_length<&T::scores, 100>     // Max 100 elements
   );
};
```
These options work together with buffer-based validation:
1. **Buffer validation**: Rejects claims that exceed buffer capacity (always enabled)
2. **Allocation limits**: Rejects allocations that exceed user-defined limits (optional)
This allows you to accept legitimately large data while protecting against excessive memory usage.
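
How the two layers compose can be sketched as a single decision function. The names `length_check` and `check_length` are hypothetical and exist only for this illustration. According to the error table in this document, Glaze reports both failure cases as `invalid_length`; the sketch keeps them separate only to make the ordering visible.

```cpp
#include <cstddef>
#include <optional>

enum class length_check { ok, exceeds_buffer, exceeds_limit };

// `user_limit` is empty when no allocation limit has been configured.
inline length_check check_length(std::size_t claimed, std::size_t remaining,
                                 std::optional<std::size_t> user_limit)
{
   // Layer 1: always enforced, independent of configuration.
   if (claimed > remaining) return length_check::exceeds_buffer;
   // Layer 2: opt-in, applies even when the data really is in the buffer.
   if (user_limit && claimed > *user_limit) return length_check::exceeds_limit;
   return length_check::ok;
}
```

A 2,000-byte string inside a 4,096-byte buffer passes the first layer, and is rejected by the second only if a limit below 2,000 has been configured.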
### Best Practices for Network Applications
1. **Limit input buffer size**: Control the maximum message size your application accepts at the network layer, before passing data to Glaze.
```cpp
constexpr size_t MAX_MESSAGE_SIZE = 1024 * 1024; // 1 MB limit

void handle_message(const std::span<const std::byte> data) {
   if (data.size() > MAX_MESSAGE_SIZE) {
      // Reject oversized messages before parsing
      return;
   }

   MyStruct obj;
   auto ec = glz::read_beve(obj, data);
   if (ec) {
      // Handle parse error
   }
}
```
2. **Use appropriate container types**: Consider using fixed-size containers like `std::array` when the maximum size is known at compile time.
3. **Validate after parsing**: Use `glz::read_constraint` for business logic validation after successful parsing.
```cpp
template <>
struct glz::meta<UserInput> {
   using T = UserInput;
   static constexpr auto value = object(
      "username", glz::read_constraint<&T::username, [](const auto& s) {
         return s.size() <= 64;
      }>
   );
};
```
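
Rules that span several fields, or that do not map onto a single member, can also live in a plain function that runs after a successful parse. The sketch below is independent of Glaze. `is_acceptable` and the specific rules (non-empty username, non-negative scores) are assumptions chosen for illustration, and the struct mirrors the `UserInput` example used earlier.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Mirrors the UserInput struct used earlier in this document.
struct UserInput
{
   std::string username;
   std::string bio;
   std::vector<int> scores;
};

// Hypothetical business rules, chosen only for illustration.
inline bool is_acceptable(const UserInput& in)
{
   constexpr std::size_t max_username = 64;
   constexpr std::size_t max_scores = 100;
   if (in.username.empty() || in.username.size() > max_username) return false;
   if (in.scores.size() > max_scores) return false;
   for (const int s : in.scores) {
      if (s < 0) return false; // assumed rule: scores are non-negative
   }
   return true;
}
```

Calling such a function only after the parse has reported success keeps format-level errors and business-rule violations separate.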
### Raw Pointer Allocation Safety
By default, Glaze refuses to allocate memory for null raw pointers during deserialization. This prevents a class of memory leaks where the parser would call `new` without any mechanism to ensure the memory is freed.
```cpp
struct example { int x, y, z; };
std::vector<example*> vec;
auto ec = glz::read_beve(vec, buffer);
// ec.ec == glz::error_code::invalid_nullable_read (safe default)
```
If your application requires raw pointer allocation, you can enable it with `allocate_raw_pointers = true`:
```cpp
struct alloc_opts : glz::opts {
   bool allocate_raw_pointers = true;
};

std::vector<example*> vec;
auto ec = glz::read<alloc_opts{}>(vec, buffer);
// Works, but you MUST manually delete allocated pointers
for (auto* p : vec) delete p;
```
> [!WARNING]
> When enabling `allocate_raw_pointers`, your application is responsible for tracking and freeing all allocated memory. Consider using smart pointers (`std::unique_ptr`, `std::shared_ptr`) instead, which Glaze handles automatically and safely.
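
The ownership argument for smart pointers can be seen without involving a parser. In the sketch below, `tracked_example` and `make_examples` are hypothetical names, and `make_examples` merely stands in for any code that fills a container with heap-allocated elements. The live-instance counter shows that nothing remains allocated once the container goes out of scope.

```cpp
#include <memory>
#include <vector>

// Counts live instances so that the absence of leaks can be observed.
struct tracked_example
{
   static inline int live = 0;
   int x = 0;
   explicit tracked_example(int v) : x(v) { ++live; }
   tracked_example(const tracked_example&) = delete;
   tracked_example& operator=(const tracked_example&) = delete;
   ~tracked_example() { --live; }
};

// Stand-in for a parser that populates the container; no Glaze call is made.
inline std::vector<std::unique_ptr<tracked_example>> make_examples(int n)
{
   std::vector<std::unique_ptr<tracked_example>> out;
   for (int i = 0; i < n; ++i) {
      out.push_back(std::make_unique<tracked_example>(i));
   }
   return out;
}
```

With a `std::vector` of raw pointers, the same cleanup would require an explicit `delete` loop on every exit path, including early returns on error.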
## JSON Format Security
### Safe Parsing Defaults
- **No buffer overruns**: All parsing operations validate bounds before accessing data
- **No null pointer dereferences**: Null checks are performed where applicable
- **Integer overflow protection**: Numeric parsing handles overflow conditions
### Untrusted Input
When parsing JSON from untrusted sources:
1. **Limit recursion depth**: Deeply nested structures can cause stack overflow. Consider flattening data structures or implementing depth limits at the application level.
2. **Limit string sizes**: Very large strings can consume excessive memory. Control this through input buffer size limits.
3. **Handle unknown keys**: Use `error_on_unknown_keys = false` if you want to ignore unexpected fields, or keep it `true` (default) to reject messages with unknown structure.
```cpp
// Reject messages with unknown keys (default behavior)
constexpr glz::opts strict_opts{.error_on_unknown_keys = true};
// Or allow unknown keys to be ignored
constexpr glz::opts lenient_opts{.error_on_unknown_keys = false};
```
## Error Handling
Always check return values when parsing untrusted data:
```cpp
auto ec = glz::read_beve(obj, buffer);
if (ec) {
   // Parsing failed - do not use obj
   std::cerr << glz::format_error(ec, buffer) << '\n';
   return;
}
// Safe to use obj
```
Error codes that may indicate malicious input:
| Error Code | Description |
|------------|-------------|
| `invalid_length` | Length exceeds allowed limit (buffer size or user-configured max) |
| `unexpected_end` | Buffer truncated during parsing |
| `syntax_error` | Invalid data structure or type mismatch |
| `parse_error` | Malformed data that doesn't match expected format |
## Reporting Security Issues
If you discover a security vulnerability in Glaze, please report it responsibly by opening an issue at [https://github.com/stephenberry/glaze/issues](https://github.com/stephenberry/glaze/issues).