# URL Utilities

Glaze provides URL encoding/decoding utilities for parsing query strings and form data. These utilities follow the [WHATWG URL Standard](https://url.spec.whatwg.org/) for `application/x-www-form-urlencoded` parsing.

## Header

```cpp
#include "glaze/net/url.hpp"
```

The URL utilities are also available when including `glaze/net/http.hpp` or `glaze/net/http_router.hpp`.

## URL Decoding

### Basic Usage

```cpp
// Decode percent-encoded strings
std::string decoded = glz::url_decode("hello%20world");  // "hello world"
std::string path = glz::url_decode("path%2Fto%2Ffile");  // "path/to/file"

// Plus signs are decoded as spaces (form encoding)
std::string query = glz::url_decode("search+term");  // "search term"
```

### Supported Encodings

| Encoded | Decoded |
|---------|---------|
| `%20` | space |
| `%2F` | `/` |
| `%3D` | `=` |
| `%26` | `&` |
| `%3F` | `?` |
| `%23` | `#` |
| `+` | space |
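
The table maps onto a small decoding loop. The following is a minimal standalone sketch of that idea, not Glaze's actual implementation: `sketch_hex_value` and `sketch_url_decode` are hypothetical names, and copying a malformed escape such as `%zz` through unchanged is an assumption made for illustration.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: value of one hex digit, or -1 if `c` is not a hex digit.
constexpr int sketch_hex_value(char c) noexcept
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Illustrative decoder: '+' becomes a space, "%XX" becomes the byte 0xXX.
inline std::string sketch_url_decode(std::string_view in)
{
    std::string out;
    out.reserve(in.size()); // decoded output is never longer than the input
    for (std::size_t i = 0; i < in.size(); ++i) {
        const char c = in[i];
        if (c == '+') {
            out += ' ';
        }
        else if (c == '%' && i + 2 < in.size()) {
            const int hi = sketch_hex_value(in[i + 1]);
            const int lo = sketch_hex_value(in[i + 2]);
            if (hi >= 0 && lo >= 0) {
                out += static_cast<char>((hi << 4) | lo);
                i += 2; // skip the two hex digits just consumed
            }
            else {
                out += c; // assumption: keep a malformed escape literally
            }
        }
        else {
            out += c; // ordinary character, or a '%' too close to the end
        }
    }
    return out;
}
```

Because any two hex digits form a valid escape in this sketch, sequences beyond those listed in the table (for example `%2B` for `+`) decode the same way.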

### Buffer Reuse (Zero-Allocation)

For high-performance scenarios, pass a reusable buffer to avoid allocations:

```cpp
std::string buffer;
buffer.reserve(1024);  // Pre-allocate once

// Decode multiple strings without allocation (if buffer has capacity)
glz::url_decode("hello%20world", buffer);
std::cout << buffer << std::endl;  // "hello world"

glz::url_decode("foo%2Fbar", buffer);
std::cout << buffer << std::endl;  // "foo/bar"
```
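
The "if buffer has capacity" caveat comes from how `std::string` manages storage. The sketch below uses only the standard library to illustrate it. `write_into` is a hypothetical stand-in for a buffer-writing overload, and `count_capacity_changes` is a hypothetical helper that reports how often the buffer had to grow.

```cpp
#include <cstddef>
#include <initializer_list>
#include <string>
#include <string_view>

// Hypothetical stand-in for a "write into the caller's buffer" overload.
inline void write_into(std::string_view input, std::string& output)
{
    output.clear();       // size drops to 0; mainstream implementations keep the capacity
    output.append(input); // reallocates only if input.size() exceeds the capacity
}

// Counts how often the buffer's capacity changed while processing `inputs`.
inline int count_capacity_changes(std::size_t reserved, std::initializer_list<std::string_view> inputs)
{
    std::string buffer;
    buffer.reserve(reserved);
    int changes = 0;
    for (const std::string_view input : inputs) {
        const std::size_t before = buffer.capacity();
        write_into(input, buffer);
        if (buffer.capacity() != before) ++changes;
    }
    return changes;
}
```

With 1024 bytes reserved up front, short inputs should never trigger growth. Without a reservation, the first input that exceeds the small-string buffer forces one allocation, after which inputs of the same size reuse it.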

## URL Encoding

### Basic Usage

```cpp
// Encode strings for safe use in URLs
std::string encoded = glz::url_encode("hello world");     // "hello+world"
std::string path = glz::url_encode("path/to/file");       // "path%2Fto%2Ffile"
std::string special = glz::url_encode("a=b&c=d");         // "a%3Db%26c%3Dd"
```

### Encoding Rules

- **Unreserved characters** pass through unchanged: `A-Z`, `a-z`, `0-9`, `-`, `.`, `_`, `~`
- **Space** is encoded as `+` (per `application/x-www-form-urlencoded`)
- **All other characters** are percent-encoded (`%XX`)
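
The three rules above can be sketched as a single pass over the input bytes. This is an illustrative standalone version, not Glaze's source: `sketch_is_unreserved` and `sketch_url_encode` are hypothetical names, and the uppercase hex digits are chosen to match the examples in this document.

```cpp
#include <string>
#include <string_view>

// Rule 1: characters that pass through unchanged.
inline bool sketch_is_unreserved(unsigned char c) noexcept
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9') ||
           c == '-' || c == '.' || c == '_' || c == '~';
}

inline std::string sketch_url_encode(std::string_view in)
{
    static constexpr char hex[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(in.size()); // lower bound; escapes may grow the output up to 3x
    for (const char ch : in) {
        const auto c = static_cast<unsigned char>(ch);
        if (sketch_is_unreserved(c)) {
            out += ch; // rule 1: unreserved
        }
        else if (c == ' ') {
            out += '+'; // rule 2: space becomes '+'
        }
        else {
            out += '%'; // rule 3: everything else becomes %XX
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}
```

Working on `unsigned char` keeps the shift and mask well defined for bytes above 0x7F, so multi-byte UTF-8 sequences are escaped byte by byte.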

### Building Query Strings

```cpp
// Build a safe query string
std::string query = "q=" + glz::url_encode(user_search) +
                    "&category=" + glz::url_encode(category);

// Example: user_search = "C++ templates", category = "programming/advanced"
// Result: "q=C%2B%2B+templates&category=programming%2Fadvanced"
```

### Roundtrip Encoding

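Decoding an encoded string returns the original. The reverse does not hold in general, because `+` and `%20` both decode to a space: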

```cpp
std::string original = "Hello World! Special: /=&?#";
std::string encoded = glz::url_encode(original);
std::string decoded = glz::url_decode(encoded);

// decoded == original
```

### Buffer Reuse (Zero-Allocation)

```cpp
std::string buffer;
buffer.reserve(1024);

glz::url_encode("hello world", buffer);
std::cout << buffer << std::endl;  // "hello+world"

glz::url_encode("a/b", buffer);
std::cout << buffer << std::endl;  // "a%2Fb"
```

## Parsing URL-Encoded Data

The `parse_urlencoded` function parses the `key=value&key2=value2` format used in:
- URL query strings (`?limit=10&offset=20`)
- Form POST bodies (`application/x-www-form-urlencoded`)

### Basic Usage

```cpp
// Parse query string
auto params = glz::parse_urlencoded("limit=10&offset=20&sort=name");

std::cout << params["limit"];   // "10"
std::cout << params["offset"];  // "20"
std::cout << params["sort"];    // "name"
```

### Automatic Decoding

Keys and values are automatically URL-decoded:

```cpp
auto params = glz::parse_urlencoded("name=John%20Doe&city=New+York");

std::cout << params["name"];  // "John Doe"
std::cout << params["city"];  // "New York"
```

### Edge Cases

```cpp
// Empty value
auto p1 = glz::parse_urlencoded("key=");
// p1["key"] == ""

// Key without value
auto p2 = glz::parse_urlencoded("flag");
// p2["flag"] == ""

// Duplicate keys (last value wins)
auto p3 = glz::parse_urlencoded("a=1&a=2&a=3");
// p3["a"] == "3"

// Empty string
auto p4 = glz::parse_urlencoded("");
// p4 is empty

// Empty keys are skipped
auto p5 = glz::parse_urlencoded("=value&a=1");
// p5.size() == 1, p5["a"] == "1"
```
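
The edge cases above follow from a simple splitting scheme: cut the input at each `&`, cut each piece at its first `=`, decode both halves, and assign into the map. The following standalone sketch reproduces that behavior under those assumptions. `sketch_decode` and `sketch_parse_urlencoded` are hypothetical names, and the code is not taken from Glaze.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <unordered_map>
#include <utility>

// Compact decoder for the sketch: '+' -> space, "%XX" -> byte.
inline std::string sketch_decode(std::string_view in)
{
    auto hex = [](char c) -> int {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    };
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '+') {
            out += ' ';
            continue;
        }
        if (in[i] == '%' && i + 2 < in.size()) {
            const int hi = hex(in[i + 1]);
            const int lo = hex(in[i + 2]);
            if (hi >= 0 && lo >= 0) {
                out += static_cast<char>((hi << 4) | lo);
                i += 2;
                continue;
            }
        }
        out += in[i]; // ordinary character or malformed escape (assumption)
    }
    return out;
}

inline std::unordered_map<std::string, std::string> sketch_parse_urlencoded(std::string_view input)
{
    std::unordered_map<std::string, std::string> result;
    while (!input.empty()) {
        // Take everything up to the next '&' as one pair.
        const std::size_t amp = input.find('&');
        const std::string_view pair = input.substr(0, amp);
        input = (amp == std::string_view::npos) ? std::string_view{} : input.substr(amp + 1);

        // Split the pair at its first '='; a missing '=' means an empty value.
        const std::size_t eq = pair.find('=');
        const std::string_view raw_key = pair.substr(0, eq);
        const std::string_view raw_value =
            (eq == std::string_view::npos) ? std::string_view{} : pair.substr(eq + 1);

        std::string key = sketch_decode(raw_key);
        if (key.empty()) continue;                         // empty keys are skipped
        result[std::move(key)] = sketch_decode(raw_value); // assignment: last value wins
    }
    return result;
}
```

Using `operator[]` with assignment is what makes the last duplicate win; switching to `emplace` would keep the first value instead.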

### Buffer Reuse (Zero-Allocation)

For server applications processing many requests, reuse buffers to minimize allocations:

```cpp
// Reusable buffers - allocate once, reuse for all requests
std::unordered_map<std::string, std::string> params;
std::string key_buffer;
std::string value_buffer;

// Pre-reserve capacity
key_buffer.reserve(64);
value_buffer.reserve(256);

// Process multiple query strings, reusing the map and decode buffers
for (const auto& query : incoming_queries) {
    glz::parse_urlencoded(query, params, key_buffer, value_buffer);

    // Process params...
    handle_request(params);
}
```

There's also a two-argument overload that manages key/value buffers internally:

```cpp
std::unordered_map<std::string, std::string> params;

glz::parse_urlencoded("foo=bar&baz=qux", params);
// params is populated, internal buffers used for decoding
```

## Splitting URL Targets

The `split_target` function separates a URL path from its query string:

```cpp
auto [path, query] = glz::split_target("/api/users?limit=10&offset=20");
// path  == "/api/users"
// query == "limit=10&offset=20"

auto [path2, query2] = glz::split_target("/api/users");
// path2  == "/api/users"
// query2 == ""  (empty)
```

This is useful when you need to process the path and query string separately:

```cpp
std::string_view target = "/search?q=hello%20world&page=1";

auto [path, query_string] = glz::split_target(target);
auto params = glz::parse_urlencoded(query_string);

std::cout << "Path: " << path << std::endl;           // "/search"
std::cout << "Query: " << params["q"] << std::endl;   // "hello world"
std::cout << "Page: " << params["page"] << std::endl; // "1"
```
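
Conceptually, splitting a target amounts to finding the first `?`. A minimal sketch of that idea follows. `sketch_target_components` and `sketch_split_target` are hypothetical names mirroring the documented shape, and the handling of a second `?` (left inside the query string) is an assumption.

```cpp
#include <string_view>

struct sketch_target_components
{
    std::string_view path{};
    std::string_view query_string{};
};

// Splits at the first '?'; everything after it is treated as the query string.
constexpr sketch_target_components sketch_split_target(std::string_view target) noexcept
{
    const auto pos = target.find('?');
    if (pos == std::string_view::npos) {
        return {target, {}}; // no query string present
    }
    return {target.substr(0, pos), target.substr(pos + 1)};
}
```

Both members are `std::string_view`s into the input, so nothing is copied, but the original target string must outlive the returned components.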

## Integration with HTTP Server

When using the Glaze HTTP server, query parameters are automatically parsed and available in the request object:

```cpp
server.get("/api/users", [](const glz::request& req, glz::response& res) {
    // Query parameters are automatically parsed
    // For request: GET /api/users?limit=10&offset=20

    if (auto it = req.query.find("limit"); it != req.query.end()) {
        int limit = std::stoi(it->second);  // 10
    }

    if (auto it = req.query.find("offset"); it != req.query.end()) {
        int offset = std::stoi(it->second);  // 20
    }

    // req.path contains just the path without query string
    // req.path == "/api/users"

    // req.target contains the full request target (path plus query string)
    // req.target == "/api/users?limit=10&offset=20"
});
```
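
Query values come from the client, and `std::stoi` throws `std::invalid_argument` or `std::out_of_range` on bad input. One possible non-throwing alternative is sketched below with `std::from_chars`. `parse_int` and `query_int_or` are hypothetical helpers, and the map type is an assumption standing in for whatever `req.query` actually is.

```cpp
#include <charconv>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>
#include <unordered_map>

// Hypothetical helper: parse the whole text as an int, or return std::nullopt.
inline std::optional<int> parse_int(std::string_view text) noexcept
{
    int value = 0;
    const char* first = text.data();
    const char* last = first + text.size();
    const auto [ptr, ec] = std::from_chars(first, last, value);
    // Reject parse errors, overflow, and trailing garbage such as "10x".
    if (ec != std::errc{} || ptr != last) return std::nullopt;
    return value;
}

// Hypothetical helper: look up `key`, falling back when it is missing or invalid.
// Assumes the query container behaves like this unordered_map.
inline int query_int_or(const std::unordered_map<std::string, std::string>& query,
                        const std::string& key, int fallback)
{
    const auto it = query.find(key);
    if (it == query.end()) return fallback;
    return parse_int(it->second).value_or(fallback);
}
```

Inside a handler this could be used as `int limit = query_int_or(req.query, "limit", 50);`, provided `req.query` converts to the assumed map type.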

## Parsing Form POST Data

For `application/x-www-form-urlencoded` POST requests, use `parse_urlencoded` on the request body:

```cpp
server.post("/login", [](const glz::request& req, glz::response& res) {
    // Check content type
    auto ct = req.headers.find("content-type");
    if (ct == req.headers.end() ||
        ct->second.find("application/x-www-form-urlencoded") == std::string::npos) {
        res.status(415).json({{"error", "Unsupported content type"}});
        return;
    }

    // Parse form data from body
    auto form = glz::parse_urlencoded(req.body);

    std::string username = form["username"];
    std::string password = form["password"];

    // Authenticate...
});
```
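
The substring check above also accepts a header that merely mentions the media type inside a parameter, and it misses values written in a different case (media types are case-insensitive). A stricter comparison could look like the sketch below. `is_form_urlencoded` is a hypothetical helper, not part of Glaze.

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>

// Hypothetical helper: true if the media type (ignoring parameters and case)
// is application/x-www-form-urlencoded.
inline bool is_form_urlencoded(std::string_view content_type) noexcept
{
    constexpr std::string_view expected = "application/x-www-form-urlencoded";

    // Drop parameters such as "; charset=UTF-8".
    content_type = content_type.substr(0, content_type.find(';'));

    // Trim surrounding spaces and tabs.
    while (!content_type.empty() && (content_type.front() == ' ' || content_type.front() == '\t')) {
        content_type.remove_prefix(1);
    }
    while (!content_type.empty() && (content_type.back() == ' ' || content_type.back() == '\t')) {
        content_type.remove_suffix(1);
    }

    // Case-insensitive comparison against the lowercase expected value.
    if (content_type.size() != expected.size()) return false;
    for (std::size_t i = 0; i < expected.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(content_type[i])) != expected[i]) return false;
    }
    return true;
}
```

In the handler, the condition would then become `ct == req.headers.end() || !is_form_urlencoded(ct->second)`.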

## API Reference

### `url_decode`

```cpp
// Returns decoded string (allocates)
[[nodiscard]] std::string url_decode(std::string_view input);

// Writes to buffer (can avoid allocation if buffer has capacity)
void url_decode(std::string_view input, std::string& output);
```

### `url_encode`

```cpp
// Returns encoded string (allocates)
[[nodiscard]] std::string url_encode(std::string_view input);

// Writes to buffer (can avoid allocation if buffer has capacity)
void url_encode(std::string_view input, std::string& output);
```

### `parse_urlencoded`

```cpp
// Returns new map (allocates)
[[nodiscard]] std::unordered_map<std::string, std::string>
    parse_urlencoded(std::string_view query_string);

// Writes to provided map (reuses map capacity)
void parse_urlencoded(std::string_view query_string,
                      std::unordered_map<std::string, std::string>& output);

// Full buffer control (minimal allocations)
void parse_urlencoded(std::string_view query_string,
                      std::unordered_map<std::string, std::string>& output,
                      std::string& key_buffer,
                      std::string& value_buffer);
```

### `split_target`

```cpp
struct target_components {
    std::string_view path{};
    std::string_view query_string{};
};

constexpr target_components split_target(std::string_view target) noexcept;
```

### `hex_char_to_int`

```cpp
// Convert hex character to integer (0-15), returns -1 for invalid input
constexpr int hex_char_to_int(char c) noexcept;
```