A collection of [LPEG](http://www.inf.puc-rio.br/~roberto/lpeg/lpeg.html) patterns

## Use cases

  - Strict validation of user input
  - Searching free-form input
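
Both use cases come down to how a pattern is anchored. The following is a minimal sketch, assuming the modules load as `lpeg_patterns.<name>` (the `require` path is an assumption, it is not stated in this README) and that the patterns match a prefix of the subject, as LPeg patterns normally do:

```lua
local lpeg = require "lpeg"
-- Assumed module path; adjust to however the library is installed.
local IPv4 = require "lpeg_patterns.IPv4"
local IPv4address = IPv4.IPv4address

-- Strict validation: require the pattern to consume the whole input.
local strict = IPv4address * lpeg.P(-1)

-- Searching: the usual LPeg idiom, retry one character further on failure.
local anywhere = lpeg.P{ IPv4address + 1 * lpeg.V(1) }

local ok = strict:match("192.0.2.1")               -- an IPv4 object
local bad = strict:match("192.0.2.1 and more")     -- nil, trailing text is rejected
local found = anywhere:match("ping 192.0.2.1 now") -- an IPv4 object
print(ok, bad, found)
```

Without the `lpeg.P(-1)` anchor, a subject such as `1.2.3.256` could still match its valid prefix, which is usually not what input validation wants.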


## Modules

### `core`

A small module implementing commonly used rules from [RFC-5234 appendix B.1](https://tools.ietf.org/html/rfc5234#appendix-B.1)

  - `ALPHA` (pattern)
  - `BIT` (pattern)
  - `CHAR` (pattern)
  - `CR` (pattern)
  - `CRLF` (pattern)
  - `CTL` (pattern)
  - `DIGIT` (pattern)
  - `DQUOTE` (pattern)
  - `HEXDIG` (pattern)
  - `HTAB` (pattern)
  - `LF` (pattern)
  - `LWSP` (pattern)
  - `OCTET` (pattern)
  - `SP` (pattern)
  - `VCHAR` (pattern)
  - `WSP` (pattern)
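
Because these are ordinary LPeg patterns, they can be combined with the usual LPeg operators. A small sketch, assuming the module loads as `lpeg_patterns.core` (an assumption, not stated here); the `identifier` rule is made up for illustration:

```lua
local lpeg = require "lpeg"
local core = require "lpeg_patterns.core" -- assumed module path

-- Hypothetical rule: a letter followed by letters or digits, then end of input.
local identifier = core.ALPHA * (core.ALPHA + core.DIGIT)^0 * lpeg.P(-1)

print(identifier:match("abc123")) -- position just past the match
print(identifier:match("1abc"))   -- nil, must start with a letter
```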


### `IPv4`

  - `IPv4address` (pattern): parses an IPv4 address in dotted decimal notation. On success, returns the address as an IPv4 object
  - `IPv4_methods` (table):
      - `unpack` (function): returns the IPv4 address as a series of four 8-bit numbers
      - `binary` (function): returns the IPv4 address as a 4-byte binary string
  - `IPv4_mt` (table): metatable given to IPv4 objects
      - `__index` (table): `IPv4_methods`
      - `__tostring` (function): returns the IPv4 address in dotted decimal notation

IPv4 "dotted decimal notation" in this document refers to "strict" form (see [RFC-6943 section 3.1.1](https://tools.ietf.org/html/rfc6943#section-3.1.1)) unless otherwise noted.
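
A short sketch of working with an IPv4 object, assuming the module loads as `lpeg_patterns.IPv4` (an assumption). The `is_loopback` method is a hypothetical addition that shows how `IPv4_methods` can be extended; it is not part of the library:

```lua
local IPv4 = require "lpeg_patterns.IPv4" -- assumed module path

local addr = IPv4.IPv4address:match("192.0.2.1")
local a, b, c, d = addr:unpack() -- four 8-bit numbers
local packed = addr:binary()     -- 4-byte binary string
print(a, b, c, d, #packed, tostring(addr))

-- IPv4_methods is the __index table of IPv4 objects, so new methods
-- become available on every IPv4 object. Hypothetical helper:
function IPv4.IPv4_methods:is_loopback()
	return (self:unpack()) == 127
end
print(addr:is_loopback())
```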


### `IPv6`

  - `IPv6address` (pattern): parses an IPv6 address
  - `IPv6addrz` (pattern): parses an IPv6 address with optional "ZoneID" (see [RFC-6874](https://tools.ietf.org/html/rfc6874))
  - `IPv6_methods` (table): methods available on IPv6 objects
      - `unpack` (function): returns the IPv6 address as a series of eight 16-bit numbers, optionally followed by the zoneid
      - `binary` (function): returns the IPv6 address as a 16-byte binary string
      - `setzoneid` (function): set the zoneid of this IPv6 address
  - `IPv6_mt` (table): metatable given to IPv6 objects
      - `__tostring` (function): will return the IPv6 address as a valid IPv6 string
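
A sketch of the IPv6 object methods, assuming the module loads as `lpeg_patterns.IPv6` and that `IPv6address` captures an IPv6 object (both assumptions). The zone name `eth0` is an arbitrary example:

```lua
local IPv6 = require "lpeg_patterns.IPv6" -- assumed module path

local addr = IPv6.IPv6address:match("2001:db8::1")
local groups = { addr:unpack() } -- eight 16-bit numbers
local packed = addr:binary()     -- 16-byte binary string
print(#groups, #packed)

addr:setzoneid("eth0")
local zone = select(9, addr:unpack()) -- zoneid follows the eight groups
print(tostring(addr), zone)
```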


### `uri`

Parses URIs as described in [RFC-3986](https://tools.ietf.org/html/rfc3986).

  - `uri` (pattern): on success, returns a table with fields: (similar to [luasocket](http://w3.impa.br/~diego/software/luasocket/url.html#parse))
      - `scheme`
      - `userinfo`
      - `host`
      - `port`
      - `path`
      - `query`
      - `fragment`
  - `absolute_uri` (pattern): similar to `uri`, but does not permit fragments
  - `uri_reference` (pattern): similar to `uri`, but permits relative URIs
  - `relative_part` (pattern): matches a relative URI, not including query and fragment; data is held in named group captures `"userinfo"`, `"host"`, `"port"`, `"path"`
  - `scheme` (pattern): matches the scheme portion of a URI
  - `userinfo` (pattern): matches the userinfo portion of a URI
  - `host` (pattern): matches the host portion of a URI
  - `IP_literal` (pattern): matches an IP based host portion of a URI. Capture is an [IPv4](#IPv4), [IPv6](#IPv6) or IPvFuture object
  - `port` (pattern): matches the port portion of a URI
  - `authority` (pattern): matches the authority portion of a URI; data is held in named group captures of `"userinfo"`, `"host"`, `"port"`
  - `path` (pattern): matches the path portion of a URI. Captures `nil` for the empty path.
  - `segment` (pattern): matches a path segment (a piece of a path without a `/`)
  - `query` (pattern): matches the query portion of a URI
  - `fragment` (pattern): matches the fragment portion of a URI
  - `sane_uri` (pattern): a variant of `uri` that shouldn't match things that people would not normally consider URIs,
    e.g. URIs without a hostname
  - `sane_host` (pattern): a variant of `host` that shouldn't match things that people would not normally consider valid hosts.
  - `sane_authority` (pattern): a variant of `authority` that shouldn't match things that people would not normally consider valid authorities.
  - `pct_encoded` (pattern): matches a percent encoded octet, produces a capture of the normalised form.
  - `sub_delims` (pattern): the set of subcomponent delimiters
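
A minimal sketch of parsing a URI into its fields, assuming the module loads as `lpeg_patterns.uri` (an assumption). Fields are printed through `tostring` because some of them may not be plain strings, for example a host that is an IP object:

```lua
local uri = require "lpeg_patterns.uri" -- assumed module path

local parts = uri.uri:match("http://user@example.com:8080/a/b?q=1#top")
for _, field in ipairs{"scheme", "userinfo", "host", "port", "path", "query", "fragment"} do
	print(field, tostring(parts[field]))
end
```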


### `email`

  - `mailbox` (pattern): the mailbox format: matches either `name_addr` or an addr-spec.
  - `name_addr` (pattern): the name and address format i.e. `Display Name<email@example.com>`
    Has captures of the local_part and the domain. Captures the display name in the named capture `"display"`
  - `email` (pattern): also known as an "addr-spec"; follows [RFC-5322 section 3.4.1](http://tools.ietf.org/html/rfc5322#section-3.4.1)
    Has captures of the local_part and the domain
    Be careful trying to reconstruct the email address from the captures; you may need escaping
  - `local_part` (pattern): the bit before the `@` in an email address
  - `domain` (pattern): the bit after the `@` in an email address
  - `email_nocfws` (pattern): a variant that doesn't allow for comments or folding whitespace
  - `local_part_nocfws` (pattern): the bit before the `@` in an email address; no comments or folding whitespace allowed.
  - `domain_nocfws` (pattern): the bit after the `@` in an email address; no comments or folding whitespace allowed.
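
A sketch of validating an address and reading the two captures, assuming the module loads as `lpeg_patterns.email` (an assumption). The pattern is anchored with `lpeg.P(-1)` so that trailing text is rejected:

```lua
local lpeg = require "lpeg"
local email = require "lpeg_patterns.email" -- assumed module path

-- Anchor the pattern so the whole input must be an addr-spec.
local strict_email = email.email * lpeg.P(-1)

local local_part, domain = strict_email:match("user@example.com")
print(local_part, domain)
print(strict_email:match("not an email")) -- nil
```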


### `http`

The APIs of these patterns should be considered unstable.

#### [RFC 4918](https://tools.ietf.org/html/rfc4918)

  - `DAV` (pattern)
  - `Depth` (pattern)
  - `Destination` (pattern)
  - `If` (pattern)
  - `Lock_Token` (pattern)
  - `Overwrite` (pattern)
  - `TimeOut` (pattern)


#### [RFC 5023](https://tools.ietf.org/html/rfc5023)

  - `SLUG` (pattern)


#### [RFC 5323](https://tools.ietf.org/html/rfc5323)

  - `DASL` (pattern)


#### [RFC 5789](https://tools.ietf.org/html/rfc5789)

  - `Accept_Patch` (pattern)


#### [RFC 5988](https://tools.ietf.org/html/rfc5988)

  - `Link` (pattern)


#### [RFC 6265](https://tools.ietf.org/html/rfc6265)

  - `Set_Cookie` (pattern)
  - `Cookie` (pattern)


#### [RFC 6266](https://tools.ietf.org/html/rfc6266)

  - `Content_Disposition` (pattern)


#### [RFC 6454](https://tools.ietf.org/html/rfc6454)

  - `Origin` (pattern)


#### [RFC 6455](https://tools.ietf.org/html/rfc6455)

  - `Sec_WebSocket_Accept` (pattern)
  - `Sec_WebSocket_Key` (pattern)
  - `Sec_WebSocket_Extensions` (pattern)
  - `Sec_WebSocket_Protocol_Client` (pattern)
  - `Sec_WebSocket_Protocol_Server` (pattern)
  - `Sec_WebSocket_Version_Client` (pattern)
  - `Sec_WebSocket_Version_Server` (pattern)


#### [RFC 6638](https://tools.ietf.org/html/rfc6638)

  - `Schedule_Reply` (pattern)
  - `Schedule_Tag` (pattern)
  - `If_Schedule_Tag_Match` (pattern)


#### [RFC 6797](https://tools.ietf.org/html/rfc6797)

  - `Strict_Transport_Security` (pattern)


#### [RFC 7034](https://tools.ietf.org/html/rfc7034)

  - `X_Frame_Options` (pattern)


#### [RFC 7089](https://tools.ietf.org/html/rfc7089)

  - `Accept_Datetime` (pattern)
  - `Memento_Datetime` (pattern)


#### [RFC 7230](https://tools.ietf.org/html/rfc7230)

  - `request_line` (pattern)
  - `field_name` (pattern)
  - `field_value` (pattern)
  - `header_field` (pattern)
  - `OWS` (pattern)
  - `RWS` (pattern)
  - `BWS` (pattern)
  - `token` (pattern)
  - `qdtext` (pattern)
  - `quoted_string` (pattern)
  - `comment` (pattern)
  - `Content_Length` (pattern)
  - `Transfer_Encoding` (pattern)
  - `chunk_ext` (pattern)
  - `TE` (pattern)
  - `Trailer` (pattern)
  - `request_target` (pattern)
  - `Host` (pattern)
  - `Via` (pattern): captures are a list of tables with fields `.protocol`, `.by` and `.comment`
  - `Connection` (pattern)
  - `Upgrade` (pattern): captures are a list of strings containing *protocol* or *protocol/version*


#### [RFC 7231](https://tools.ietf.org/html/rfc7231)

  - `IMF_fixdate` (pattern)
  - `Content_Encoding` (pattern)
  - `Content_Type` (pattern)
  - `Content_Language` (pattern)
  - `Content_Location` (pattern)
  - `Expect` (pattern)
  - `Max_Forwards` (pattern)
  - `Accept` (pattern)
  - `Accept_Charset` (pattern)
  - `Accept_Encoding` (pattern)
  - `Accept_Language` (pattern)
  - `From` (pattern)
  - `Referer` (pattern)
  - `User_Agent` (pattern)
  - `Date` (pattern): capture is a table in the same format as used by [`os.time`](http://www.lua.org/manual/5.3/manual.html#pdf-os.time)
  - `Location` (pattern)
  - `Retry_After` (pattern): capture is either a table describing an absolute time in the same format as used by [`os.time`](http://www.lua.org/manual/5.3/manual.html#pdf-os.time), or a relative time as a number of seconds
  - `Vary` (pattern)
  - `Allow` (pattern)
  - `Server` (pattern)
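
Date captures in the `os.time` table format can be handed straight to the standard library. The sketch below uses a hand-written table as a stand-in for a `Date` capture; which fields the pattern actually fills in is an assumption here:

```lua
-- Hypothetical stand-in for the capture of "Sun, 06 Nov 1994 08:49:37 GMT".
local captured = { year = 1994, month = 11, day = 6, hour = 8, min = 49, sec = 37 }

-- os.time interprets the table as local time, while HTTP dates are in GMT,
-- so the timestamp is off by the local UTC offset unless that is corrected.
local timestamp = os.time(captured)
print(os.date("%Y-%m-%d %H:%M:%S", timestamp))
```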


#### [RFC 7232](https://tools.ietf.org/html/rfc7232)

  - `Last_Modified` (pattern): capture is a table in the same format as used by [`os.time`](http://www.lua.org/manual/5.3/manual.html#pdf-os.time)
  - `ETag` (pattern)
  - `If_Match` (pattern)
  - `If_None_Match` (pattern)
  - `If_Modified_Since` (pattern): capture is a table in the same format as used by [`os.time`](http://www.lua.org/manual/5.3/manual.html#pdf-os.time)
  - `If_Unmodified_Since` (pattern): capture is a table in the same format as used by [`os.time`](http://www.lua.org/manual/5.3/manual.html#pdf-os.time)


#### [RFC 7233](https://tools.ietf.org/html/rfc7233)

  - `Accept_Ranges` (pattern)
  - `Range` (pattern)
  - `If_Range` (pattern): capture is either an `entity_tag` or a table in the same format as used by [`os.time`](http://www.lua.org/manual/5.3/manual.html#pdf-os.time)
  - `Content_Range` (pattern)


#### [RFC 7234](https://tools.ietf.org/html/rfc7234)

  - `Age` (pattern)
  - `Cache_Control` (pattern): captures are grouped into key/value pairs (where a directive with no value has a value of `true`)
  - `Expires` (pattern): capture is a table in the same format as used by [`os.time`](http://www.lua.org/manual/5.3/manual.html#pdf-os.time)
  - `Pragma` (pattern)
  - `Warning` (pattern)


#### [RFC 7235](https://tools.ietf.org/html/rfc7235)

  - `WWW_Authenticate` (pattern)
  - `Authorization` (pattern)
  - `Proxy_Authenticate` (pattern)
  - `Proxy_Authorization` (pattern)


#### [RFC 7239](https://tools.ietf.org/html/rfc7239)

  - `Forwarded` (pattern)


#### [RFC 7469](https://tools.ietf.org/html/rfc7469)

  - `Public_Key_Pins` (pattern)
  - `Public_Key_Pins_Report_Only` (pattern)


#### [RFC 7486](https://tools.ietf.org/html/rfc7486)

  - `Hobareg` (pattern)


#### [RFC 7615](https://tools.ietf.org/html/rfc7615)

  - `Authentication_Info` (pattern)
  - `Proxy_Authentication_Info` (pattern)


#### [RFC 7639](https://tools.ietf.org/html/rfc7639)

  - `ALPN` (pattern)


#### [RFC 7809](https://tools.ietf.org/html/rfc7809)

  - `CalDAV_Timezones` (pattern)


#### [RFC 7838](https://tools.ietf.org/html/rfc7838)

  - `Alt_Svc` (pattern)
  - `Alt_Used` (pattern)


#### [Expect-CT Extension for HTTP](https://tools.ietf.org/html/draft-ietf-httpbis-expect-ct-06)

  - `Expect_CT` (pattern)


#### [Referrer-Policy header](https://www.w3.org/TR/referrer-policy/#referrer-policy-header)

  - `Referrer_Policy` (pattern)


### `phone`

  - `phone` (pattern): includes detailed checking for:
      - USA phone numbers using the [NANP](https://en.wikipedia.org/wiki/North_American_Numbering_Plan)


### `language`

Patterns for definitions from [RFC-4646 Section 2.1](https://tools.ietf.org/html/rfc4646#section-2.1)

  - `langtag` (pattern): Capture is a table with the language tag decomposed into components:
      - `language`
      - `extlang` (optional)
      - `script` (optional)
      - `region` (optional)
      - `variant` (optional): an array
      - `extension` (optional): a dictionary from singleton to value
      - `privateuse` (optional): an array
  - `privateuse` (pattern): captures an array
  - `Language_Tag` (pattern): captures the whole language tag
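
A sketch of decomposing a language tag, assuming the module loads as `lpeg_patterns.language` (an assumption):

```lua
local language = require "lpeg_patterns.language" -- assumed module path

local tag = language.langtag:match("en-GB")
print(tag.language, tag.region) -- components present in the tag
print(tag.script)               -- nil, optional components may be absent
```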