Internet Draft                                               K. Murchison
Category: Standards Track                              Oceana Matrix Ltd.
Expires: May 23, 2004                                    18 November 2003


            Sieve Email Filtering -- Regular Expression Extension

                    <draft-murchison-sieve-regex-07.txt>

Status of this Memo

    This document is an Internet-Draft and is subject to all provisions
    of Section 10 of RFC2026.

    Internet-Drafts are working documents of the Internet Engineering
    Task Force (IETF), its areas, and its working groups.  Note that
    other groups may also distribute working documents as
    Internet-Drafts.

    Internet-Drafts are draft documents valid for a maximum of six months
    and may be updated, replaced, or obsoleted by other documents at any
    time.  It is inappropriate to use Internet-Drafts as reference
    material or to cite them other than as "work in progress."

    The list of current Internet-Drafts can be accessed at
    http://www.ietf.org/1id-abstracts.html

    The list of Internet-Draft Shadow Directories can be accessed at
    http://www.ietf.org/shadow.html

Copyright Notice

    Copyright (C) The Internet Society (2003). All Rights Reserved.


Abstract

    In some cases, it is desirable to have a string matching mechanism
    which is more powerful than a simple exact match, a substring match
    or a glob-style wildcard match.  The regular expression matching
    mechanism defined in this draft should allow users to isolate just
    about any string or address in a message header or envelope.









Expires: May 23, 2004         Murchison                         [Page 1]

Internet Draft          Sieve -- Regex Extension       November 18, 2003


                           Table of Contents



0.     Meta-information on this draft  . . . . . . . . . . . . . . .   3

0.1.   Discussion  . . . . . . . . . . . . . . . . . . . . . . . . .   3

0.2.   Noted Changes Since -06 . . . . . . . . . . . . . . . . . . .   3

0.3.   Open Issues . . . . . . . . . . . . . . . . . . . . . . . . .   3

1.     Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   4

2.     Capability Identifier . . . . . . . . . . . . . . . . . . . .   4

3.     Regex Match Type  . . . . . . . . . . . . . . . . . . . . . .   4

4.     Security Considerations . . . . . . . . . . . . . . . . . . .   6

5.     IANA Considerations . . . . . . . . . . . . . . . . . . . . .   6

6.     Normative References  . . . . . . . . . . . . . . . . . . . .   7

7.     Acknowledgments . . . . . . . . . . . . . . . . . . . . . . .   7

8.     Intellectual Property Statement . . . . . . . . . . . . . . .   7

9.     Author's Address  . . . . . . . . . . . . . . . . . . . . . .   8

10.    Full Copyright Statement  . . . . . . . . . . . . . . . . . .   8


0.     Meta-information on this draft

    This information is intended to facilitate discussion.  It will be
    removed when this document leaves the Internet-Draft stage.


0.1.   Discussion

    This draft is intended to be an extension to the Sieve mail filtering
    language, available from the RFC repository as
    <ftp://ftp.ietf.org/rfc/rfc3028.txt>.

    This draft and the Sieve language itself are being discussed on the
    MTA Filters mailing list at <ietf-mta-filters@imc.org>.  Subscription
    requests can be sent to <ietf-mta-filters-request@imc.org> (send an
    email message with the word "subscribe" in the body).  More
    information on the mailing list along with a WWW archive of back
    messages is available at <http://www.imc.org/ietf-mta-filters/>.


0.2.   Noted Changes Since -06

    Added more open issues.

    Added IANA considerations.

    Editorial changes.


0.3.   Open Issues

    The major open issue with this draft is what to do, if anything,
    about localization/internationalization.  Are [POSIX.2] collating
    sequences and character equivalents sufficient?  Should we reference
    the Unicode technical specification?  Should we punt and publish the
    document as experimental?

    Should we allow shorthands such as \b (word boundary) and \w (word
    character)?

    Should we allow backreferences (useful for matching double words,
    etc.)?

    Should we integrate with variables, so that $1, $2, ...  correspond
    to the first, second, ... groups within the regex?


1.  Introduction

    This is an extension to the Sieve language defined by [SIEVE] for
    comparing strings to regular expressions.

    Conventions for notations are as in [SIEVE] section 1.1, including
    use of [KEYWORDS].


2.  Capability Identifier

    The capability string associated with the extension defined in this
    document is "regex".


3.  Regex Match Type

    Commands that support matching may take the optional tagged argument
    ":regex" to specify that a regular expression match should be
    performed.  The ":regex" match type is subject to the same rules and
    restrictions as the standard match types defined in [SIEVE].  For
    convenience, the "MATCH-TYPE" syntax element defined in [SIEVE] is
    augmented here as follows:

         MATCH-TYPE  =/  ":regex"

    Example:

          require "regex";

          # Try to catch unsolicited email.
          if anyof (
            # if a message is not to me (with optional +detail),
            not address :regex ["to", "cc", "bcc"]
              "me(\\+.*)?@company\\.com",

            # or the subject is all uppercase (no lowercase)
            header :regex :comparator "i;octet" "subject"
              "^[^[:lower:]]+$" ) {

            discard;     # junk it
          }

    The ":regex" match type may be used with both the "i;octet" and
    "i;ascii-casemap" comparators.

    Implementations MUST support extended regular expressions (EREs) as
    defined by [POSIX.2].  Regular expressions not defined by
    [POSIX.2], as well as [POSIX.2] basic regular expressions, word
    boundaries and backreferences, are not supported by this extension.
    Implementations SHOULD reject regular expressions that are
    unsupported by this specification as a syntax error.
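
    As a non-normative illustration of the rule above, the two tests
    shown in comments below use constructs outside [POSIX.2] EREs, so
    an implementation following the SHOULD would be expected to reject
    a script containing them as a syntax error.  The final test is one
    possible ERE approximation of a word boundary.  The words "sale"
    and "very" are placeholders chosen for this sketch.

          require "regex";

          # Not supported: "\\b" (word boundary) is not defined by
          # [POSIX.2] EREs.
          #   header :regex "subject" "\\bsale\\b"

          # Not supported: "\\1" is a backreference.
          #   header :regex "subject" "(very) \\1"

          # Supported: approximate the word boundary with anchors,
          # alternation and a negated character class.
          if header :regex "subject"
               "(^|[^[:alnum:]])sale([^[:alnum:]]|$)" {
            discard;
          }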

    The following table provides a brief summary of the regular
    expressions that MUST be supported.  This table is presented here
    only as a guideline.  [POSIX.2] should be used as the definitive
    reference.


    +------------+-----------------------------------------------------+
    | Expression |  Pattern                                            |
    +------------+-----------------------------------------------------+
    |                Items to match a single character                 |
    +------------+-----------------------------------------------------+
    |     .      |  Match any single character except newline.         |
    |    [ ]     |  Bracket expression.  Match any one of the enclosed |
    |            |  characters.  A hyphen (-) indicates a range of     |
    |            |  consecutive characters.                            |
    |   [^  ]    |  Negated bracket expression.  Match any one         |
    |            |  character NOT in the enclosed list.  A hyphen (-)  |
    |            |  indicates a range of consecutive characters.       |
    |    \\      |  Escape the following special character (match      |
    |            |  the literal character).  Undefined for other       |
    |            |  characters.                                        |
    |            |  NOTE: Unlike [POSIX.2], a double-backslash is      |
    |            |  required as per section 2.4.2 of [SIEVE].          |
    +------------+-----------------------------------------------------+
    |   Items to be used within a bracket expression (localization)    |
    +------------+-----------------------------------------------------+
    |   [: :]    |  Character class (alnum, alpha, blank, cntrl,       |
    |            |  digit, graph, lower, print, punct, space,          |
    |            |  upper, xdigit).                                    |
    |   [= =]    |  Character equivalents.                             |
    |   [. .]    |  Collating sequence.                                |
    +------------+-----------------------------------------------------+
    |  Quantifiers - Items to count the preceding regular expression   |
    +------------+-----------------------------------------------------+
    |     ?      |  Match zero or one instances.                       |
    |     *      |  Match zero or more instances.                      |
    |     +      |  Match one or more instances.                       |
    |   {n,m}    |  Match any number of instances between              |
    |            |  n and m (inclusive).  {n} matches exactly n        |
    |            |  instances.  {n,} matches n or more instances.      |
    +------------+-----------------------------------------------------+
    |              Anchoring - Items to match positions                |
    +------------+-----------------------------------------------------+
    |     ^      |  Match the beginning of the line or string.         |
    |     $      |  Match the end of the line or string.               |
    +------------+-----------------------------------------------------+
    |                        Other constructs                          |
    +------------+-----------------------------------------------------+
    |     |      |  Alternation.  Match either of the separated        |
    |            |  regular expressions.                               |
    |    ( )     |  Group the enclosed regular expression(s).          |
    +------------+-----------------------------------------------------+
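
    The following sketch combines several of the constructs summarized
    above.  It is illustrative only: it assumes that the [SIEVE]
    "fileinto" action is available, and the folder names, identifier
    format and domain names are hypothetical.

          require ["fileinto", "regex"];

          # Bracket expression, interval and character class: a subject
          # starting with an identifier such as "ABC-1234".  "i;octet"
          # is requested so that case is significant for [A-Z].
          if header :regex :comparator "i;octet" "subject"
               "^[A-Z]{3}-[[:digit:]]{4}" {
            fileinto "INBOX.tickets";
          }

          # Grouping, alternation and the end anchor.  Each "\\." in
          # the Sieve string reaches the regular expression as "\.",
          # which matches a literal dot.
          elsif address :regex "from"
               "@(lists|announce)\\.example\\.(com|org)$" {
            fileinto "INBOX.lists";
          }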


4.  Security Considerations

    Security considerations are discussed in [SIEVE].  It is believed
    that this extension doesn't introduce any additional security
    concerns.

    However, a poor implementation could introduce security problems
    ranging from degradation of performance to denial of service.  If an
    implementation uses a third-party regular expression library, that
    library should be checked for potentially problematic regular
    expressions, such as "(.*)*".
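
    As a sketch of why such patterns deserve attention (illustrative,
    not part of this specification), the nested quantifier in "(.*)*"
    adds no matching power.  The two tests below accept the same
    subjects, yet a backtracking library may take far longer on the
    first when the subject does not match.  The word "offer" is a
    placeholder.

          require "regex";

          # Potentially problematic: nested quantifiers.
          #   header :regex "subject" "(.*)*offer"

          # Same matches without the nested quantifier.
          if header :regex "subject" ".*offer" {
            discard;
          }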


5.  IANA Considerations

    The following template specifies the IANA registration of the Sieve
    extension specified in this document:

    To: iana@iana.org
    Subject: Registration of new Sieve extension

    Capability name: regex
    Capability keyword: regex
    Capability arguments: N/A
    Standards Track/IESG-approved experimental RFC number: this RFC
    Person and email address to contact for further information:

    Kenneth Murchison
    ken@oceana.com

    This information should be added to the list of sieve extensions
    given on http://www.iana.org/assignments/sieve-extensions.


6.  Normative References

     [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate
         Requirement Levels", RFC 2119, March 1997.


     [SIEVE] Showalter, T., "Sieve: A Mail Filtering Language",
         RFC 3028, January 2001.


     [POSIX.2] "Portable Operating System Interface (POSIX). Part 2,
         Shell and utilities", National Institute of Standards and
         Technology (U.S.).



7.  Acknowledgments

    Thanks to Tim Showalter, Alexey Melnikov, Tony Hansen, Phil Pennock,
    Jutta Degener and Ned Freed for their help with this document.


8.  Intellectual Property Statement

    The IETF takes no position regarding the validity or scope of any
    intellectual property or other rights that might be claimed to
    pertain to the implementation or use of the technology described in
    this document or the extent to which any license under such rights
    might or might not be available; neither does it represent that it
    has made any effort to identify any such rights.  Information on the
    IETF's procedures with respect to rights in standards-track and
    standards-related documentation can be found in BCP-11.  Copies of
    claims of rights made available for publication and any assurances
    of licenses to be made available, or the result of an attempt made
    to obtain a general license or permission for the use of such
    proprietary rights by implementors or users of this specification
    can be obtained from the IETF Secretariat.

    The IETF invites any interested party to bring to its attention any
    copyrights, patents or patent applications, or other proprietary
    rights which may cover technology that may be required to practice
    this standard.  Please address the information to the IETF Executive
    Director.


9.  Author's Address

    Kenneth Murchison
    Oceana Matrix Ltd.
    21 Princeton Place
    Orchard Park, NY  14127

    Phone: (716) 662-8973

    EMail: ken@oceana.com


10. Full Copyright Statement

    Copyright (C) The Internet Society (2003). All Rights Reserved.

    This document and translations of it may be copied and furnished to
    others, and derivative works that comment on or otherwise explain it
    or assist in its implementation may be prepared, copied, published and
    distributed, in whole or in part, without restriction of any kind,
    provided that the above copyright notice and this paragraph are
    included on all such copies and derivative works.  However, this
    document itself may not be modified in any way, such as by removing
    the copyright notice or references to the Internet Society or other
    Internet organizations, except as needed for the purpose of
    developing Internet standards in which case the procedures for
    copyrights defined in the Internet Standards process must be followed,
    or as required to translate it into languages other than English.

    The limited permissions granted above are perpetual and will not be
    revoked by the Internet Society or its successors or assigns.

    This document and the information contained herein is provided on an
    "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET
    ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED,
    INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
    INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
    WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.