<pre>Network Working Group J. Degener
Request for Comments: 5173 P. Guenther
Updates: <a href="./rfc5229">5229</a> Sendmail, Inc.
Category: Standards Track April 2008
<span class="h1">Sieve Email Filtering: Body Extension</span>
Status of This Memo
This document specifies an Internet standards track protocol for the
Internet community, and requests discussion and suggestions for
improvements. Please refer to the current edition of the "Internet
Official Protocol Standards" (STD 1) for the standardization state
and status of this protocol. Distribution of this memo is unlimited.
Abstract
This document defines a new command for the "Sieve" email filtering
language that tests for the occurrence of one or more strings in the
body of an email message.
<span class="grey">Degener & Guenther Standards Track [Page 1]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-2" ></span>
<span class="grey"><a href="./rfc5173">RFC 5173</a> Sieve Email Filtering: Body Extension April 2008</span>
<span class="h2"><a class="selflink" id="section-1" href="#section-1">1</a>. Introduction</span>
The "body" test checks for the occurrence of one or more strings in
the body of an email message. Such a test was initially discussed
for the [<a href="#ref-SIEVE" title='"Sieve: An Email Filtering Language"'>SIEVE</a>] base document, but was subsequently removed because
it was thought to be too costly to implement.
Nevertheless, several server vendors have implemented some form of
the "body" test.
This document reintroduces the "body" test as an extension, and
specifies its syntax and semantics.
<span class="h2"><a class="selflink" id="section-2" href="#section-2">2</a>. Conventions Used in This Document</span>
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [<a href="#ref-KEYWORDS" title='"Key words for use in RFCs to Indicate Requirement Levels"'>KEYWORDS</a>].
Conventions for notations are as in [<a href="#ref-SIEVE" title='"Sieve: An Email Filtering Language"'>SIEVE</a>] <a href="#section-1.1">Section 1.1</a>, including
the use of the "Usage:" label for the definition of text and tagged
argument syntax.
The rules for interpreting the grammar are defined in [<a href="#ref-SIEVE" title='"Sieve: An Email Filtering Language"'>SIEVE</a>] and
inherited by this specification. In particular, readers of this
document are reminded that according to [<a href="#ref-SIEVE" title='"Sieve: An Email Filtering Language"'>SIEVE</a>] Sections <a href="#section-2.6.2">2.6.2</a> and
2.6.3, optional arguments such as COMPARATOR and MATCH-TYPE can
appear in any order.
<span class="h2"><a class="selflink" id="section-3" href="#section-3">3</a>. Capability Identifier</span>
The capability string associated with the extension defined in this
document is "body".
<span class="h2"><a class="selflink" id="section-4" href="#section-4">4</a>. Test body</span>
Usage: "body" [COMPARATOR] [MATCH-TYPE] [BODY-TRANSFORM]
<key-list: string-list>
The body test matches content in the body of an email message, that
is, anything following the first empty line after the header. (The
empty line itself, if present, is not considered to be part of the
body.)
The COMPARATOR and MATCH-TYPE keyword parameters are defined in
[<a href="#ref-SIEVE" title='"Sieve: An Email Filtering Language"'>SIEVE</a>]. As specified in Sections <a href="#section-2.7.1">2.7.1</a> and <a href="#section-2.7.3">2.7.3</a> of [<a href="#ref-SIEVE" title='"Sieve: An Email Filtering Language"'>SIEVE</a>], the
default COMPARATOR is "i;ascii-casemap" and the default MATCH-TYPE is
":is".
</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-3" ></span>
The BODY-TRANSFORM is a keyword parameter that governs how a set of
strings to be matched against are extracted from the body of the
message. If a message consists of a header only, not followed by an
empty line, then that set is empty and all "body" tests return false,
including those that test for an empty string. (This is similar to
how the "header" test always fails when the named header fields
aren't present.) Otherwise, the transform must be followed as
defined below in <a href="#section-5">Section 5</a>.
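
The header-only rule above can be illustrated with a minimal Python sketch. It is not part of this specification: the names split_body and body_contains are hypothetical, and the sketch assumes CRLF line endings, an undecoded body treated as one string, and the default "i;ascii-casemap" comparator.

```python
from typing import Optional


def split_body(raw_message: bytes) -> Optional[bytes]:
    """Return the body of a raw message, or None if it has no body.

    The body is everything after the first empty line; the empty line
    itself is not part of the body.
    """
    # A message that starts with an empty line has an empty header.
    if raw_message.startswith(b"\r\n"):
        return raw_message[2:]
    _header, separator, body = raw_message.partition(b"\r\n\r\n")
    if not separator:
        return None  # header only: there is nothing to match against
    return body


def body_contains(raw_message: bytes, key: bytes) -> bool:
    """Substring test over the undecoded body (hypothetical helper)."""
    body = split_body(raw_message)
    if body is None:
        return False  # even an empty key fails on a header-only message
    # bytes.lower() folds ASCII letters only, like "i;ascii-casemap".
    return key.lower() in body.lower()
```

With this sketch, an empty key matches a message whose body is empty but present, and fails on a message that ends after its header.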
Note that the transformations defined here do *not* match against
each line of the message independently, so the strings will usually
contain CRLFs. How these can be matched is governed by the
comparator and match-type. For example, with the default comparator
of "i;ascii-casemap", they can be included literally in the key
strings, or be matched with the "*" or "?" wildcards of the :matches
match-type, or be skipped with :contains.
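
As an illustration only, the following Python sketch shows how the three match types behave on a string that contains a CRLF. The helper names (ascii_casemap, sieve_is, sieve_contains, sieve_matches) are hypothetical, the patterns are assumed to be already unquoted Sieve strings, and the sketch works on characters rather than octets for brevity.

```python
import re


def ascii_casemap(text: str) -> str:
    """Fold a-z to A-Z only, as "i;ascii-casemap" does."""
    return "".join(chr(ord(c) - 32) if "a" <= c <= "z" else c for c in text)


def sieve_is(text: str, key: str) -> bool:
    return ascii_casemap(text) == ascii_casemap(key)


def sieve_contains(text: str, key: str) -> bool:
    return ascii_casemap(key) in ascii_casemap(text)


def sieve_matches(text: str, pattern: str) -> bool:
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # A backslash quotes the next character ("\*" is a literal "*").
            parts.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "*":
            parts.append(".*")  # zero or more characters, CR and LF included
        elif ch == "?":
            parts.append(".")   # exactly one character, CR or LF included
        else:
            parts.append(re.escape(ch))
        i += 1
    # DOTALL lets "." cross line breaks; ASCII limits case folding to ASCII.
    regex = re.compile("".join(parts), re.DOTALL | re.IGNORECASE | re.ASCII)
    return regex.fullmatch(text) is not None


body = "Dear customer,\r\nyour order has shipped.\r\n"
print(sieve_contains(body, "customer,\r\nyour"))   # CRLF written literally
print(sieve_matches(body, "*customer*order*"))     # "*" spans the CRLF
print(sieve_matches(body, "dear customer,??your*"))  # "??" covers CR and LF
print(sieve_is(body, "Dear customer,"))            # whole string, so no match
```

Because the body is handed to the comparator as one string, an ":is" test only succeeds when the key equals the entire extracted string, line breaks included.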
<span class="h2"><a class="selflink" id="section-5" href="#section-5">5</a>. Body Transform</span>
Prior to matching content in a message body, "transformations" can be
applied that filter and decode certain parts of the body. These
transformations are selected by a "BODY-TRANSFORM" keyword parameter.
Usage: ":raw"
/ ":content" <content-types: string-list>
/ ":text"
The default transformation is :text.
<span class="h3"><a class="selflink" id="section-5.1" href="#section-5.1">5.1</a>. Body Transform ":raw"</span>
The ":raw" transform matches against the entire undecoded body of a
message as a single item.
If the specified body-transform is ":raw", the [<a href="#ref-MIME" title='"Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies"'>MIME</a>] structure of
the body is irrelevant. The implementation MUST NOT remove any
transfer encoding from the message, MUST NOT refuse to filter
messages with syntactic errors (unless the environment it is part of
rejects them outright), and MUST treat multipart boundaries or the
MIME headers of enclosed body parts as part of the content being
matched against, instead of MIME structures to interpret.
</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-4" ></span>
Example:
require "body";

# This will match a message containing the literal text
# "MAKE MONEY FAST" in body parts (ignoring any
# content-transfer-encodings) or MIME headers other than
# the outermost <a href="./rfc2822">RFC 2822</a> header.

if body :raw :contains "MAKE MONEY FAST" {
    discard;
}
<span class="h3"><a class="selflink" id="section-5.2" href="#section-5.2">5.2</a>. Body Transform ":content"</span>
If the body transform is ":content", the MIME parts that have the
specified content types are matched against independently.
If an individual content type begins or ends with a '/' (slash) or
contains multiple slashes, then it matches no content types.
Otherwise, if it contains a slash, then it specifies a full
<type>/<subtype> pair, and matches only that specific content type.
If it is the empty string, all MIME content types are matched.
Otherwise, it specifies a <type> only, and any subtype of that type
matches it.
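
The content-type rules above can be sketched in Python as follows. This is an illustrative reading of the rules, not normative text; the function name content_type_matches is hypothetical, and the case-insensitive comparison is an assumption based on MIME media types being case-insensitive.

```python
def content_type_matches(spec: str, content_type: str) -> bool:
    """Return True if a ":content" string selects the given MIME type.

    `content_type` is expected as "<type>/<subtype>" without parameters.
    """
    spec = spec.lower()
    content_type = content_type.lower()
    # A leading or trailing slash, or more than one slash, matches nothing.
    if spec.startswith("/") or spec.endswith("/") or spec.count("/") > 1:
        return False
    # A single inner slash means a full <type>/<subtype> pair.
    if "/" in spec:
        return spec == content_type
    # The empty string matches every content type.
    if spec == "":
        return True
    # Otherwise the string names a <type>; any subtype matches.
    return content_type.split("/", 1)[0] == spec
```

For instance, "text" selects both text/plain and text/html, while "text/" and "/plain" select nothing.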
The search for MIME parts matching the :content specification is
recursive and automatically descends into multipart and
message/rfc822 MIME parts. All MIME parts with matching types are
searched for the key strings. The test returns true if any
combination of a searched MIME part and key-list argument match.
If the :content specification matches a multipart MIME part, only the
prologue and epilogue sections of the part will be searched for the
key strings, treating the entire prologue and the entire epilogue as
separate strings; the contents of nested parts are only searched if
their respective types match the :content specification.
If the :content specification matches a message/rfc822 MIME part,
only the header of the nested message will be searched for the key
strings, treating the header as a single string; the contents of the
nested message body parts are only searched if their content type
matches the :content specification.
For other MIME types, the entire part will be searched as a single
string.
</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-5" ></span>
(Matches against container types with an empty match string can be
useful as tests for the existence of such parts.)
Example:
  From: Whomever
  To: Someone
  Date: Whenever
  Subject: whatever
  Content-Type: multipart/mixed; boundary=outer

& This is a multi-part message in MIME format.
&
  --outer
  Content-Type: multipart/alternative; boundary=inner

& This is a nested multi-part message in MIME format.
&
  --inner
  Content-Type: text/plain; charset="us-ascii"

$ Hello
$
  --inner
  Content-Type: text/html; charset="us-ascii"

% <html><body>Hello</body></html>
%
  --inner--
&
& This is the end of the inner MIME multipart.
&
  --outer
  Content-Type: message/rfc822

! From: Someone Else
! Subject: hello request

$ Please say Hello
$
  --outer--
&
& This is the end of the outer MIME multipart.
</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-6" ></span>
In the above example, the '&', '$', '%', and '!' characters at the
start of a line are used to illustrate what portions of the example
message are used in tests:
- the lines starting with '&' are the ones that are tested when a
'body :content "multipart" :contains "MIME"' test is executed.
- the lines starting with '$' are the ones that are tested when a
'body :content "text/plain" :contains "Hello"' test is executed.
- the lines starting with '%' are the ones that are tested when a
'body :content "text/html" :contains "Hello"' test is executed.
- the lines starting with '$' or '%' are the ones that are tested
when a 'body :content "text" :contains "Hello"' test is executed.
- the lines starting with '!' are the ones that are tested when a
'body :content "message/rfc822" :contains "Hello"' test is
executed.
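
The selection of strings in the example can be reproduced with a short Python sketch built on the standard library "email" package. It is one possible reading of the rules in this section, not a reference implementation; type_matches, leaf_text, and searchable_strings are hypothetical names, and rebuilding the nested header as "Name: value" lines is a simplification.

```python
import email
from email.message import Message
from typing import Iterator, List

# The example message, without the marker characters.
EXAMPLE = """\
From: Whomever
To: Someone
Date: Whenever
Subject: whatever
Content-Type: multipart/mixed; boundary=outer

This is a multi-part message in MIME format.

--outer
Content-Type: multipart/alternative; boundary=inner

This is a nested multi-part message in MIME format.

--inner
Content-Type: text/plain; charset="us-ascii"

Hello

--inner
Content-Type: text/html; charset="us-ascii"

<html><body>Hello</body></html>

--inner--

This is the end of the inner MIME multipart.

--outer
Content-Type: message/rfc822

From: Someone Else
Subject: hello request

Please say Hello

--outer--

This is the end of the outer MIME multipart.
"""


def type_matches(spec: str, ctype: str) -> bool:
    """Content-type rule of this section (case-insensitive by assumption)."""
    spec, ctype = spec.lower(), ctype.lower()
    if spec.startswith("/") or spec.endswith("/") or spec.count("/") > 1:
        return False
    if "/" in spec:
        return spec == ctype
    return spec == "" or ctype.split("/", 1)[0] == spec


def leaf_text(part: Message) -> str:
    """Decode a non-container part; undecodable octets are replaced."""
    data = part.get_payload(decode=True) or b""
    try:
        return data.decode(part.get_content_charset() or "us-ascii", errors="replace")
    except LookupError:  # unknown charset: fall back to US-ASCII
        return data.decode("ascii", errors="replace")


def searchable_strings(part: Message, specs: List[str]) -> Iterator[str]:
    """Yield each string that a 'body :content specs' test would search."""
    ctype = part.get_content_type()
    selected = any(type_matches(spec, ctype) for spec in specs)
    if part.get_content_maintype() == "multipart":
        if selected:  # only the prologue and epilogue, as separate strings
            yield part.preamble or ""
            yield part.epilogue or ""
        for child in (part.get_payload() if part.is_multipart() else []):
            yield from searchable_strings(child, specs)
    elif ctype == "message/rfc822" and part.is_multipart():
        nested = part.get_payload(0)
        if selected:  # only the nested header, as a single string
            yield "".join(f"{name}: {value}\r\n" for name, value in nested.items())
        yield from searchable_strings(nested, specs)
    elif selected:
        yield leaf_text(part)


message = email.message_from_string(EXAMPLE)
for spec in ("multipart", "text/plain", "text/html", "text", "message/rfc822"):
    print(spec, len(list(searchable_strings(message, [spec]))))
```

In this sketch the nested message has no Content-Type field, so the parser reports its default type of text/plain, which is why its body is selected together with the '$' lines of the text/plain part.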
Comparisons are performed on octets. Implementations decode the
content-transfer-encoding and convert text to [<a href="#ref-UTF-8" title='"UTF-8, a transformation format of ISO 10646"'>UTF-8</a>] as input to the
comparator. MIME parts that cannot be decoded and converted MAY be
treated as plain US-ASCII, omitted, or processed according to local
conventions. A NUL octet (character zero) SHOULD NOT cause early
termination of the content being compared against. Implementations
MUST support the "quoted-printable", "base64", "7bit", "8bit", and
"binary" content transfer encodings. Implementations MUST be capable
of converting to UTF-8 the US-ASCII, ISO-8859-1, and the US-ASCII
subset of ISO-8859-* character sets.
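
A minimal Python sketch of the decoding step described above is shown here for illustration. The function name decode_part is hypothetical, and the fallback (treating undecodable content as US-ASCII with replacement characters) is only one of the choices the text permits.

```python
import base64
import binascii
import quopri


def decode_part(payload: bytes, transfer_encoding: str, charset: str) -> bytes:
    """Return the part content as UTF-8 octets for the comparator."""
    encoding = transfer_encoding.strip().lower()
    try:
        if encoding == "base64":
            raw = base64.b64decode(payload)
        elif encoding == "quoted-printable":
            raw = quopri.decodestring(payload)
        else:
            # "7bit", "8bit", and "binary" need no decoding; an unknown
            # encoding is left undecoded (a local convention in this sketch).
            raw = payload
    except binascii.Error:
        raw = payload  # malformed base64: keep the octets as received
    try:
        text = raw.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown charset or invalid octets: treat as plain US-ASCII.
        text = raw.decode("ascii", errors="replace")
    # Python byte strings are length-counted, so a NUL octet does not
    # terminate the content early.
    return text.encode("utf-8")
```

For example, decode_part(b"caf=E9", "quoted-printable", "iso-8859-1") yields the UTF-8 octets of "café", and a payload containing a NUL octet is returned in full.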
Each matched part is matched against independently: search
expressions MUST NOT match across MIME part boundaries. MIME headers
of the containing part MUST NOT be included in the data.
</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-7" ></span>
Example:
require ["body", "fileinto"];

# Save any message with any text MIME part that contains the
# words "missile" or "coordinates" in the "secrets" folder.

if body :content "text" :contains ["missile", "coordinates"] {
    fileinto "secrets";
}

# Save any message with an audio/mp3 MIME part in
# the "jukebox" folder.

if body :content "audio/mp3" :contains "" {
    fileinto "jukebox";
}
<span class="h3"><a class="selflink" id="section-5.3" href="#section-5.3">5.3</a>. Body Transform ":text"</span>
The ":text" body transform matches against the results of an
implementation's best effort at extracting UTF-8 encoded text from a
message.
It is unspecified whether this transformation results in a single
string or multiple strings being matched against. All the text
extracted from a given non-container MIME part MUST be in the same
string.
In simple implementations, :text MAY be treated the same as :content
"text".
Sophisticated implementations MAY strip mark-up from the text prior
to matching, and MAY convert media types other than text to text
prior to matching.
(For example, they may be able to convert proprietary text editor
formats to text or apply optical character recognition algorithms to
image data.)
Example:
require ["body", "fileinto"];

# Save messages mentioning the project schedule in the
# project/schedule folder.

if body :text :contains "project schedule" {
    fileinto "project/schedule";
}
</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-8" ></span>
<span class="h2"><a class="selflink" id="section-6" href="#section-6">6</a>. Interaction with Other Sieve Extensions</span>
Any extension that extends the grammar for the COMPARATOR or MATCH-
TYPE nonterminals will also affect the implementation of "body".
Wildcard expressions used with "body" are exempt from the side
effects described in [<a href="#ref-VARIABLES" title='"Sieve Email Filtering: Variables Extension"'>VARIABLES</a>]. That is, they MUST NOT set match
variables (${1}, ${2}...) to the input values corresponding to
wildcard sequences in the matched pattern. However, if the extension
is present, variable references in the key strings or content type
strings are evaluated as described in [<a href="#ref-VARIABLES">VARIABLES</a>].
<span class="h2"><a class="selflink" id="section-7" href="#section-7">7</a>. IANA Considerations</span>
The following template specifies the IANA registration of the Sieve
extension specified in this document:
To: iana@iana.org
Subject: Registration of new Sieve extension
Capability name: body
Description: Provides a test for matching against the
body of the message being processed
RFC number: <a href="./rfc5173">RFC 5173</a>
Contact Address: The Sieve discussion list
<ietf-mta-filters@imc.org>
<span class="h2"><a class="selflink" id="section-8" href="#section-8">8</a>. Security Considerations</span>
The system MUST be sized and restricted in such a manner that even
malicious use of body matching does not deny service to other users
of the host system.
Filters relying on string matches in the raw body of an email message
may be more general than intended. Text matches are no replacement
for a spam, virus, or other security related filtering system.
<span class="h2"><a class="selflink" id="section-9" href="#section-9">9</a>. Acknowledgments</span>
This document has been revised in part based on comments and
discussions that took place on and off the SIEVE mailing list.
Thanks to Cyrus Daboo, Ned Freed, Bob Johannessen, Simon Josefsson,
Mark E. Mallett, Chris Markle, Alexey Melnikov, Ken Murchison, Greg
Shapiro, Tim Showalter, Nigel Swinson, Dowson Tong, and Christian
Vogt for reviews and suggestions.
</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-9" ></span>
<span class="h2"><a class="selflink" id="section-10" href="#section-10">10</a>. References</span>
<span class="h3"><a class="selflink" id="section-10.1" href="#section-10.1">10.1</a>. Normative References</span>
[<a id="ref-KEYWORDS">KEYWORDS</a>] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", <a href="https://www.rfc-editor.org/bcp/bcp14">BCP 14</a>, <a href="./rfc2119">RFC 2119</a>, March 1997.
[<a id="ref-MIME">MIME</a>] Freed, N. and N. Borenstein, "Multipurpose Internet Mail
Extensions (MIME) Part One: Format of Internet Message
Bodies", <a href="./rfc2045">RFC 2045</a>, November 1996.
[<a id="ref-SIEVE">SIEVE</a>] Guenther, P., Ed., and T. Showalter, Ed., "Sieve: An
Email Filtering Language", <a href="./rfc5228">RFC 5228</a>, January 2008.
[<a id="ref-UTF-8">UTF-8</a>] Yergeau, F., "UTF-8, a transformation format of ISO
10646", STD 63, <a href="./rfc3629">RFC 3629</a>, November 2003.
<span class="h3"><a class="selflink" id="section-10.2" href="#section-10.2">10.2</a>. Informative References</span>
[<a id="ref-VARIABLES">VARIABLES</a>] Homme, K., "Sieve Email Filtering: Variables Extension",
<a href="./rfc5229">RFC 5229</a>, January 2008.
Authors' Addresses
Jutta Degener
5245 College Ave, Suite #127
Oakland, CA 94618
EMail: jutta@pobox.com
Philip Guenther
Sendmail, Inc.
6425 Christie Ave, 4th Floor
Emeryville, CA 94608
EMail: guenther@sendmail.com
</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-10" ></span>
Full Copyright Statement
Copyright (C) The IETF Trust (2008).
This document is subject to the rights, licenses and restrictions
contained in <a href="https://www.rfc-editor.org/bcp/bcp78">BCP 78</a>, and except as set forth therein, the authors
retain all their rights.
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Intellectual Property
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in <a href="https://www.rfc-editor.org/bcp/bcp78">BCP 78</a> and <a href="https://www.rfc-editor.org/bcp/bcp79">BCP 79</a>.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
<a href="http://www.ietf.org/ipr">http://www.ietf.org/ipr</a>.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.
</pre>