1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781
|
<pre>Network Working Group Y. Abel, Ed.
Request for Comments: 5335 TWNIC
Updates: <a href="./rfc2045">2045</a>, <a href="./rfc2822">2822</a> September 2008
Category: Experimental
<span class="h1">Internationalized Email Headers</span>
Status of This Memo
This memo defines an Experimental Protocol for the Internet
community. It does not specify an Internet standard of any kind.
Discussion and suggestions for improvement are requested.
Distribution of this memo is unlimited.
Abstract
Full internationalization of electronic mail requires not only the
capabilities to transmit non-ASCII content, to encode selected
information in specific header fields, and to use non-ASCII
characters in envelope addresses. It also requires being able to
express those addresses and the information based on them in mail
header fields. This document specifies an experimental variant of
Internet mail that permits the use of Unicode encoded in UTF-8,
rather than ASCII, as the base form for Internet email header field.
This form is permitted in transmission only if authorized by an SMTP
extension, as specified in an associated specification. This
specification Updates <a href="./rfc2045#section-6.4">section 6.4 of RFC 2045</a> to conform with the
requirements.
<span class="grey">Abel Experimental [Page 1]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-2" ></span>
<span class="grey"><a href="./rfc5335">RFC 5335</a> I18N Email Headers September 2008</span>
Table of Contents
<a href="#section-1">1</a>. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . <a href="#page-3">3</a>
<a href="#section-1.1">1.1</a>. Role of This Specification . . . . . . . . . . . . . . . . <a href="#page-3">3</a>
<a href="#section-1.2">1.2</a>. Relation to Other Standards . . . . . . . . . . . . . . . <a href="#page-3">3</a>
<a href="#section-2">2</a>. Background and History . . . . . . . . . . . . . . . . . . . . <a href="#page-3">3</a>
<a href="#section-3">3</a>. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . <a href="#page-4">4</a>
<a href="#section-4">4</a>. Changes on Message Header Fields . . . . . . . . . . . . . . . <a href="#page-5">5</a>
<a href="#section-4.1">4.1</a>. UTF-8 Syntax and Normalization . . . . . . . . . . . . . . <a href="#page-5">5</a>
<a href="#section-4.2">4.2</a>. Changes on MIME Headers . . . . . . . . . . . . . . . . . <a href="#page-6">6</a>
<a href="#section-4.3">4.3</a>. Syntax Extensions to <a href="./rfc2822">RFC 2822</a> . . . . . . . . . . . . . . <a href="#page-6">6</a>
<a href="#section-4.4">4.4</a>. Change on addr-spec Syntax . . . . . . . . . . . . . . . . <a href="#page-8">8</a>
<a href="#section-4.5">4.5</a>. Trace Field Syntax . . . . . . . . . . . . . . . . . . . . <a href="#page-9">9</a>
<a href="#section-4.6">4.6</a>. message/global . . . . . . . . . . . . . . . . . . . . . . <a href="#page-9">9</a>
<a href="#section-5">5</a>. Security Considerations . . . . . . . . . . . . . . . . . . . <a href="#page-11">11</a>
<a href="#section-6">6</a>. IANA Considerations . . . . . . . . . . . . . . . . . . . . . <a href="#page-12">12</a>
<a href="#section-7">7</a>. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . <a href="#page-12">12</a>
<a href="#section-8">8</a>. References . . . . . . . . . . . . . . . . . . . . . . . . . . <a href="#page-12">12</a>
<a href="#section-8.1">8.1</a>. Normative References . . . . . . . . . . . . . . . . . . . <a href="#page-12">12</a>
<a href="#section-8.2">8.2</a>. Informative References . . . . . . . . . . . . . . . . . . <a href="#page-13">13</a>
<span class="grey">Abel Experimental [Page 2]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-3" ></span>
<span class="grey"><a href="./rfc5335">RFC 5335</a> I18N Email Headers September 2008</span>
<span class="h2"><a class="selflink" id="section-1" href="#section-1">1</a>. Introduction</span>
<span class="h3"><a class="selflink" id="section-1.1" href="#section-1.1">1.1</a>. Role of This Specification</span>
Full internationalization of electronic mail requires several
capabilities:
o The capability to transmit non-ASCII content, provided for as part
of the basic MIME specification [<a href="./rfc2045" title=""Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies"">RFC2045</a>], [<a href="./rfc2046" title=""Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types"">RFC2046</a>].
o The capability to use international characters in envelope
addresses, discussed in [<a href="./rfc4952" title=""Overview and Framework for Internationalized Email"">RFC4952</a>] and specified in [<a href="./rfc5336" title=""SMTP Extension for Internationalized Email Addresses"">RFC5336</a>].
o The capability to express those addresses, and information related
to them and based on them, in mail header fields, defined in this
document.
This document specifies an experimental variant of Internet mail that
permits the use of Unicode encoded in UTF-8 [<a href="./rfc3629" title=""UTF-8, a transformation format of ISO 10646"">RFC3629</a>], rather than
ASCII, as the base form for Internet email header fields. This form
is permitted in transmission, if authorized by the SMTP extension
specified in [<a href="./rfc5336" title=""SMTP Extension for Internationalized Email Addresses"">RFC5336</a>] or by other transport mechanisms capable of
processing it.
<span class="h3"><a class="selflink" id="section-1.2" href="#section-1.2">1.2</a>. Relation to Other Standards</span>
This document updates <a href="./rfc2045#section-6.4">Section 6.4 of RFC 2045</a>. It removes the
blanket ban on applying a content-transfer-encoding to all subtypes
of message/, and instead specifies that a composite subtype MAY
specify whether or not a content-transfer-encoding can be used for
that subtype, with "cannot be used" as the default.
This document also updates [<a href="./rfc2822" title=""Internet Message Format"">RFC2822</a>] and MIME ([<a href="./rfc2045" title=""Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies"">RFC2045</a>]), and the
fact that an Experimental specification updates a Standards-Track
specification means that people who participate in the experiment
have to consider those standards updated.
Allowing use of a content-transfer-encoding on subtypes of messages
is not limited to transmissions that are authorized by the SMTP
extension specified in [<a href="./rfc5336" title=""SMTP Extension for Internationalized Email Addresses"">RFC5336</a>]. Message/global permits use of a
content-transfer-encoding.
<span class="h2"><a class="selflink" id="section-2" href="#section-2">2</a>. Background and History</span>
Mailbox names often represent the names of human users. Many of
these users throughout the world have names that are not normally
expressed with just the ASCII repertoire of characters, and would
like to use more or less their real names in their mailbox names.
<span class="grey">Abel Experimental [Page 3]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-4" ></span>
<span class="grey"><a href="./rfc5335">RFC 5335</a> I18N Email Headers September 2008</span>
These users are also likely to use non-ASCII text in their common
names and subjects of email messages, both received and sent. This
protocol specifies UTF-8 as the encoding to represent email header
field bodies.
The traditional format of email messages [<a href="./rfc2822" title=""Internet Message Format"">RFC2822</a>] allows only ASCII
characters in the header fields of messages. This prevents users
from having email addresses that contain non-ASCII characters. It
further forces non-ASCII text in common names, comments, and in free
text (such as in the Subject: field) to be encoded (as required by
MIME format [<a href="./rfc2047" title=""MIME (Multipurpose Internet Mail Extensions) Part Three: Message Header Extensions for Non-ASCII Text"">RFC2047</a>]). This specification describes a change to the
email message format that is related to the SMTP message transport
change described in the associated document [<a href="./rfc4952" title=""Overview and Framework for Internationalized Email"">RFC4952</a>] and [<a href="./rfc5336" title=""SMTP Extension for Internationalized Email Addresses"">RFC5336</a>],
and that allows non-ASCII characters in most email header fields.
These changes affect SMTP clients, SMTP servers, mail user agents
(MUAs), list expanders, gateways to other media, and all other
processes that parse or handle email messages.
As specified in [<a href="./rfc5336" title=""SMTP Extension for Internationalized Email Addresses"">RFC5336</a>], an SMTP protocol extension "UTF8SMTP" is
used to prevent the transmission of messages with UTF-8 header fields
to systems that cannot handle such messages.
Use of this SMTP extension helps prevent the introduction of such
messages into message stores that might misinterpret, improperly
display, or mangle such messages. It should be noted that using an
ESMTP extension does not prevent transferring email messages with
UTF-8 header fields to other systems that use the email format for
messages and that may not be upgraded, such as unextended POP and
IMAP servers. Changes to these protocols to handle UTF-8 header
fields are addressed in [<a href="#ref-EAI-POP" title=""POP3 Support for UTF-8"">EAI-POP</a>] and [<a href="#ref-IMAP-UTF8" title=""IMAP Support for UTF-8"">IMAP-UTF8</a>] .
The objective for this protocol is to allow UTF-8 in email header
fields. Issues such as how to handle messages containing UTF-8
header fields that have to be delivered to systems that have not been
upgraded to support this capability are discussed in [<a href="#ref-DOWNGRADE" title=""Downgrading mechanism for Email Address Internationalization"">DOWNGRADE</a>].
<span class="h2"><a class="selflink" id="section-3" href="#section-3">3</a>. Terminology</span>
A plain ASCII string is also a valid UTF-8 string; see [<a href="./rfc3629" title=""UTF-8, a transformation format of ISO 10646"">RFC3629</a>]. In
this document, ordinary ASCII characters are UTF-8 characters if they
are in headers which contain <utf8-xtra-char>s.
Unless otherwise noted, all terms used here are defined in [<a href="./rfc2821" title=""Simple Mail Transfer Protocol"">RFC2821</a>],
[<a href="./rfc2822" title=""Internet Message Format"">RFC2822</a>], [<a href="./rfc4952" title=""Overview and Framework for Internationalized Email"">RFC4952</a>], or [<a href="./rfc5336" title=""SMTP Extension for Internationalized Email Addresses"">RFC5336</a>].
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [<a href="./rfc2119" title=""Key words for use in RFCs to Indicate Requirement Levels"">RFC2119</a>].
<span class="grey">Abel Experimental [Page 4]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-5" ></span>
<span class="grey"><a href="./rfc5335">RFC 5335</a> I18N Email Headers September 2008</span>
<span class="h2"><a class="selflink" id="section-4" href="#section-4">4</a>. Changes on Message Header Fields</span>
SMTP clients can send header fields in UTF-8 format, if the UTF8SMTP
extension is advertised by the SMTP server or is permitted by other
transport mechanisms.
This protocol does NOT change the [<a href="./rfc2822" title=""Internet Message Format"">RFC2822</a>] rules for defining header
field names. The bodies of header fields are allowed to contain
UTF-8 characters, but the header field names themselves must contain
only ASCII characters.
To permit UTF-8 characters in field values, the header definition in
[<a href="./rfc2822" title=""Internet Message Format"">RFC2822</a>] must be extended to support the new format. The following
ABNF is defined to substitute those definitions in [<a href="./rfc2822" title=""Internet Message Format"">RFC2822</a>].
The syntax rules not covered in this section remain as defined in
[<a href="./rfc2822" title=""Internet Message Format"">RFC2822</a>].
<span class="h3"><a class="selflink" id="section-4.1" href="#section-4.1">4.1</a>. UTF-8 Syntax and Normalization</span>
UTF-8 characters can be defined in terms of octets using the
following ABNF [<a href="./rfc5234" title=""Augmented BNF for Syntax Specifications: ABNF"">RFC5234</a>], taken from [<a href="./rfc3629" title=""UTF-8, a transformation format of ISO 10646"">RFC3629</a>]:
UTF8-xtra-char = UTF8-2 / UTF8-3 / UTF8-4
UTF8-2 = %xC2-DF UTF8-tail
UTF8-3 = %xE0 %xA0-BF UTF8-tail /
%xE1-EC 2(UTF8-tail) /
%xED %x80-9F UTF8-tail /
%xEE-EF 2(UTF8-tail)
UTF8-4 = %xF0 %x90-BF 2( UTF8-tail ) /
%xF1-F3 3( UTF8-tail ) /
%xF4 %x80-8F 2( UTF8-tail )
UTF8-tail = %x80-BF
These are normatively defined in [<a href="./rfc3629" title=""UTF-8, a transformation format of ISO 10646"">RFC3629</a>], but kept in this document
for reasons of convenience.
See [<a href="./rfc5198" title=""Unicode Format for Network Interchange"">RFC5198</a>] for a discussion of normalization; the use of
normalization form NFC is RECOMMENDED.
<span class="grey">Abel Experimental [Page 5]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-6" ></span>
<span class="grey"><a href="./rfc5335">RFC 5335</a> I18N Email Headers September 2008</span>
<span class="h3"><a class="selflink" id="section-4.2" href="#section-4.2">4.2</a>. Changes on MIME Headers</span>
This specification updates <a href="./rfc2045#section-6.4">Section 6.4 of [RFC2045]</a>. [<a href="./rfc2045" title=""Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies"">RFC2045</a>]
prohibits applying a content-transfer-encoding to all subtypes of
message/. This specification relaxes the rule -- it allows newly
defined MIME types to permit content-transfer-encoding, and it allows
content-transfer-encoding for message/global (see <a href="#section-4.6">Section 4.6</a>).
Background: Normally, transfer of message/global will be done in
8-bit-clean channels, and body parts will have "identity" encodings,
that is, no decoding is necessary. In the case where a message
containing a message/global is downgraded from 8-bit to 7-bit as
described in [<a href="./rfc1652" title=""SMTP Service Extension for 8bit- MIMEtransport"">RFC1652</a>], an encoding may be applied to the message; if
the message travels multiple times between a 7-bit environment and an
environment implementing UTF8SMTP, multiple levels of encoding may
occur. This is expected to be rarely seen in practice, and the
potential complexity of other ways of dealing with the issue are
thought to be larger than the complexity of allowing nested encodings
where necessary.
<span class="h3"><a class="selflink" id="section-4.3" href="#section-4.3">4.3</a>. Syntax Extensions to <a href="./rfc2822">RFC 2822</a></span>
The following rules are intended to extend the corresponding rules in
[<a href="./rfc2822" title=""Internet Message Format"">RFC2822</a>] in order to allow UTF-8 characters.
FWS = <see [<a href="./rfc2822" title=""Internet Message Format"">RFC2822</a>], folding white space>
CFWS = <see [<a href="./rfc2822" title=""Internet Message Format"">RFC2822</a>], folding white space>
ctext =/ UTF8-xtra-char
utext =/ UTF8-xtra-char
comment = "(" *([FWS] utf8-ccontent) [FWS] ")"
word = utf8-atom / utf8-quoted-string
This means that all the [<a href="./rfc2822" title=""Internet Message Format"">RFC2822</a>] constructs that build upon these
will permit UTF-8 characters, including comments and quoted strings.
We do not change the syntax of <atext> in order to allow UTF8
characters in <addr-spec>. This would also allow UTF-8 characters in
<message-id>, which is not allowed due to the limitation described in
<a href="#section-4.5">Section 4.5</a>. Instead, <utf8-atext> is added to meet this
requirement.
<span class="grey">Abel Experimental [Page 6]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-7" ></span>
<span class="grey"><a href="./rfc5335">RFC 5335</a> I18N Email Headers September 2008</span>
utf8-text = %d1-9 / ; all UTF-8 characters except
%d11-12 / ; US-ASCII NUL, CR, and LF
%d14-127 /
UTF8-xtra-char
utf8-quoted-pair = ("\" utf8-text) / obs-qp
utf8-qcontent = utf8-qtext / utf8-quoted-pair
utf8-quoted-string = [CFWS]
DQUOTE *([FWS] utf8-qcontent) [FWS] DQUOTE
[CFWS]
utf8-ccontent = ctext / utf8-quoted-pair / comment
utf8-qtext = qtext / UTF8-xtra-char
utf8-atext = ALPHA / DIGIT /
"!" / "#" / ; Any character except
"$" / "%" / ; controls, SP, and specials.
"&" / "'" / ; Used for atoms.
"*" / "+" /
"-" / "/" /
"=" / "?" /
"^" / "_" /
"`" / "{" /
"|" / "}" /
"~" /
UTF8-xtra-char
utf8-atom = [CFWS] 1*utf8-atext [CFWS]
utf8-dot-atom = [CFWS] utf8-dot-atom-text [CFWS]
utf8-dot-atom-text = 1*utf8-atext *("." 1*utf8-atext)
qcontent = utf8-qcontent
To allow the use of UTF-8 in a Content-Description header field
[<a href="./rfc2045" title=""Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies"">RFC2045</a>], the following syntax is used:
description = "Content-Description:" unstructured CRLF
The <utext> syntax is extended above to allow UTF-8 in all
<unstructured> header fields.
<span class="grey">Abel Experimental [Page 7]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-8" ></span>
<span class="grey"><a href="./rfc5335">RFC 5335</a> I18N Email Headers September 2008</span>
Note, however, this does not remove any constraint on the character
set of protocol elements; for instance, all the allowed values for
timezone in the Date: headers are still expressed in ASCII. And
also, none of this revised syntax changes what is allowed in a
<msg-id>, which will still remain in pure ASCII.
<span class="h3"><a class="selflink" id="section-4.4" href="#section-4.4">4.4</a>. Change on addr-spec Syntax</span>
Internationalized email addresses are represented in UTF-8. Thus,
all header fields containing <mailbox>es are updated to permit UTF-8
as well as an additional, optional all-ASCII alternate address. Note
that Message Submission Servers ("MSAs") and Message Transfer Agents
(MTAs) may downgrade internationalized messages as needed. The
procedure for doing so is described in [<a href="#ref-DOWNGRADE" title=""Downgrading mechanism for Email Address Internationalization"">DOWNGRADE</a>].
mailbox = name-addr / addr-spec / utf8-addr-spec
angle-addr =/ [CFWS] "<" utf8-addr-spec [ alt-address ] ">"
[CFWS] / obs-angle-addr
utf8-addr-spec = utf8-local-part "@" utf8-domain
utf8-local-part= utf8-dot-atom / utf8-quoted-string / obs-local-part
utf8-domain = utf8-dot-atom / domain-literal / obs-domain
alt-address = FWS "<" addr-spec ">"
Below are a few examples of possible <mailbox> representations.
"DISPLAY_NAME" <ASCII@ASCII>
; traditional mailbox format
"DISPLAY_NAME" <non-ASCII@non-ASCII>
; UTF8SMTP but no ALT-ADDRESS parameter provided,
; message will bounce if UTF8SMTP extension is not supported
<non-ASCII@non-ASCII>
; without DISPLAY_NAME and quoted string
; UTF8SMTP but no ALT-ADDRESS parameter provided,
; message will bounce if UTF8SMTP extension is not supported
"DISPLAY_NAME" <non-ASCII@non-ASCII <ASCII@ASCII>>
; UTF8SMTP with ALT-ADDRESS parameter provided,
; ALT-ADDRESS can be used if downgrade is necessary
<span class="grey">Abel Experimental [Page 8]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-9" ></span>
<span class="grey"><a href="./rfc5335">RFC 5335</a> I18N Email Headers September 2008</span>
<span class="h3"><a class="selflink" id="section-4.5" href="#section-4.5">4.5</a>. Trace Field Syntax</span>
"For" fields containing internationalized addresses are allowed, by
use of the new uFor syntax. UTF-8 information may be needed in
Received fields. Such information is therefore allowed to preserve
the integrity of those fields. The uFor syntax retains the original
UTF-8 email address between email address internationalization (EAI)-
aware MTAs. Note that, should downgrading be required, the uFor
parameter is dropped per the procedure specified in [<a href="#ref-DOWNGRADE" title=""Downgrading mechanism for Email Address Internationalization"">DOWNGRADE</a>].
The "Return-Path" header provides the email return address in the
mail delivery. Thus, the header is augmented to carry UTF-8
addresses (see the revised syntax of <angle-addr> in <a href="#section-4.4">Section 4.4</a> of
this document). This will not break the rule of trace field
integrity, because the header is added at the last MTA and described
in [<a href="./rfc2821" title=""Simple Mail Transfer Protocol"">RFC2821</a>].
The <item-value> on "Received:" syntax is augmented to allow UTF-8
email address in the "For" field. <angle-addr> is augmented to
include UTF-8 email address. In order to allow UTF-8 email addresses
in an <addr-spec>, <utf8-addr-spec> is added to <item-value>.
item-value =/ utf8-addr-spec
<span class="h3"><a class="selflink" id="section-4.6" href="#section-4.6">4.6</a>. message/global</span>
Internationalized messages must only be transmitted as authorized by
[<a href="./rfc5336" title=""SMTP Extension for Internationalized Email Addresses"">RFC5336</a>] or within a non-SMTP environment which supports these
messages. A message is a "message/global message", if
o it contains UTF-8 header values as specified in this document, or
o it contains UTF-8 values in the headers fields of body parts.
The type message/global is similar to message/rfc822, except that it
contains a message that can contain UTF-8 characters in the headers
of the message or body parts. If this type is sent to a 7-bit-only
system, it has to be encoded in MIME [<a href="./rfc2045" title=""Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies"">RFC2045</a>]. (Note that a system
compliant with MIME that doesn't recognize message/global would treat
it as "application/octet-stream" as described in <a href="./rfc2046#section-5.2.4">Section 5.2.4 of
[RFC2046]</a>.)
Alternatively, SMTP servers and other systems which transfer a
message/global body part MAY choose to down-convert it to a message/
<a href="./rfc822">rfc822</a> body part using the rules described in [<a href="#ref-DOWNGRADE" title=""Downgrading mechanism for Email Address Internationalization"">DOWNGRADE</a>].
<span class="grey">Abel Experimental [Page 9]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-10" ></span>
<span class="grey"><a href="./rfc5335">RFC 5335</a> I18N Email Headers September 2008</span>
Type name: message
Subtype name: global
Required parameters: none
Optional parameters: none
Encoding considerations: Any content-transfer-encoding is permitted.
The 8-bit or binary content-transfer-encodings are recommended
where permitted.
Security considerations: See <a href="#section-5">Section 5</a>.
Interoperability considerations: The media type provides
functionality similar to the message/rfc822 content type for email
messages with international email headers. When there is a need
to embed or return such content in another message, there is
generally an option to use this media type and leave the content
unchanged or down-convert the content to message/rfc822. Both of
these choices will interoperate with the installed base, but with
different properties. Systems unaware of international headers
will typically treat a message/global body part as an unknown
attachment, while they will understand the structure of a message/
<a href="./rfc822">rfc822</a>. However, systems that understand message/global will
provide functionality superior to the result of a down-conversion
to message/rfc822. The most interoperable choice depends on the
deployed software.
Published specification: <a href="./rfc5335">RFC 5335</a>
Applications that use this media type: SMTP servers and email
clients that support multipart/report generation or parsing.
Email clients which forward messages with international headers as
attachments.
Additional information:
Magic number(s): none
File extension(s): The extension ".u8msg" is suggested.
Macintosh file type code(s): A uniform type identifier (UTI) of
"public.utf8-email-message" is suggested. This conforms to
"public.message" and "public.composite-content", but does not
necessarily conform to "public.utf8-plain-text".
<span class="grey">Abel Experimental [Page 10]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-11" ></span>
<span class="grey"><a href="./rfc5335">RFC 5335</a> I18N Email Headers September 2008</span>
Person & email address to contact for further information: See the
Author's Address section of this document.
Intended usage: COMMON
Restrictions on usage: This is a structured media type which embeds
other MIME media types. The 8-bit or binary content-transfer-
encoding MUST be used unless this media type is sent over a 7-bit-
only transport.
Author: See the Author's Address section of this document.
Change controller: IETF Standards Process
<span class="h2"><a class="selflink" id="section-5" href="#section-5">5</a>. Security Considerations</span>
If a user has a non-ASCII mailbox address and an ASCII mailbox
address, a digital certificate that identifies that user may have
both addresses in the identity. Having multiple email addresses as
identities in a single certificate is already supported in PKIX
(Public Key Infrastructure for X.509 Certificates) and OpenPGP.
Because UTF-8 often requires several octets to encode a single
character, internationalized local parts may cause mail addresses to
become longer. As specified in [<a href="./rfc2822" title=""Internet Message Format"">RFC2822</a>], each line of characters
MUST be no more 998 octets, excluding the CRLF.
Because internationalized local parts may cause email addresses to be
longer, processes that parse, store, or handle email addresses or
local parts must take extra care not to overflow buffers, truncate
addresses, or exceed storage allotments. Also, they must take care,
when comparing, to use the entire lengths of the addresses.
In this specification, a user could provide an ASCII alternative
address for a non-ASCII address. However, it is possible these two
addresses go to different mailboxes, or even different people. This
configuration may be based on a user's personal choice or on
administration policy. We recognize that if ASCII and non-ASCII
email is delivered to two different destinations, based on MTA
capability, this may violate the principle of least astonishment, but
this is not a "protocol problem".
The security impact of UTF-8 headers on email signature systems such
as Domain Keys Identified Mail (DKIM), S/MIME, and OpenPGP is
discussed in <a href="./rfc4952#section-9">RFC 4952, Section 9</a>. A subsequent document [<a href="#ref-DOWNGRADE" title=""Downgrading mechanism for Email Address Internationalization"">DOWNGRADE</a>]
will cover the impact of downgrading on these systems.
<span class="grey">Abel Experimental [Page 11]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-12" ></span>
<span class="grey"><a href="./rfc5335">RFC 5335</a> I18N Email Headers September 2008</span>
<span class="h2"><a class="selflink" id="section-6" href="#section-6">6</a>. IANA Considerations</span>
IANA has registered the message/global MIME type using the
registration form contained in <a href="#section-4.4">Section 4.4</a>.
<span class="h2"><a class="selflink" id="section-7" href="#section-7">7</a>. Acknowledgements</span>
This document incorporates many ideas first described in Internet-
Draft form by Paul Hoffman, although many details have changed from
that earlier work.
The author especially thanks Jeff Yeh for his efforts and
contributions on editing previous versions.
Most of the content of this document is provided by John C Klensin.
Also, some significant comments and suggestions were received from
Charles H. Lindsey, Kari Hurtta, Pete Resnick, Alexey Melnikov, Chris
Newman, Yangwoo Ko, Yoshiro Yoneya, and other members of the JET team
(Joint Engineering Team) and were incorporated into the document.
The editor sincerely thanks them for their contributions.
<span class="h2"><a class="selflink" id="section-8" href="#section-8">8</a>. References</span>
<span class="h3"><a class="selflink" id="section-8.1" href="#section-8.1">8.1</a>. Normative References</span>
[<a id="ref-RFC1652">RFC1652</a>] Klensin, J., Freed, N., Rose, M., Stefferud, E., and D.
Crocker, "SMTP Service Extension for 8bit-
MIMEtransport", <a href="./rfc1652">RFC 1652</a>, July 1994.
[<a id="ref-RFC2119">RFC2119</a>] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", <a href="https://www.rfc-editor.org/bcp/bcp14">BCP 14</a>, <a href="./rfc2119">RFC 2119</a>, March 1997.
[<a id="ref-RFC2821">RFC2821</a>] Klensin, J., "Simple Mail Transfer Protocol", <a href="./rfc2821">RFC 2821</a>,
April 2001.
[<a id="ref-RFC2822">RFC2822</a>] Resnick, P., "Internet Message Format", <a href="./rfc2822">RFC 2822</a>,
April 2001.
[<a id="ref-RFC3629">RFC3629</a>] Yergeau, F., "UTF-8, a transformation format of ISO
10646", STD 63, <a href="./rfc3629">RFC 3629</a>, November 2003.
[<a id="ref-RFC4952">RFC4952</a>] Klensin, J. and Y. Ko, "Overview and Framework for
Internationalized Email", <a href="./rfc4952">RFC 4952</a>, July 2007.
[<a id="ref-RFC5198">RFC5198</a>] Klensin, J. and M. Padlipsky, "Unicode Format for
Network Interchange", <a href="./rfc5198">RFC 5198</a>, March 2008.
<span class="grey">Abel Experimental [Page 12]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-13" ></span>
<span class="grey"><a href="./rfc5335">RFC 5335</a> I18N Email Headers September 2008</span>
[<a id="ref-RFC5234">RFC5234</a>] Crocker, D. and P. Overell, "Augmented BNF for Syntax
Specifications: ABNF", STD 68, <a href="./rfc5234">RFC 5234</a>, January 2008.
[<a id="ref-RFC5336">RFC5336</a>] Yao, J., Ed. and W. Mao, Ed., "SMTP Extension for
Internationalized Email Addresses", <a href="./rfc5336">RFC 5336</a>,
September 2008.
<span class="h3"><a class="selflink" id="section-8.2" href="#section-8.2">8.2</a>. Informative References</span>
[<a id="ref-DOWNGRADE">DOWNGRADE</a>] Fujiwara, K. and Y. Yoneya, "Downgrading mechanism for
Email Address Internationalization", Work in Progress,
July 2008.
[<a id="ref-EAI-POP">EAI-POP</a>] Newman, C. and R. Gellens, <a style="text-decoration: none" href='https://www.google.com/search?sitesearch=datatracker.ietf.org%2Fdoc%2Fhtml%2F&q=inurl:draft-+%22POP3+Support+for+UTF-8%22'>"POP3 Support for UTF-8"</a>,
Work in Progress, July 2008.
[<a id="ref-IMAP-UTF8">IMAP-UTF8</a>] Resnick, P. and C. Newman, <a style="text-decoration: none" href='https://www.google.com/search?sitesearch=datatracker.ietf.org%2Fdoc%2Fhtml%2F&q=inurl:draft-+%22IMAP+Support+for+UTF-8%22'>"IMAP Support for UTF-8"</a>,
Work in Progress, April 2008.
[<a id="ref-RFC2045">RFC2045</a>] Freed, N. and N. Borenstein, "Multipurpose Internet Mail
Extensions (MIME) Part One: Format of Internet Message
Bodies", <a href="./rfc2045">RFC 2045</a>, November 1996.
[<a id="ref-RFC2046">RFC2046</a>] Freed, N. and N. Borenstein, "Multipurpose Internet Mail
Extensions (MIME) Part Two: Media Types", <a href="./rfc2046">RFC 2046</a>,
November 1996.
[<a id="ref-RFC2047">RFC2047</a>] Moore, K., "MIME (Multipurpose Internet Mail Extensions)
Part Three: Message Header Extensions for Non-ASCII
Text", <a href="./rfc2047">RFC 2047</a>, November 1996.
Author's Address
Abel Yang (editor)
TWNIC
4F-2, No. 9, Sec 2, Roosvelt Rd.
Taipei, 100
Taiwan
Phone: +886 2 23411313 ext 505
EMail: abelyang@twnic.net.tw
<span class="grey">Abel Experimental [Page 13]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-14" ></span>
<span class="grey"><a href="./rfc5335">RFC 5335</a> I18N Email Headers September 2008</span>
Full Copyright Statement
Copyright (C) The IETF Trust (2008).
This document is subject to the rights, licenses and restrictions
contained in <a href="https://www.rfc-editor.org/bcp/bcp78">BCP 78</a>, and except as set forth therein, the authors
retain all their rights.
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Intellectual Property
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in <a href="https://www.rfc-editor.org/bcp/bcp78">BCP 78</a> and <a href="https://www.rfc-editor.org/bcp/bcp79">BCP 79</a>.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
<a href="http://www.ietf.org/ipr">http://www.ietf.org/ipr</a>.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.
Abel Experimental [Page 14]
</pre>
|