<pre>Internet Engineering Task Force (IETF)                           A. Yang
Request for Comments: 6532                                         TWNIC
Obsoletes: <a href="./rfc5335">5335</a>                                                S. Steele
Updates: <a href="./rfc2045">2045</a>                                                  Microsoft
Category: Standards Track                                       N. Freed
ISSN: 2070-1721                                                   Oracle
                                                           February 2012


                    <span class="h1">Internationalized Email Headers</span>

Abstract

   Internet mail was originally limited to 7-bit ASCII.  MIME added
   support for the use of 8-bit character sets in body parts, and also
   defined an encoded-word construct so other character sets could be
   used in certain header field values.  However, full
   internationalization of electronic mail requires additional
   enhancements to allow the use of Unicode, including characters
   outside the ASCII repertoire, in mail addresses as well as direct use
   of Unicode in header fields like "From:", "To:", and "Subject:",
   without requiring the use of complex encoded-word constructs.  This
   document specifies an enhancement to the Internet Message Format and
   to MIME that allows use of Unicode in mail addresses and most header
   field content.

   This specification updates <a href="./rfc2045#section-6.4">Section&nbsp;6.4 of RFC 2045</a> to eliminate the
   restriction prohibiting the use of non-identity content-transfer-
   encodings on subtypes of "message/".

Status of This Memo

   This is an Internet Standards Track document.

   This document is a product of the Internet Engineering Task Force
   (IETF).  It represents the consensus of the IETF community.  It has
   received public review and has been approved for publication by the
   Internet Engineering Steering Group (IESG).  Further information on
   Internet Standards is available in <a href="./rfc5741#section-2">Section&nbsp;2 of RFC 5741</a>.

   Information about the current status of this document, any errata,
   and how to provide feedback on it may be obtained at
   <a href="http://www.rfc-editor.org/info/rfc6532">http://www.rfc-editor.org/info/rfc6532</a>.








<span class="grey">Yang, et al.                 Standards Track                    [Page 1]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-2" ></span>
<span class="grey"><a href="./rfc6532">RFC 6532</a>             Internationalized Email Headers       February 2012</span>


Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to <a href="https://www.rfc-editor.org/bcp/bcp78">BCP 78</a> and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (<a href="http://trustee.ietf.org/license-info">http://trustee.ietf.org/license-info</a>) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   <a href="#section-1">1</a>.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  <a href="#page-3">3</a>
   <a href="#section-2">2</a>.  Terminology Used in This Specification . . . . . . . . . . . .  <a href="#page-3">3</a>
   <a href="#section-3">3</a>.  Changes to Message Header Fields . . . . . . . . . . . . . . .  <a href="#page-4">4</a>
     <a href="#section-3.1">3.1</a>.  UTF-8 Syntax and Normalization . . . . . . . . . . . . . .  <a href="#page-4">4</a>
     <a href="#section-3.2">3.2</a>.  Syntax Extensions to <a href="./rfc5322">RFC 5322</a>  . . . . . . . . . . . . . .  <a href="#page-5">5</a>
     <a href="#section-3.3">3.3</a>.  Use of 8-bit UTF-8 in Message-IDs  . . . . . . . . . . . .  <a href="#page-5">5</a>
     <a href="#section-3.4">3.4</a>.  Effects on Line Length Limits  . . . . . . . . . . . . . .  <a href="#page-5">5</a>
     <a href="#section-3.5">3.5</a>.  Changes to MIME Message Type Encoding Restrictions . . . .  <a href="#page-6">6</a>
     <a href="#section-3.6">3.6</a>.  Use of MIME Encoded-Words  . . . . . . . . . . . . . . . .  <a href="#page-6">6</a>
     <a href="#section-3.7">3.7</a>.  The message/global Media Type  . . . . . . . . . . . . . .  <a href="#page-7">7</a>
   <a href="#section-4">4</a>.  Security Considerations  . . . . . . . . . . . . . . . . . . .  <a href="#page-8">8</a>
   <a href="#section-5">5</a>.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . .  <a href="#page-9">9</a>
   <a href="#section-6">6</a>.  Acknowledgements . . . . . . . . . . . . . . . . . . . . . . .  <a href="#page-9">9</a>
   <a href="#section-7">7</a>.  References . . . . . . . . . . . . . . . . . . . . . . . . . . <a href="#page-10">10</a>
     <a href="#section-7.1">7.1</a>.  Normative References . . . . . . . . . . . . . . . . . . . <a href="#page-10">10</a>
     <a href="#section-7.2">7.2</a>.  Informative References . . . . . . . . . . . . . . . . . . <a href="#page-10">10</a>


















</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-3" ></span>


<span class="h2"><a class="selflink" id="section-1" href="#section-1">1</a>.  Introduction</span>

   Internet mail distinguishes a message from its transport and further
   divides a message between a header and a body [<a href="./rfc5322" title="&quot;Internet Message Format&quot;">RFC5322</a>].  Internet
   mail header field values contain a variety of strings that are
   intended to be user-visible.  The range of supported characters for
   these strings was originally limited to [<a href="#ref-ASCII" title="&quot;Coded Character Set -- 7-bit American Standard Code for Information Interchange&quot;">ASCII</a>] in 7-bit form.  MIME
   [<a href="./rfc2045" title="&quot;Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies&quot;">RFC2045</a>] [<a href="./rfc2046" title="&quot;Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types&quot;">RFC2046</a>] [<a href="./rfc2047" title="&quot;MIME (Multipurpose Internet Mail Extensions) Part Three: Message Header Extensions for Non-ASCII Text&quot;">RFC2047</a>] provides the ability to use additional
   character sets, but this support is limited to body part data and to
   special encoded-word constructs that were only allowed in a limited
   number of places in header field values.

   Globalization of the Internet requires support of the much larger set
   of characters provided by Unicode [<a href="./rfc5198" title="&quot;Unicode Format for Network Interchange&quot;">RFC5198</a>] in both mail addresses
   and most header field values.  Additionally, complex encoding schemes
   like encoded-words introduce inefficiencies as well as significant
   opportunities for processing errors.  And finally, native support for
   the UTF-8 charset is now available on most systems.  Hence, it is
   strongly desirable for Internet mail to support UTF-8 [<a href="./rfc3629" title="&quot;UTF-8, a transformation format of ISO 10646&quot;">RFC3629</a>]
   directly.

   This document specifies an enhancement to the Internet Message Format
   [<a href="./rfc5322" title="&quot;Internet Message Format&quot;">RFC5322</a>] and to MIME that permits the direct use of UTF-8, rather
   than only ASCII, in header field values, including mail addresses.  A
   new media type, message/global, is defined for messages that use this
   extended format.  This specification also lifts the MIME restriction
   on having non-identity content-transfer-encodings on any subtype of
   the message top-level type so that message/global parts can be safely
   transmitted across existing mail infrastructure.

   This specification is based on a model of native, end-to-end support
   for UTF-8, which depends on having an "8-bit-clean" environment
   assured by the transport system.  Support for carriage across legacy,
   7-bit infrastructure and for processing by 7-bit receivers requires
   additional mechanisms that are not provided by these specifications.

   This specification is a revision of and replacement for [<a href="./rfc5335" title="&quot;Internationalized Email Headers&quot;">RFC5335</a>].
   <a href="./rfc6530#section-6">Section&nbsp;6 of [RFC6530]</a> describes the change in approach between this
   specification and the previous version.

<span class="h2"><a class="selflink" id="section-2" href="#section-2">2</a>.  Terminology Used in This Specification</span>

   A plain ASCII string is fully compatible with [<a href="./rfc5321" title="&quot;Simple Mail Transfer Protocol&quot;">RFC5321</a>] and
   [<a href="./rfc5322" title="&quot;Internet Message Format&quot;">RFC5322</a>].  In this document, non-ASCII strings are UTF-8 strings if
   they are in header field values that contain at least one
   &lt;UTF8-non-ascii&gt; (see <a href="#section-3.1">Section 3.1</a>).





</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-4" ></span>


   Unless otherwise noted, all terms used here are defined in [<a href="./rfc5321" title="&quot;Simple Mail Transfer Protocol&quot;">RFC5321</a>],
   [<a href="./rfc5322" title="&quot;Internet Message Format&quot;">RFC5322</a>], [<a href="./rfc6530" title="&quot;Overview and Framework for Internationalized Email&quot;">RFC6530</a>], or [<a href="./rfc6531" title="&quot;SMTP Extension for Internationalized Email&quot;">RFC6531</a>].

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [<a href="./rfc2119" title="&quot;Key words for use in RFCs to Indicate Requirement Levels&quot;">RFC2119</a>].

   The term "8-bit" means octets are present in the data with values
   above 0x7F.

<span class="h2"><a class="selflink" id="section-3" href="#section-3">3</a>.  Changes to Message Header Fields</span>

   To permit non-ASCII Unicode characters in field values, the header
   definition in [<a href="./rfc5322" title="&quot;Internet Message Format&quot;">RFC5322</a>] is extended to support the new format.  The
   following sections specify the necessary changes to <a href="./rfc5322">RFC 5322</a>'s ABNF.

   The syntax rules not mentioned below remain defined as in [<a href="./rfc5322" title="&quot;Internet Message Format&quot;">RFC5322</a>].

   Note that this protocol does not change rules in <a href="./rfc5322">RFC 5322</a> for
   defining header field names.  The bodies of header fields are allowed
   to contain Unicode characters, but the header field names themselves
   must consist of ASCII characters only.

   Also note that messages in this format require the use of the
   SMTPUTF8 extension [<a href="./rfc6531" title="&quot;SMTP Extension for Internationalized Email&quot;">RFC6531</a>] to be transferred via SMTP.

<span class="h3"><a class="selflink" id="section-3.1" href="#section-3.1">3.1</a>.  UTF-8 Syntax and Normalization</span>

   UTF-8 characters can be defined in terms of octets using the
   following ABNF [<a href="./rfc5234" title="&quot;Augmented BNF for Syntax Specifications: ABNF&quot;">RFC5234</a>], taken from [<a href="./rfc3629" title="&quot;UTF-8, a transformation format of ISO 10646&quot;">RFC3629</a>]:

   UTF8-non-ascii  =   UTF8-2 / UTF8-3 / UTF8-4

   UTF8-2          =   &lt;Defined in <a href="./rfc3629#section-4">Section&nbsp;4 of RFC3629</a>&gt;

   UTF8-3          =   &lt;Defined in <a href="./rfc3629#section-4">Section&nbsp;4 of RFC3629</a>&gt;

   UTF8-4          =   &lt;Defined in <a href="./rfc3629#section-4">Section&nbsp;4 of RFC3629</a>&gt;

   See [<a href="./rfc5198" title="&quot;Unicode Format for Network Interchange&quot;">RFC5198</a>] for a discussion of Unicode normalization;
   normalization form NFC [<a href="#ref-UNF" title="&quot;Unicode Standard Annex #15: Unicode Normalization Forms&quot;">UNF</a>] SHOULD be used.  One of the most often
   cited goals of internationalization is to permit people to spell
   their names correctly.  Since many mailbox local parts reflect
   personal names, that principle applies to mailboxes as well.  The
   NFKC normalization form [<a href="#ref-UNF" title="&quot;Unicode Standard Annex #15: Unicode Normalization Forms&quot;">UNF</a>] SHOULD NOT be used because it may lose
   information that is needed to correctly spell some names in some
   unusual circumstances.
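
   As a non-normative illustration of the normalization guidance in
   this section, the following Python sketch uses the standard
   unicodedata module to detect non-ASCII content and to apply NFC.
   The helper names has_utf8_non_ascii and to_nfc are hypothetical and
   are not defined by this specification.

```python
import unicodedata


def has_utf8_non_ascii(value: str) -> bool:
    """Return True if the string contains any code point above U+007F."""
    return any(ord(ch) > 0x7F for ch in value)


def to_nfc(value: str) -> str:
    """Normalize to NFC, the form this section says SHOULD be used."""
    return unicodedata.normalize("NFC", value)


# "e" followed by COMBINING ACUTE ACCENT (U+0301) composes to U+00E9
# under NFC, so both spellings of the name end up as the same string.
decomposed = "Jose\u0301"
composed = to_nfc(decomposed)

# NFKC folds the ligature U+FB01 into the two letters "fi".  This is an
# example of the information loss that NFKC can cause; NFC leaves the
# ligature untouched.
ligature = "\ufb01"
nfkc_result = unicodedata.normalize("NFKC", ligature)
```

   In this sketch, a generator would call to_nfc on header field
   values before serializing them as UTF-8.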




</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-5" ></span>


<span class="h3"><a class="selflink" id="section-3.2" href="#section-3.2">3.2</a>.  Syntax Extensions to <a href="./rfc5322">RFC 5322</a></span>

   The following rules extend the ABNF syntax defined in [<a href="./rfc5322" title="&quot;Internet Message Format&quot;">RFC5322</a>] and
   [<a href="./rfc5234" title="&quot;Augmented BNF for Syntax Specifications: ABNF&quot;">RFC5234</a>] in order to allow UTF-8 content.

   VCHAR   =/  UTF8-non-ascii

   ctext   =/  UTF8-non-ascii

   atext   =/  UTF8-non-ascii

   qtext   =/  UTF8-non-ascii

   text    =/  UTF8-non-ascii
                 ; note that this upgrades the body to UTF-8

   dtext   =/  UTF8-non-ascii

   The preceding changes mean that the following constructs now allow
   UTF-8:

   1.  Unstructured text, used in header fields like "Subject:" or
       "Content-description:".

   2.  Any construct that uses atoms, including but not limited to the
       local parts of addresses and Message-IDs.  This includes
       addresses in the "for" clauses of "Received:" header fields.

   3.  Quoted strings.

   4.  Domains.

   Note that header field names are not on this list; these are still
   restricted to ASCII.
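
   The effect of the extended atext rule can be sketched in Python as
   follows.  This is an informal check on already-decoded text, not a
   full RFC 5322 parser; the names ASCII_ATEXT_SPECIALS,
   is_extended_atext, and is_extended_atom are hypothetical.

```python
# ASCII atext specials from RFC 5322, Section 3.2.3 (in addition to
# ALPHA and DIGIT).  \x26 is the ampersand character.
ASCII_ATEXT_SPECIALS = set("!#$%\x26'*+-/=?^_`{|}~")


def is_extended_atext(ch: str) -> bool:
    """True if ch matches atext as extended by 'atext =/ UTF8-non-ascii'."""
    if ord(ch) > 0x7F:
        # Non-ASCII is acceptable only if it can be encoded as UTF-8;
        # a lone surrogate code point cannot.
        try:
            ch.encode("utf-8")
        except UnicodeEncodeError:
            return False
        return True
    # ASCII: letters, digits, or one of the atext specials.
    return ch.isalnum() or ch in ASCII_ATEXT_SPECIALS


def is_extended_atom(token: str) -> bool:
    """True if token is a non-empty run of extended atext characters."""
    return len(token) > 0 and all(is_extended_atext(ch) for ch in token)
```

   Under these assumptions, a local part written entirely in non-ASCII
   characters is accepted as an atom, while ASCII specials such as
   space or "@" are still rejected.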

<span class="h3"><a class="selflink" id="section-3.3" href="#section-3.3">3.3</a>.  Use of 8-bit UTF-8 in Message-IDs</span>

   Implementers of Message-ID generation algorithms MAY prefer to
   restrain their output to ASCII since that has some advantages, such
   as when constructing "In-reply-to:" and "References:" header fields
   in mailing-list threads where some senders use internationalized
   addresses and others do not.

<span class="h3"><a class="selflink" id="section-3.4" href="#section-3.4">3.4</a>.  Effects on Line Length Limits</span>

   <a href="./rfc5322#section-2.1.1">Section&nbsp;2.1.1 of [RFC5322]</a> limits lines to 998 characters and
   recommends that the lines be restricted to only 78 characters.  This
   specification changes the former limit to 998 octets.  (Note that, in
   ASCII, octets and characters are effectively the same, but this is
   not true in UTF-8.)  The 78-character limit remains defined in terms
   of characters, not octets, since it is intended to address display
   width issues, not line-length issues.
</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-6" ></span>
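
   The difference between the two limits can be illustrated with a
   short Python sketch.  The function name check_line is hypothetical,
   and len() on a Python string counts code points, which is only an
   approximation of "characters" as a user perceives them.

```python
def check_line(line: str) -> dict:
    """Measure one line (without its CRLF) against both limits."""
    octets = len(line.encode("utf-8"))  # basis for the 998-octet limit
    chars = len(line)                   # basis for the 78-character limit
    return {
        "octets": octets,
        "characters": chars,
        "exceeds_998_octets": octets > 998,
        "exceeds_78_characters": chars > 78,
    }


# 400 CJK characters take 3 octets each in UTF-8.  Together with the
# 9-character "Subject: " prefix, the line has 409 characters but
# 1209 octets, so it violates the octet limit even though it is well
# under 998 characters.
report = check_line("Subject: " + "\u4e2d" * 400)
```

   A generator that only counted characters would accept this line; a
   count in octets shows that it needs to be folded.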

<span class="h3"><a class="selflink" id="section-3.5" href="#section-3.5">3.5</a>.  Changes to MIME Message Type Encoding Restrictions</span>

   This specification updates <a href="./rfc2045#section-6.4">Section&nbsp;6.4 of [RFC2045]</a>.  [<a href="./rfc2045" title="&quot;Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies&quot;">RFC2045</a>]
   prohibits applying a content-transfer-encoding to any subtypes of
   "message/".  This specification relaxes that rule -- it allows newly
   defined MIME types to permit content-transfer-encoding, and it allows
   content-transfer-encoding for message/global (see <a href="#section-3.7">Section 3.7</a>).

   Background: Normally, transfer of message/global will be done in
   8-bit-clean channels, and body parts will have "identity" encodings,
   that is, no decoding is necessary.

   But in the case where a message containing a message/global is
   downgraded from 8-bit to 7-bit as described in [<a href="./rfc6152" title="&quot;SMTP Service Extension for 8-bit MIME Transport&quot;">RFC6152</a>], an encoding
   might have to be applied to the message.  If the message travels
   multiple times between a 7-bit environment and an environment
   implementing these extensions, multiple levels of encoding may occur.
   This is expected to be rarely seen in practice, and the potential
   complexity of other ways of dealing with the issue is thought to be
   larger than the complexity of allowing nested encodings where
   necessary.

<span class="h3"><a class="selflink" id="section-3.6" href="#section-3.6">3.6</a>.  Use of MIME Encoded-Words</span>

   The MIME encoded-words facility [<a href="./rfc2047" title="&quot;MIME (Multipurpose Internet Mail Extensions) Part Three: Message Header Extensions for Non-ASCII Text&quot;">RFC2047</a>] provides the ability to
   place non-ASCII text in header fields, but only in a subset of the
   places allowed by this extension.  Additionally, encoded-words are
   substantially more complex since they allow the use of arbitrary
   charsets.  Accordingly,
   encoded-words SHOULD NOT be used when generating header fields for
   messages employing this extension.  Agents MAY, when incorporating
   material from another message, convert encoded-word use to direct use
   of UTF-8.

   Note that care must be taken when decoding encoded-words because the
   results after replacing an encoded-word with its decoded equivalent
   in UTF-8 may be syntactically invalid.  Processors that elect to
   decode encoded-words MUST NOT generate syntactically invalid fields.
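
   As a non-normative illustration, the following sketch uses the
   policy objects of Python's standard email package to serialize the
   same "Subject:" field with and without encoded-words.  It assumes
   Python 3.5 or later, where email.policy.SMTPUTF8 is available; the
   helper name serialize_subject is hypothetical.

```python
from email.message import EmailMessage
from email.policy import SMTP, SMTPUTF8


def serialize_subject(subject: str, policy) -> bytes:
    """Build a message with only a Subject field and serialize it."""
    msg = EmailMessage(policy=policy)
    msg["Subject"] = subject
    return msg.as_bytes()


subject = "Gr\u00fc\u00dfe"  # German "Gruesse" with u-umlaut and sharp s

# Legacy policy: the non-ASCII text is wrapped in an RFC 2047
# encoded-word so that the header stays 7-bit.
legacy = serialize_subject(subject, SMTP)

# SMTPUTF8 policy: the header field carries the UTF-8 octets directly,
# as recommended when generating messages in this format.
direct = serialize_subject(subject, SMTPUTF8)
```

   The second form is only suitable for transports that support these
   messages, for example SMTP sessions that negotiated SMTPUTF8.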









</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-7" ></span>


<span class="h3"><a class="selflink" id="section-3.7" href="#section-3.7">3.7</a>.  The message/global Media Type</span>

   Internationalized messages in this format MUST only be transmitted as
   authorized by [<a href="./rfc6531" title="&quot;SMTP Extension for Internationalized Email&quot;">RFC6531</a>] or within a non-SMTP environment that
   supports these messages.  A message is a "message/global message" if:

   o  it contains 8-bit UTF-8 header values as specified in this
      document, or

   o  it contains 8-bit UTF-8 values in the header fields of body parts.

   The content of a message/global part is otherwise identical to that
   of a message/rfc822 part.

   If an object of this type is sent to a 7-bit-only system, it MUST
   have an appropriate content-transfer-encoding applied.  (Note that a
   system compliant with MIME that doesn't recognize message/global is
   supposed to treat it as "application/octet-stream" as described in
   <a href="./rfc2046#section-5.2.4">Section&nbsp;5.2.4 of [RFC2046]</a>.)
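
   The first condition above can be approximated with a short Python
   sketch that picks a media type for embedding a message.  The name
   embedded_media_type is hypothetical.  As a loudly stated
   simplification, the sketch inspects only the top-level header block
   and does not examine the header fields of nested body parts, nor
   does it verify that the 8-bit octets are valid UTF-8.

```python
def embedded_media_type(raw_message: bytes) -> str:
    """Pick a media type for embedding raw_message in another message.

    Simplification: only the top-level header block (everything before
    the first empty line) is inspected.
    """
    header_block = raw_message.split(b"\r\n\r\n", 1)[0]
    # "8-bit" means octets with values above 0x7F are present.
    if any(octet > 0x7F for octet in header_block):
        return "message/global"
    return "message/rfc822"


ascii_message = b"Subject: hello\r\n\r\nbody\r\n"
utf8_message = "Subject: Gr\u00fc\u00dfe\r\n\r\nbody\r\n".encode("utf-8")
```

   A complete implementation would also walk the MIME structure to
   cover the second condition.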

   The registration is as follows:

   Type name:  message

   Subtype name:  global

   Required parameters:  none

   Optional parameters:  none

   Encoding considerations:  Any content-transfer-encoding is permitted.
      The 8-bit or binary content-transfer-encodings are recommended
      where permitted.

   Security considerations:  See <a href="#section-4">Section 4</a>.

   Interoperability considerations:  This media type provides
      functionality similar to the message/rfc822 content type for email
      messages with internationalized email headers.  When there is a
      need to embed or return such content in another message, there is
      generally an option to use this media type and leave the content
      unchanged or down-convert the content to message/rfc822.  Each of
      these choices will interoperate with the installed base, but with
      different properties.  Systems unaware of internationalized
      headers will typically treat a message/global body part as an
      unknown attachment, while they will understand the structure of a
      message/rfc822.  However, systems that understand message/global
      will provide functionality superior to the result of a
      down-conversion to message/rfc822.  The most interoperable choice
      depends on the deployed software.
</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-8" ></span>

   Published specification:  <a href="./rfc6532">RFC 6532</a>

   Applications that use this media type:  SMTP servers and email
      clients that support multipart/report generation or parsing.
      Email clients that forward messages with internationalized headers
      as attachments.

   Additional information:

      Magic number(s):  none

      File extension(s):  The extension ".u8msg" is suggested.

      Macintosh file type code(s):  A uniform type identifier (UTI) of
         "public.utf8-email-message" is suggested.  This conforms to
         "public.message" and "public.composite-content", but does not
         necessarily conform to "public.utf8-plain-text".

   Person &amp; email address to contact for further information:  See the
      Authors' Addresses section of this document.

   Intended usage:  COMMON

   Restrictions on usage:  This is a structured media type that embeds
      other MIME media types.  An 8-bit or binary content-transfer-
      encoding SHOULD be used unless this media type is sent over a
      7-bit-only transport.

   Author:  See the Authors' Addresses section of this document.

   Change controller:  IETF Standards Process

<span class="h2"><a class="selflink" id="section-4" href="#section-4">4</a>.  Security Considerations</span>

   Because UTF-8 often requires several octets to encode a single
   character, internationalization may cause header field values (in
   general) and mail addresses (in particular) to become longer.  As
   specified in [<a href="./rfc5322" title="&quot;Internet Message Format&quot;">RFC5322</a>] and updated by <a href="#section-3.4">Section 3.4</a>, each line MUST be
   no more than 998 octets, excluding the CRLF.  In addition, MDA (Mail
   Delivery Agent) processes that parse, store, or handle email
   addresses or local parts must take extra care not to overflow
   buffers, truncate addresses, or exceed storage allotments.  Also,
   they must take care, when comparing, to use the entire lengths of the
   addresses.



</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-9" ></span>


   There are many ways to use UTF-8 to represent something equivalent
   or similar to a particular displayed character or group of
   characters; see the security considerations in [<a href="./rfc3629" title="&quot;UTF-8, a transformation format of ISO 10646&quot;">RFC3629</a>] for details
   on the problems this can cause.  The normalization process described
   in <a href="#section-3.1">Section 3.1</a> is recommended to minimize these issues.

   The security impact of UTF-8 headers on email signature systems such
   as Domain Keys Identified Mail (DKIM), S/MIME, and OpenPGP is
   discussed in <a href="./rfc6530#section-14">Section&nbsp;14 of [RFC6530]</a>.

   If a user has a non-ASCII mailbox address and an ASCII mailbox
   address, a digital certificate that identifies that user might have
   both addresses in the identity.  Having multiple email addresses as
   identities in a single certificate is already supported in PKIX
   (Public Key Infrastructure using X.509) [<a href="./rfc5280" title="&quot;Internet X.509 Public Key Infrastructure Certificate and Certificate Revocation List (CRL) Profile&quot;">RFC5280</a>] and OpenPGP
   [<a href="./rfc3156" title="&quot;MIME Security with OpenPGP&quot;">RFC3156</a>], but there may be user-interface issues associated with the
   introduction of UTF-8 into addresses in this context.

<span class="h2"><a class="selflink" id="section-5" href="#section-5">5</a>.  IANA Considerations</span>

   IANA has updated the registration of the message/global MIME type
   using the registration form contained in <a href="#section-3.7">Section 3.7</a>.

<span class="h2"><a class="selflink" id="section-6" href="#section-6">6</a>.  Acknowledgements</span>

   This document incorporates many ideas first described in a draft
   document by Paul Hoffman, although many details have changed from
   that earlier work.

   The authors especially thank Jeff Yeh for his efforts and
   contributions on editing previous versions.

   Most of the content of this document was provided by John C Klensin
   and Dave Crocker.  Significant comments and suggestions were received
   from Martin Duerst, Julien Elie, Arnt Gulbrandsen, Kristin Hubner,
   Kari Hurtta, Yangwoo Ko, Charles H. Lindsey, Alexey Melnikov, Chris
   Newman, Pete Resnick, Yoshiro Yoneya, and additional members of the
   Joint Engineering Team (JET) and were incorporated into the document.
   The authors wish to sincerely thank them all for their contributions.












</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-10" ></span>


<span class="h2"><a class="selflink" id="section-7" href="#section-7">7</a>.  References</span>

<span class="h3"><a class="selflink" id="section-7.1" href="#section-7.1">7.1</a>.  Normative References</span>

   [<a id="ref-ASCII">ASCII</a>]    "Coded Character Set -- 7-bit American Standard Code for
              Information Interchange", ANSI X3.4, 1986.

   [<a id="ref-RFC2119">RFC2119</a>]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", <a href="https://www.rfc-editor.org/bcp/bcp14">BCP 14</a>, <a href="./rfc2119">RFC 2119</a>, March 1997.

   [<a id="ref-RFC3629">RFC3629</a>]  Yergeau, F., "UTF-8, a transformation format of ISO
              10646", STD 63, <a href="./rfc3629">RFC 3629</a>, November 2003.

   [<a id="ref-RFC5198">RFC5198</a>]  Klensin, J. and M. Padlipsky, "Unicode Format for Network
              Interchange", <a href="./rfc5198">RFC 5198</a>, March 2008.

   [<a id="ref-RFC5234">RFC5234</a>]  Crocker, D. and P. Overell, "Augmented BNF for Syntax
              Specifications: ABNF", STD 68, <a href="./rfc5234">RFC 5234</a>, January 2008.

   [<a id="ref-RFC5321">RFC5321</a>]  Klensin, J., "Simple Mail Transfer Protocol", <a href="./rfc5321">RFC 5321</a>,
              October 2008.

   [<a id="ref-RFC5322">RFC5322</a>]  Resnick, P., Ed., "Internet Message Format", <a href="./rfc5322">RFC 5322</a>,
              October 2008.

   [<a id="ref-RFC6530">RFC6530</a>]  Klensin, J. and Y. Ko, "Overview and Framework for
              Internationalized Email", <a href="./rfc6530">RFC 6530</a>, February 2012.

   [<a id="ref-RFC6531">RFC6531</a>]  Yao, J. and W. Mao, "SMTP Extension for Internationalized
              Email", <a href="./rfc6531">RFC 6531</a>, February 2012.

   [<a id="ref-UNF">UNF</a>]      Davis, M. and K. Whistler, "Unicode Standard Annex #15:
              Unicode Normalization Forms", September 2010,
              &lt;<a href="http://www.unicode.org/reports/tr15/">http://www.unicode.org/reports/tr15/</a>&gt;.

<span class="h3"><a class="selflink" id="section-7.2" href="#section-7.2">7.2</a>.  Informative References</span>

   [<a id="ref-RFC2045">RFC2045</a>]  Freed, N. and N. Borenstein, "Multipurpose Internet Mail
              Extensions (MIME) Part One: Format of Internet Message
              Bodies", <a href="./rfc2045">RFC 2045</a>, November 1996.

   [<a id="ref-RFC2046">RFC2046</a>]  Freed, N. and N. Borenstein, "Multipurpose Internet Mail
              Extensions (MIME) Part Two: Media Types", <a href="./rfc2046">RFC 2046</a>,
              November 1996.

   [<a id="ref-RFC2047">RFC2047</a>]  Moore, K., "MIME (Multipurpose Internet Mail Extensions)
              Part Three: Message Header Extensions for Non-ASCII Text",
              <a href="./rfc2047">RFC 2047</a>, November 1996.



</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-11" ></span>


   [<a id="ref-RFC3156">RFC3156</a>]  Elkins, M., Del Torto, D., Levien, R., and T. Roessler,
              "MIME Security with OpenPGP", <a href="./rfc3156">RFC 3156</a>, August 2001.

   [<a id="ref-RFC5280">RFC5280</a>]  Cooper, D., Santesson, S., Farrell, S., Boeyen, S.,
              Housley, R., and W. Polk, "Internet X.509 Public Key
              Infrastructure Certificate and Certificate Revocation List
              (CRL) Profile", <a href="./rfc5280">RFC 5280</a>, May 2008.

   [<a id="ref-RFC5335">RFC5335</a>]  Yang, A., "Internationalized Email Headers", <a href="./rfc5335">RFC 5335</a>,
              September 2008.

   [<a id="ref-RFC6152">RFC6152</a>]  Klensin, J., Freed, N., Rose, M., and D. Crocker, "SMTP
              Service Extension for 8-bit MIME Transport", STD 71,
              <a href="./rfc6152">RFC 6152</a>, March 2011.

Authors' Addresses

   Abel Yang
   TWNIC
   4F-2, No. 9, Sec 2, Roosevelt Rd.
   Taipei  100
   Taiwan

   Phone: +886 2 23411313 ext 505
   EMail: abelyang@twnic.net.tw


   Shawn Steele
   Microsoft

   EMail: Shawn.Steele@microsoft.com


   Ned Freed
   Oracle
   800 Royal Oaks
   Monrovia, CA  91016-6347
   USA

   EMail: ned+ietf@mrochek.com











</pre>