1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669
|
<pre>Internet Architecture Board (IAB) A. Sullivan
Request for Comments: 6912 Dyn, Inc.
Category: Informational D. Thaler
ISSN: 2070-1721 Microsoft
J. Klensin
O. Kolkman
NLnet Labs
April 2013
<span class="h1">Principles for Unicode Code Point Inclusion in Labels in the DNS</span>
Abstract
Internationalized Domain Names in Applications (IDNA) makes available
to DNS zone administrators a very wide range of Unicode code points.
Most operators of zones should probably not permit registration of
U-labels using the entire range. This is especially true of zones
that accept registrations across organizational boundaries, such as
top-level domains and, most importantly, the root. It is
unfortunately not possible to generate algorithms to determine
whether permitting a code point presents a low risk. This memo
presents a set of principles that can be used to guide the decision
of whether a Unicode code point may be wisely included in the
repertoire of permissible code points in a U-label in a zone.
Status of This Memo
This document is not an Internet Standards Track specification; it is
published for informational purposes.
This document is a product of the Internet Architecture Board (IAB)
and represents information that the IAB has deemed valuable to
provide for permanent record. It represents the consensus of the
Internet Architecture Board (IAB). Documents approved for
publication by the IAB are not a candidate for any level of Internet
Standard; see <a href="./rfc5741#section-2">Section 2 of RFC 5741</a>.
Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
<a href="http://www.rfc-editor.org/info/rfc6912">http://www.rfc-editor.org/info/rfc6912</a>.
<span class="grey">Sullivan, et al. Informational [Page 1]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-2" ></span>
<span class="grey"><a href="./rfc6912">RFC 6912</a> DNS Zone Code Point Principles April 2013</span>
Copyright Notice
Copyright (c) 2013 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to <a href="https://www.rfc-editor.org/bcp/bcp78">BCP 78</a> and the IETF Trust's Legal
Provisions Relating to IETF Documents
(<a href="http://trustee.ietf.org/license-info">http://trustee.ietf.org/license-info</a>) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document.
Table of Contents
<a href="#section-1">1</a>. Introduction . . . . . . . . . . . . . . . . . . . . . . . . <a href="#page-3">3</a>
<a href="#section-1.1">1.1</a>. Terminology . . . . . . . . . . . . . . . . . . . . . . . <a href="#page-3">3</a>
<a href="#section-2">2</a>. Background . . . . . . . . . . . . . . . . . . . . . . . . . <a href="#page-4">4</a>
<a href="#section-2.1">2.1</a>. More-Restrictive Rules Going Up the DNS Tree . . . . . . <a href="#page-6">6</a>
<a href="#section-3">3</a>. Principles Applicable to All Zones . . . . . . . . . . . . . <a href="#page-6">6</a>
<a href="#section-3.1">3.1</a>. Longevity Principle . . . . . . . . . . . . . . . . . . . <a href="#page-6">6</a>
<a href="#section-3.2">3.2</a>. Least Astonishment Principle . . . . . . . . . . . . . . <a href="#page-6">6</a>
<a href="#section-3.3">3.3</a>. Contextual Safety Principle . . . . . . . . . . . . . . . <a href="#page-7">7</a>
<a href="#section-4">4</a>. Principles Applicable to All Public Zones . . . . . . . . . . <a href="#page-7">7</a>
<a href="#section-4.1">4.1</a>. Conservatism Principle . . . . . . . . . . . . . . . . . <a href="#page-7">7</a>
<a href="#section-4.2">4.2</a>. Inclusion Principle . . . . . . . . . . . . . . . . . . . <a href="#page-7">7</a>
<a href="#section-4.3">4.3</a>. Simplicity Principle . . . . . . . . . . . . . . . . . . <a href="#page-7">7</a>
<a href="#section-4.4">4.4</a>. Predictability Principle . . . . . . . . . . . . . . . . <a href="#page-8">8</a>
<a href="#section-4.5">4.5</a>. Stability Principle . . . . . . . . . . . . . . . . . . . <a href="#page-8">8</a>
<a href="#section-5">5</a>. Principle Specific to the Root Zone . . . . . . . . . . . . . <a href="#page-8">8</a>
<a href="#section-5.1">5.1</a>. Letter Principle . . . . . . . . . . . . . . . . . . . . <a href="#page-8">8</a>
<a href="#section-6">6</a>. Confusion and Context . . . . . . . . . . . . . . . . . . . . <a href="#page-9">9</a>
<a href="#section-7">7</a>. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . <a href="#page-9">9</a>
<a href="#section-8">8</a>. Security Considerations . . . . . . . . . . . . . . . . . . . <a href="#page-10">10</a>
<a href="#section-9">9</a>. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . <a href="#page-10">10</a>
<a href="#section-10">10</a>. IAB Members at the Time of Approval . . . . . . . . . . . . . <a href="#page-10">10</a>
<a href="#section-11">11</a>. Informative References . . . . . . . . . . . . . . . . . . . <a href="#page-10">10</a>
<span class="grey">Sullivan, et al. Informational [Page 2]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-3" ></span>
<span class="grey"><a href="./rfc6912">RFC 6912</a> DNS Zone Code Point Principles April 2013</span>
<span class="h2"><a class="selflink" id="section-1" href="#section-1">1</a>. Introduction</span>
Operators of a DNS zone need to set policies around what Unicode code
points are allowed in labels in that zone. Typically there are a
number of important goals to consider when constructing such
policies. These include, for instance, avoiding possible visual
confusability between two labels, avoiding possible confusion between
Fully Qualified Domain Names (FQDNs) and IP address literals,
accessibility to the disabled (see "Web Content Accessibility
Guidelines (WCAG) 2.0" [<a href="#ref-WCAG20" title=""Web Content Accessibility Guidelines (WCAG) 2.0"">WCAG20</a>] for some discussion in a web
context), and other usability issues.
This document provides a set of principles that zone operators can
use to construct their code point policies in order to improve
usability and clarity and thereby reduce confusion.
<span class="h3"><a class="selflink" id="section-1.1" href="#section-1.1">1.1</a>. Terminology</span>
This document uses the following terms.
A-label: an LDH label that starts with "xn--" and meets all the
IDNA requirements, with additional restrictions as explained in
<a href="#section-2.3.2.1">Section 2.3.2.1</a> of the IDNA Definitions document [<a href="./rfc5890" title=""Internationalized Domain Names for Applications (IDNA): Definitions and Document Framework"">RFC5890</a>].
Character: a member of a set of elements used for the
organization, control, or representation of data. See <a href="#section-2">Section 2</a>
of the Internationalization Terminology document [<a href="./rfc6365" title=""Terminology Used in Internationalization in the IETF"">RFC6365</a>] for
more details.
Language: a way that humans communicate. The use of language
occurs in many forms, the most common of which are speech,
writing, and signing. See <a href="./rfc6365#section-2">Section 2 of RFC 6365</a> for more details.
LDH label: a string consisting of ASCII letters, digits, and the
hyphen, with additional restrictions as explained in <a href="./rfc5890#section-2.3.1">Section 2.3.1
of RFC 5890</a>.
Public zone: in this document, a DNS zone that accepts
registration requests from organizations outside the zone
administrator's own organization. (Whether the zone performs
delegation is a separate question. What is important is the
diversity of the registration-requesting community.) Note that
under this definition, the root zone is a public zone, though one
that has a unique function in the DNS.
Rendering: the display of a string of text. See Section 5 of <a href="./rfc6365">RFC</a>
<a href="./rfc6365">6365</a> for more details.
<span class="grey">Sullivan, et al. Informational [Page 3]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-4" ></span>
<span class="grey"><a href="./rfc6912">RFC 6912</a> DNS Zone Code Point Principles April 2013</span>
Script: a set of graphic characters used for the written form of
one or more languages. See <a href="./rfc6365#section-2">Section 2 of RFC 6365</a> for more
details.
U-label: a string of Unicode characters that meets all the IDNA
requirements and includes at least one non-ASCII character, with
additional restrictions as explained in Section 2.3.2.1 of <a href="./rfc5890">RFC</a>
<a href="./rfc5890">5890</a>.
Writing system: a set of rules for using one or more scripts to
write a particular language. See <a href="./rfc6365#section-2">Section 2 of RFC 6365</a> for more
details.
This memo does not propose a protocol standard, and the use of words
such as "should" follow the ordinary English meaning, and not that
laid out in [<a href="./rfc2119" title=""Key words for use in RFCs to Indicate Requirement Levels"">RFC2119</a>].
<span class="h2"><a class="selflink" id="section-2" href="#section-2">2</a>. Background</span>
In recent communications [<a href="#ref-IABCOMM1" title=""IAB Statement: 'The interpretation of rules in the ICANN gTLD Applicant Guidebook'"">IABCOMM1</a>] [<a href="#ref-IABCOMM2" title=""Response to ICANN questions concerning 'The interpretation of rules in the ICANN gTLD Applicant Guidebook'"">IABCOMM2</a>], the IAB has
emphasized the importance of conservatism in allocating labels
conforming to IDNA2008 [<a href="./rfc5890" title=""Internationalized Domain Names for Applications (IDNA): Definitions and Document Framework"">RFC5890</a>] [<a href="./rfc5891" title=""Internationalized Domain Names in Applications (IDNA): Protocol"">RFC5891</a>] [<a href="./rfc5892" title=""The Unicode Code Points and Internationalized Domain Names for Applications (IDNA)"">RFC5892</a>] [<a href="./rfc5893" title=""Right-to-Left Scripts for Internationalized Domain Names for Applications (IDNA)"">RFC5893</a>]
[<a href="./rfc5894" title=""Internationalized Domain Names for Applications (IDNA): Background, Explanation, and Rationale"">RFC5894</a>] [<a href="./rfc5895" title=""Mapping Characters for Internationalized Domain Names in Applications (IDNA) 2008"">RFC5895</a>] in DNS zones, and especially in the root zone.
Traditional LDH labels in the root zone used only alphabetic
characters (i.e., ASCII a-z, which under the DNS also match A-Z).
Matters are more complicated with U-labels, however. The IAB
communications recommended that U-labels permit only code points with
a General_Category (gc) of Ll (Lowercase_Letter), Lo (Other_Letter),
or Lm (Modifier_Letter), but noted that for practical considerations
other code points might be permitted on a case-by-case basis.
The IAB recommendations do, however, leave some issues open that need
to be addressed. It is not clear that all code points permitted
under IDNA2008 that have a General_Category of Lo or Lm are
appropriate for a zone such as the root zone. To take but one
example, the code point U+02BC (MODIFIER LETTER APOSTROPHE) has a
General_Category of Lm. In practically every rendering (and we are
unaware of an exception), U+02BC is indistinguishable from U+2019
(RIGHT SINGLE QUOTATION MARK), which has a General_Category of Pf
(Final_Punctuation). U+02BC will also be read by large numbers of
people as being the same character as U+0027 (APOSTROPHE), which has
a General_Category of Po (Other_Punctuation), and some computer
systems may treat U+02BC as U+0027. U+02BC is PROTOCOL VALID
(PVALID) under IDNA2008 (see the IDNA Code Points document
[<a href="./rfc5892" title=""The Unicode Code Points and Internationalized Domain Names for Applications (IDNA)"">RFC5892</a>]), whereas both other code points are DISALLOWED. So, to
begin with, it is plain that not every code point with a
<span class="grey">Sullivan, et al. Informational [Page 4]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-5" ></span>
<span class="grey"><a href="./rfc6912">RFC 6912</a> DNS Zone Code Point Principles April 2013</span>
General_Category of Ll, Lo, or Lm is consistent with the type of
conservatism principle discussed in <a href="#section-4.1">Section 4.1</a> below or the previous
IAB recommendations.
To make matters worse, some languages are dependent on code points
with General_Category Mc (Spacing_Mark) or General_Category Mn
(Nonspacing_Mark). This dependency is particularly common in Indic
languages, though not exclusive to them. (At the risk of vastly
oversimplifying, the overarching issue is mostly the interaction of
complex writing systems and the way Unicode works.) To restrict
users of those languages to only code points with General_Category of
Ll, Lo, or Lm would be extremely limiting. While DNS labels are not
words, or sentences, or phrases (as noted in the next steps for IDN
[<a href="./rfc4690" title=""Review and Recommendations for Internationalized Domain Names (IDNs)"">RFC4690</a>]), they are intended to support useful mnemonics. Mnemonics
that diverge wildly from the usual conventions are poor ones, because
in not following the usual conventions they are not easy to remember.
Also, wide divergence from usual conventions, if not well-justified
(and especially in a shared namespace like the root), invites
political controversy.
Many of the issues above turn out to be relevant to all public zones.
Moreover, the overall issue of developing a policy for code point
permission is common to all zones that accept A-labels or U-labels
for registration. As <a href="#section-4.3">Section 4.3</a> of the IDNA Protocol document
[<a href="./rfc5891" title=""Internationalized Domain Names in Applications (IDNA): Protocol"">RFC5891</a>] says, every registry at every level of the DNS is "expected
to establish policies about label registrations".
For reasons of sound management, it is not desirable to decide
whether to permit a given code point only when an application
containing that code point is pending. That approach reduces
predictability and is bound to appear subject to special pleas. It
is better instead to produce the rules governing acceptance of code
points in advance.
As is evident from the foregoing discussion about the Letter and Mark
categories, it is simply not possible to make code point decisions
algorithmically. If it were possible to develop such an algorithm,
it would already exist: the DNS is hardly unique in needing to impose
restrictions on code points while accommodating many different
linguistic communities. Nevertheless, new guidelines can be made by
starting from overarching principles. These guidelines act more as
meta-rules, leading to the establishment of other rules about the
inclusion and exclusion of particular code points in labels in a
given zone, always based on the list of code points permitted by
IDNA.
<span class="grey">Sullivan, et al. Informational [Page 5]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-6" ></span>
<span class="grey"><a href="./rfc6912">RFC 6912</a> DNS Zone Code Point Principles April 2013</span>
<span class="h3"><a class="selflink" id="section-2.1" href="#section-2.1">2.1</a>. More-Restrictive Rules Going Up the DNS Tree</span>
A set of principles derived from the above ideas follows in Sections
3 through 5 below. Such principles fall into three categories. Some
principles apply to every DNS zone. Some additional principles apply
to all public zones, including the root zone. Finally, other
principles apply only to the root zone. This means that zones higher
in the DNS tree tend to have more restrictive rules (since additional
principles apply), and zones lower in the DNS tree tend to have less
restrictive rules, since they are used within a more narrow context.
In general, the relevant context for a principle is that of the zone,
not that of a given subset of the user community; for the root zone,
for example, the context is "the entire Internet population".
<span class="h2"><a class="selflink" id="section-3" href="#section-3">3</a>. Principles Applicable to All Zones</span>
<span class="h3"><a class="selflink" id="section-3.1" href="#section-3.1">3.1</a>. Longevity Principle</span>
Unicode properties of a code point ought to be stable across the
versions of Unicode that users of the zone are likely to have
installed. Because it is possible for the properties of a code point
to change between Unicode versions, a good way to predict such
stability is to ensure that a code point has in fact been stable for
multiple successive versions of Unicode. This principle is related
to the Stability Principle in <a href="#section-4.5">Section 4.5</a>.
The more diverse the community using the zone, the greater the
importance of following this principle. The policy for a leaf zone
in the DNS might only require stability across two Unicode versions,
whereas a more public zone might require stability across four or
more releases before the code point's properties are considered long-
lived and stable.
<span class="h3"><a class="selflink" id="section-3.2" href="#section-3.2">3.2</a>. Least Astonishment Principle</span>
Every zone administrator should be sensitive to the likely use of a
code point to be permitted, particularly taking into account the
population likely to use the zone. Zone administrators should
especially consider whether a candidate code point could present
difficulty if the code point is encountered outside the usual
linguistic circumstances. By the same token, the failure to support
a code point that is normal in some linguistic circumstances could be
very surprising for users likely to encounter the names in that
circumstance.
<span class="grey">Sullivan, et al. Informational [Page 6]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-7" ></span>
<span class="grey"><a href="./rfc6912">RFC 6912</a> DNS Zone Code Point Principles April 2013</span>
<span class="h3"><a class="selflink" id="section-3.3" href="#section-3.3">3.3</a>. Contextual Safety Principle</span>
Every zone administrator should be sensitive to ways in which a code
point that is permitted could be used in support of malicious
activity. This is not a completely new problem: the digit 1 and the
lowercase letter l are, for instance, easily confused in many
contexts. The very large repertoire of code points in Unicode (even
just the subset permitted for IDNs) makes the problem somewhat worse,
just because of the scale.
<span class="h2"><a class="selflink" id="section-4" href="#section-4">4</a>. Principles Applicable to All Public Zones</span>
<span class="h3"><a class="selflink" id="section-4.1" href="#section-4.1">4.1</a>. Conservatism Principle</span>
Public zones are, by definition, zones that are shared by different
groups of people. Therefore, any decision to permit a code point in
a public zone (including the root) should be as conservative as
practicable. Doubts should always be resolved in favor of rejecting
a code point for inclusion rather than in favor of including it, in
order to minimize risk.
<span class="h3"><a class="selflink" id="section-4.2" href="#section-4.2">4.2</a>. Inclusion Principle</span>
Just as IDNA2008 starts from the principle that the Unicode range is
excluded, and then adds code points according to derived properties
of the code points, so a public zone should only permit inclusion of
a code point if it is known to be "safe" in terms of usability and
confusability within the context of that zone. The default treatment
of a code point should be that it is excluded.
<span class="h3"><a class="selflink" id="section-4.3" href="#section-4.3">4.3</a>. Simplicity Principle</span>
The rules for determining whether a code point is to be included
should be simple enough that they are readily understood by someone
with a moderate background in the DNS and Unicode issues. This
principle does not mean that a completely naive person needs to be
able to understand the rationale for including a code point, but it
does mean that if the reason for inclusion of a very peculiar code
point, even a safe one, is too difficult to understand, the code
point would not be permitted.
The meaning of "simple" or "readily understood" is context-dependent.
For instance, the root zone has to serve everyone in the world; for
practical purposes, this means that the reasons for including a code
point need to be comprehensible even to people who cannot use the
script where the code point is found. In a zone that permits a
constrained subset of Unicode characters (for instance, only those
needed to write a single alphabetic language) and that supports a
<span class="grey">Sullivan, et al. Informational [Page 7]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-8" ></span>
<span class="grey"><a href="./rfc6912">RFC 6912</a> DNS Zone Code Point Principles April 2013</span>
clearly delineated linguistic community (for instance, the speakers
of a single language with well-understood written conventions), more
complicated rules might be acceptable. Compare this principle with
the Least Astonishment Principle in <a href="#section-3.2">Section 3.2</a>.
<span class="h3"><a class="selflink" id="section-4.4" href="#section-4.4">4.4</a>. Predictability Principle</span>
The rules for determining whether a code point is to be included
should be predictable enough that those with the requisite
understanding of DNS, IDNA, and Unicode will usually reach the same
conclusion. This is not a requirement for algorithmic treatment of
code points; as previously noted, that is not possible. Rather, it
is to say that the consistent application of professional judgment is
likely to yield the same results; combined with the principle in
<a href="#section-4.1">Section 4.1</a>, when results are not predictable, the anomalous code
point would not be permitted.
Just as in <a href="#section-4.3">Section 4.3</a>, this principle tends to cause more
restriction the more diverse the community using the zone; it is most
restrictive for the root zone. This is because what is predictable
within a given language community is possibly very surprising across
languages.
<span class="h3"><a class="selflink" id="section-4.5" href="#section-4.5">4.5</a>. Stability Principle</span>
Once a code point is permitted, it is at least very hard to stop
permitting that code point. In public zones (including the root),
the list of code points to be permitted should change very slowly, if
at all, and usually only in the direction of permitting an addition
as time and experience indicate that inclusion of such a code point
is both safe and consistent with these principles.
<span class="h2"><a class="selflink" id="section-5" href="#section-5">5</a>. Principle Specific to the Root Zone</span>
<span class="h3"><a class="selflink" id="section-5.1" href="#section-5.1">5.1</a>. Letter Principle</span>
"Requirements for Internet Hosts - Application and Support" [<a href="./rfc1123" title=""Requirements for Internet Hosts - Application and Support"">RFC1123</a>]
notes that top-level labels "will be alphabetic". In the absence of
widespread agreement about the force of that note, prudence suggests
that U-labels in the root zone should exclude code points that are
not normally used to write words, or that are in some cases normally
used for purposes other than writing words. This is not the same as
using Unicode's General_Category to include only letters. It is a
restriction that expands the possible class of included code points
beyond the Unicode letters, but only expands so far as to include the
things that are normally used to create words. Under this principle,
code points with (for example) General_Category Mn (Nonspacing_Mark)
might be included -- but only those that are used to write words and
<span class="grey">Sullivan, et al. Informational [Page 8]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-9" ></span>
<span class="grey"><a href="./rfc6912">RFC 6912</a> DNS Zone Code Point Principles April 2013</span>
not (for instance) musical symbols. In addition, such marks should
only be used within a label in ways that they would be used when
making a word: combinations that would be nonsense when used in a
word should also be rejected when tried in DNS labels. This
principle should be applied as narrowly as possible; as the next
steps for IDN document [<a href="./rfc4690" title=""Review and Recommendations for Internationalized Domain Names (IDNs)"">RFC4690</a>] says, "While DNS labels may
conveniently be used to express words in many circumstances, the goal
is not to express words (or sentences or phrases), but to permit the
creation of unambiguous labels with good mnemonic value".
<span class="h2"><a class="selflink" id="section-6" href="#section-6">6</a>. Confusion and Context</span>
While many discussions of confusion have focused on characters, e.g.,
whether two characters are confusable with each other (and under what
circumstances), a focus on characters alone could lead to the
prohibition of very large numbers of labels, including many that
present little risk. Instead, the focus should be on whether one
label is confusable with another. For example, if a label contains
several characters that are distinct to a particular script, and all
of its characters are from that script, it is inherently not
confusable with a label from any other script no matter what other
characters might appear in it. Another label that lacks those
distinguishing characters might be a problem. The notion extends
from labels to domain names, in the sense that distinguishing
characters used in a higher-level label may set expectations with
respect to the characters in the lower-level labels. This
expectation might be regarded as a benefit, but it is also a problem,
since there is no technical way to require consistent policies in
delegated namespaces.
<span class="h2"><a class="selflink" id="section-7" href="#section-7">7</a>. Conclusion</span>
The principles outlined in this document can be applied when
considering any range of Unicode code points for possible inclusion
in a DNS zone. It is worth observing that doing anything (especially
in light of <a href="#section-4.5">Section 4.5</a>) implicitly disadvantages communities with a
writing system not yet well understood and not represented in the
technical and policy communities involved in the discussion. That
disadvantage is to be guarded against as much as practical, but is
effectively impossible to prevent (while still taking action) in
light of imperfect human knowledge.
<span class="grey">Sullivan, et al. Informational [Page 9]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-10" ></span>
<span class="grey"><a href="./rfc6912">RFC 6912</a> DNS Zone Code Point Principles April 2013</span>
<span class="h2"><a class="selflink" id="section-8" href="#section-8">8</a>. Security Considerations</span>
The principles outlined in this memo are intended to improve
usability and clarity and thereby reduce confusion among different
labels. While these principles may contribute to reduction of risk,
they are not sufficient to provide a comprehensive
internationalization policy for zone management.
Additional discussion of security considerations can be found in the
Unicode Security Considerations [<a href="#ref-UTR36" title=""Unicode Security Considerations"">UTR36</a>].
<span class="h2"><a class="selflink" id="section-9" href="#section-9">9</a>. Acknowledgements</span>
The authors thank the participants in the IAB Internationalization
program for the discussion of the ideas in this memo, particularly
Marc Blanchet. In addition, Stephane Bortzmeyer, Paul Hoffman,
Daniel Kalchev, Panagiotis Papaspiliopoulos, and Vaggelis Segredakis
made specific comments.
<span class="h2"><a class="selflink" id="section-10" href="#section-10">10</a>. IAB Members at the Time of Approval</span>
Bernard Aboba
Jari Arkko
Marc Blanchet
Ross Callon
Alissa Cooper
Spencer Dawkins
Joel Halpern
Russ Housley
David Kessens
Danny McPherson
Jon Peterson
Dave Thaler
Hannes Tschofenig
<span class="h2"><a class="selflink" id="section-11" href="#section-11">11</a>. Informative References</span>
[<a id="ref-IABCOMM1">IABCOMM1</a>] Internet Architecture Board, "IAB Statement: 'The
interpretation of rules in the ICANN gTLD Applicant
Guidebook'", February 2012, <<a href="http://www.iab.org/documents/correspondence-reports-documents/201/">http://www.iab.org/</a>
<a href="http://www.iab.org/documents/correspondence-reports-documents/201/">documents/correspondence-reports-documents/201/</a>>.
[<a id="ref-IABCOMM2">IABCOMM2</a>] Internet Architecture Board, "Response to ICANN questions
concerning 'The interpretation of rules in the ICANN gTLD
Applicant Guidebook'", March 2012, <<a href="http://www.iab.org/documents/correspondence-reports-documents/201/">http://www.iab.org/</a>
<a href="http://www.iab.org/documents/correspondence-reports-documents/201/">documents/correspondence-reports-documents/201/</a>>.
<span class="grey">Sullivan, et al. Informational [Page 10]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-11" ></span>
<span class="grey"><a href="./rfc6912">RFC 6912</a> DNS Zone Code Point Principles April 2013</span>
[<a id="ref-RFC1123">RFC1123</a>] Braden, R., "Requirements for Internet Hosts - Application
and Support", STD 3, <a href="./rfc1123">RFC 1123</a>, October 1989.
[<a id="ref-RFC2119">RFC2119</a>] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", <a href="https://www.rfc-editor.org/bcp/bcp14">BCP 14</a>, <a href="./rfc2119">RFC 2119</a>, March 1997.
[<a id="ref-RFC4690">RFC4690</a>] Klensin, J., Faltstrom, P., Karp, C., and IAB, "Review and
Recommendations for Internationalized Domain Names
(IDNs)", <a href="./rfc4690">RFC 4690</a>, September 2006.
[<a id="ref-RFC5890">RFC5890</a>] Klensin, J., "Internationalized Domain Names for
Applications (IDNA): Definitions and Document Framework",
<a href="./rfc5890">RFC 5890</a>, August 2010.
[<a id="ref-RFC5891">RFC5891</a>] Klensin, J., "Internationalized Domain Names in
Applications (IDNA): Protocol", <a href="./rfc5891">RFC 5891</a>, August 2010.
[<a id="ref-RFC5892">RFC5892</a>] Faltstrom, P., "The Unicode Code Points and
Internationalized Domain Names for Applications (IDNA)",
<a href="./rfc5892">RFC 5892</a>, August 2010.
[<a id="ref-RFC5893">RFC5893</a>] Alvestrand, H. and C. Karp, "Right-to-Left Scripts for
Internationalized Domain Names for Applications (IDNA)",
<a href="./rfc5893">RFC 5893</a>, August 2010.
[<a id="ref-RFC5894">RFC5894</a>] Klensin, J., "Internationalized Domain Names for
Applications (IDNA): Background, Explanation, and
Rationale", <a href="./rfc5894">RFC 5894</a>, August 2010.
[<a id="ref-RFC5895">RFC5895</a>] Resnick, P. and P. Hoffman, "Mapping Characters for
Internationalized Domain Names in Applications (IDNA)
2008", <a href="./rfc5895">RFC 5895</a>, September 2010.
[<a id="ref-RFC6365">RFC6365</a>] Hoffman, P. and J. Klensin, "Terminology Used in
Internationalization in the IETF", <a href="https://www.rfc-editor.org/bcp/bcp166">BCP 166</a>, <a href="./rfc6365">RFC 6365</a>,
September 2011.
[<a id="ref-UTR36">UTR36</a>] Davis, M. and M. Suignard, "Unicode Security
Considerations", Unicode Technical Report #36, July 2012.
[<a id="ref-WCAG20">WCAG20</a>] W3C, "Web Content Accessibility Guidelines (WCAG) 2.0",
W3C Recommendation, December 2008,
<<a href="http://www.w3.org/TR/2008/REC-WCAG20-20081211/">http://www.w3.org/TR/2008/REC-WCAG20-20081211/</a>>.
<span class="grey">Sullivan, et al. Informational [Page 11]</span></pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span id="page-12" ></span>
<span class="grey"><a href="./rfc6912">RFC 6912</a> DNS Zone Code Point Principles April 2013</span>
Authors' Addresses
Andrew Sullivan
Dyn, Inc.
150 Dow St
Manchester, NH 03101
USA
EMail: asullivan@dyn.com
Dave Thaler
Microsoft
One Microsoft Way
Redmond, WA 98052
USA
EMail: dthaler@microsoft.com
John C Klensin
1770 Massachusetts Ave, Ste 322
Cambridge, MA 02140
USA
Phone: +1 617 491 5735
EMail: john-ietf@jck.com
Olaf Kolkman
NLnet Labs
Science Park 400
Amsterdam 1098 XH
The Netherlands
EMail: olaf@NLnetLabs.nl
Sullivan, et al. Informational [Page 12]
</pre>
|