Network Working Group                                       L. Masinter
Request for Comments: 2718                            Xerox Corporation
Category: Informational                                   H. Alvestrand
                                                   Maxware, Pirsenteret
                                                             D. Zigmond
                                                   WebTV Networks, Inc.
                                                               R. Petke
                                                     UUNET Technologies
                                                          November 1999


                     Guidelines for new URL Schemes

Status of this Memo

   This memo provides information for the Internet community.  It does
   not specify an Internet standard of any kind.  Distribution of this
   memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (1999).  All Rights Reserved.

Abstract

   A Uniform Resource Locator (URL) is a compact string representation
   of the location for a resource that is available via the Internet.
   This document provides guidelines for the definition of new URL
   schemes.

1. Introduction

   A Uniform Resource Locator (URL) is a compact string representation
   of the location for a resource that is available via the Internet.
   RFC 2396 [1] defines the general syntax and semantics of URIs, and,
   by inclusion, URLs.  URLs are designated by including a "<scheme>:"
   and then a "<scheme-specific-part>".  Many URL schemes are already
   defined.

   This document provides guidelines for the definition of new URL
   schemes, for consideration by those who are defining and registering
   or evaluating those definitions.

   The process by which new URL schemes are registered is defined in RFC
   2717 [2].






Masinter, et al.             Informational                      [Page 1]

RFC 2718             Guidelines for new URL Schemes        November 1999


2. Guidelines for new URL schemes

   Because new URL schemes potentially complicate client software, new
   schemes must have demonstrable utility and operability, as well as
   compatibility with existing URL schemes.  This section elaborates
   these criteria.

2.1 Syntactic compatibility

   New URL schemes should follow the same syntactic conventions as
   existing schemes when appropriate.  If content accessed through one
   URI scheme contains embedded links, and that scheme does not share
   its syntax with a different scheme, the same content cannot be served
   up under both schemes without rewriting the content.  This can
   already be a problem, and with future digital signature schemes,
   rewriting may not even be possible.  Deployment of other schemes in
   the future could therefore become extremely difficult.

2.1.1 Motivations for syntactic compatibility

   Why should new URL schemes share as much of the generic URI syntax
   (that makes sense to share) as possible?  Consider the following:

   o  If fragment syntax isn't shared between two schemes (e.g. "<a
      href="#foo">"), you can't move individual, completely
      self-referential documents between schemes without rewriting the
      embedded references within the document.  In the Web, the fragment
      syntax is a property of the media type, and is evaluated by the
      client.

   o  If fragment syntax is not shared between different media types of
      the same capability (e.g. HTML, XML, Word, or image types such as
      GIF, JPEG, PNG), then you can't have a URI reference that can
      evolve to superior media types as they become available, or that
      is even likely to work properly today with content negotiation.

   o  If relative syntax (to the extent of understanding that the URI
      is relative, and what part of the URI string is relative) isn't
      shared between two schemes (e.g. "<a href="foo">"), you can't
      move sets of documents that are internally self-referential
      between schemes without rewriting the embedded URIs.

   o  If the ".." syntax as a path component in relative URIs isn't
      shared between schemes, you can't easily have sets of document
      sets and refer to them between schemes without rewriting the
      embedded references.





   o  If the "/" syntax (to the extent of understanding that the URI
      refers to a path relative to the current naming authority, see
      section 2.1.2) isn't shared, you can't easily move multiple sets
      of documents up or down in a relative hierarchy of names while
      they share a common set of documents, whether within that scheme
      or between schemes, without rewriting the content.  The best
      example is a site that has a common set of GIF, JPEG and PNG
      images, where you want to reorganize the site by changing the
      depth of a subtree, or by moving it to a directory at a different
      depth.

   o  If naming authority syntax (e.g. what comes after "//" in most URL
      schemes, see section 2.1.2) and relative path syntax (to the
      extent of understanding that the URI has a naming authority, and
      what part of the URI string is the naming authority vs. path)
      aren't shared between two schemes, you can't share identical name
      spaces and serve them up via different schemes.  (The naming
      authority syntax is a property of the scheme.)  The fact that HTTP
      and FTP have the same syntax, for example, has often been
      exploited by sites transitioning from FTP archive service to HTTP
      archive service so that the URLs can be identical between schemes
      except for the scheme; the same content can be served via two
      schemes simultaneously.
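
   As a rough illustration of the points above, the following Python
   sketch uses urllib.parse.urljoin from the standard library to resolve
   the same embedded references against an "http" base and an "ftp"
   base.  The host name and paths are invented for the example.  Because
   both schemes share the generic syntax, the resolved URLs should
   differ only in the scheme, while a base without hierarchical syntax
   gives the resolver nothing to work with.

```python
from urllib.parse import urljoin

# Hypothetical document locations; only the scheme differs.
HTTP_BASE = "http://example.org/pub/docs/index.html"
FTP_BASE = "ftp://example.org/pub/docs/index.html"

# Embedded references as they might appear inside the document:
# a fragment, a sibling, a ".." reference, and a "/" reference.
REFERENCES = ["#foo", "foo", "../images/logo.gif", "/images/logo.gif"]


def resolve_all(base):
    """Resolve every embedded reference against the given base URL."""
    return [urljoin(base, ref) for ref in REFERENCES]


http_links = resolve_all(HTTP_BASE)
ftp_links = resolve_all(FTP_BASE)

for http_link, ftp_link in zip(http_links, ftp_links):
    print(http_link, "|", ftp_link)

# A non-hierarchical base offers nothing to resolve against, so
# urljoin hands the reference back unchanged.
opaque = urljoin("mailto:user@example.org", "foo")
print(opaque)
```

   The document itself never has to be rewritten here; only the base
   URL under which it is served changes.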

2.1.2 Improper use of "//" following "<scheme>:"

   Contrary to some examples set in past years, the use of double
   slashes as the first component of the <scheme-specific-part> of a URL
   is not simply an artistic indicator that what follows is a URL:
   Double slashes are used ONLY when the syntax of the URL's <scheme-
   specific-part> contains a hierarchical structure as described in RFC
   2396.  In URLs from such schemes, the use of double slashes indicates
   that what follows is the top hierarchical element for a naming
   authority.  (See section 3 of RFC 2396 for more details.)  URL
   schemes which do not contain a conformant hierarchical structure in
   their <scheme-specific-part> should not use double slashes following
   the "<scheme>:" string.
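
   A generic parser makes the distinction visible.  The sketch below
   uses Python's urllib.parse.urlsplit on two invented example URLs: one
   from a hierarchical scheme, where the text after "//" is taken as the
   naming authority, and one from a scheme that has no naming authority
   and therefore no double slashes.

```python
from urllib.parse import urlsplit

# Hierarchical scheme: "//" introduces the naming authority.
hierarchical = urlsplit("http://example.org/pub/docs/index.html")

# Non-hierarchical scheme: no "//", so no naming authority.
opaque = urlsplit("mailto:user@example.org")

print(hierarchical.netloc)   # the naming authority
print(hierarchical.path)     # the hierarchical path below it
print(repr(opaque.netloc))   # empty string
print(opaque.path)           # the whole scheme-specific part
```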

2.1.3 Compatibility with relative URLs

   URL schemes should use the generic URL syntax if they are intended to
   be used with relative URLs.  A description of the allowed relative
   forms should be included in the scheme's definition.  Many
   applications use relative URLs extensively.  Specifically,

   o  Can the scheme be parsed according to RFC 2396 - for example, if
      the tokens "//", "/", ";", or "?" are used, do they have the
      meaning given in RFC 2396?



   o  Does it make sense to use the scheme in relative URLs like those
      RFC 2396 specifies?

   o  If the scheme syntax is designed to be broken into pieces, does
      the documentation for the scheme's syntax specify what those
      pieces are, why it should be broken in this way, and why the
      breaks aren't where RFC 2396 says that they usually should be?

   o  If the scheme has a hierarchy, does it go left-to-right and with
      slash separators like RFC 2396?
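
   One informal way to approach these questions is to run sample URLs
   from a proposed scheme through a generic parser and see whether the
   pieces come out where the scheme's definition says they should.  The
   sketch below does this with Python's urllib.parse.urlparse for an
   invented "http" URL; the URL and the variable names are assumptions
   made for the example.

```python
from urllib.parse import urlparse

# Invented URL that uses each of the tokens "//", "/", ";" and "?".
parts = urlparse("http://example.org/a/b/c;v=1?x=1&y=2#frag")

# The hierarchy reads left to right, separated by slashes.
segments = parts.path.split("/")[1:]

print(parts.scheme)    # text before the first ":"
print(parts.netloc)    # naming authority after "//"
print(segments)        # path hierarchy
print(parts.params)    # parameters after ";" on the last segment
print(parts.query)     # query after "?"
print(parts.fragment)  # fragment after "#"
```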

2.2 Is the scheme well defined?

      It is important that the semantics of the "resource" that a URL
      "locates" be well defined.  This might mean different things
      depending on the nature of the URL scheme.

2.2.1 Clear mapping from other name spaces

      In many cases, new URL schemes are defined as ways to translate
      other protocols and name spaces into the general framework of
      URLs.  The "ftp" URL scheme translates from the FTP protocol,
      while the "mid" URL scheme translates from the Message-ID field of
      messages.

      In either case, the description of the mapping must be complete,
      must describe how characters get encoded or not in URLs, must
      describe exactly how all legal values of the base standard can be
      represented using the URL scheme, and exactly which modifiers,
      alternate forms and other artifacts from the base standards are
      included or not included.  These requirements are elaborated
      below.

2.2.2 URL schemes associated with network protocols

      Most new URL schemes are associated with network resources that
      have one or several network protocols that can access them.  The
      'ftp', 'news', and 'http' schemes are of this nature.  For such
      schemes, the specification should completely describe how URLs are
      translated into protocol actions in sufficient detail to make the
      access of the network resource unambiguous.  If an implementation
      of the URL scheme requires some configuration, the configuration
      elements must be clearly identified.  (For example, the 'news'
      scheme, if implemented using NNTP, requires configuration of the
      NNTP server.)
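
      To give a feel for the level of detail meant here, the following
      Python sketch turns an "ftp" URL into an ordered list of protocol
      steps.  It is loosely modeled on the way the existing "ftp" scheme
      maps path segments onto CWD and RETR commands.  The function name,
      the "connect" pseudo-step, the anonymous default and the example
      host are assumptions of the sketch, and it ignores passwords,
      transfer types and directory listings.

```python
from urllib.parse import urlsplit, unquote


def ftp_actions(url):
    """Return an illustrative list of steps for retrieving an ftp URL."""
    parts = urlsplit(url)
    if parts.scheme != "ftp":
        raise ValueError("expected an ftp URL")

    host = parts.hostname
    port = parts.port or 21  # default FTP control port
    user = unquote(parts.username) if parts.username else "anonymous"

    # The first "/" only separates the authority from the path.
    path = parts.path[1:] if parts.path.startswith("/") else parts.path
    *directories, name = [unquote(seg) for seg in path.split("/")]

    steps = [f"connect {host}:{port}", f"USER {user}"]
    steps += [f"CWD {directory}" for directory in directories]
    steps.append(f"RETR {name}")
    return steps


for step in ftp_actions("ftp://ftp.example.org/pub/docs/rfc2718.txt"):
    print(step)
```

      A scheme specification would need to state each of these choices
      explicitly (default port, default user, how segments are decoded)
      rather than leave them to the implementer.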






2.2.3 Definition of non-protocol URL schemes

      In some cases, URL schemes do not have particular network
      protocols associated with them, because their use is limited to
      contexts where the access method is understood.  This is the case,
      for example, with the "cid" and "mid" URL schemes.  For these URL
      schemes, the specification should describe the notation of the
      scheme and a complete mapping of the locator from its source.

2.2.4 Definition of URL schemes not associated with data resources

      Most URL schemes locate Internet resources that correspond to data
      objects that can be retrieved or modified.  This is the case with
      "ftp" and "http", for example.  However, some URL schemes do not;
      for example, the "mailto" URL scheme corresponds to an Internet
      mail address.

      If a new URL scheme does not locate resources that are data
      objects, the properties of names in the new space must be clearly
      defined.

2.2.5 Character encoding

      When describing URL schemes in which (some of) the elements of the
      URL are actually representations of sequences of characters, care
      should be taken not to introduce unnecessary variety in the ways
      in which characters are encoded into octets and then into URL
      characters.  Unless there is some compelling reason for a
      particular scheme to do otherwise, translating character sequences
      into UTF-8 (RFC 2279) [3] and then subsequently using the %HH
      encoding for unsafe octets is recommended.
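
      The recommended two-step translation can be sketched in Python
      with urllib.parse.quote, which encodes text as UTF-8 by default
      and then writes each unsafe octet as %HH.  The helper name and the
      sample string are assumptions made for the example.

```python
from urllib.parse import quote, unquote


def encode_component(text):
    """Encode text as UTF-8, then %HH-escape every unsafe octet."""
    # safe="" means that "/" is escaped as well, so the result can be
    # used as a single URL component.
    return quote(text, safe="")


encoded = encode_component("café au lait")
decoded = unquote(encoded)

print(encoded)
print(decoded)
```

      Here the "é" becomes the two UTF-8 octets C3 A9 and each space
      becomes %20, giving "caf%C3%A9%20au%20lait".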

2.2.6 Definition of operations

      In some contexts (for example, HTML forms) it is possible to
      specify any one of a list of operations to be performed on a
      specific URL.  (Outside forms, it is generally assumed to be
      something you GET.)

      The URL scheme definition should describe all well-defined
      operations on the URL identifier, and what they are supposed to
      do.

      Some URL schemes (for example, "telnet") provide location
      information for hooking onto bi-directional data streams, and
      don't fit the "infoaccess" paradigm of most URLs very well; this
      should be documented.




      NOTE: It is perfectly valid to say that "no operation apart from
      GET is defined for this URL".  It is also valid to say that
      "there's only one operation defined for this URL, and it's not
      very GET-like".  The important point is that what is defined on
      this type is described.

2.3 Demonstrated utility

      URL schemes should have demonstrated utility.  New URL schemes are
      expensive things to support.  Often they require special code in
      browsers, proxies, and/or servers.  Having a lot of ways to say
      the same thing needlessly complicates these programs without adding
      value to the Internet.

      The kinds of things that are useful include:

   o  Things that cannot be referred to in any other way.

   o  Things where it is much easier to get at them using this scheme
      than (for instance) a proxy gateway.

2.3.1 Proxy into HTTP/HTML

   One way to provide a demonstration of utility is via a gateway which
   provides objects in the new scheme for clients using an existing
   protocol.  It is much easier to deploy gateways to a new service than
   it is to deploy browsers that understand the new URL object.

   Things to look for when thinking about a proxy are:

   o  Is there a single global resolution mechanism whereby any proxy
      can find the referenced object?
   o  If not, is there a way in which the user can find any object of
      this type, and "run his own proxy"?
   o  Are the operations mappable one-to-one (or possibly using
      modifiers) to HTTP operations?
   o  Is the type of returned objects well defined?
      - as MIME content-types?
      - as something that can be translated to HTML?
   o  Is there running code for a proxy?
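
   As a minimal sketch of the gateway idea, the Python fragment below
   wraps a URL from a new scheme in an ordinary HTTP request to a
   gateway.  The gateway address, the "url" query parameter and the
   "newscheme" scheme name are all hypothetical; a real gateway would
   define its own conventions.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical gateway endpoint, invented for this example.
GATEWAY = "http://gateway.example.org/resolve"


def gateway_url(target):
    """Build the HTTP URL a client would GET to reach target."""
    return GATEWAY + "?" + urlencode({"url": target})


request_url = gateway_url("newscheme:some/object?id=7")
print(request_url)

# The gateway can recover the original URL from the query string.
recovered = parse_qs(urlsplit(request_url).query)["url"][0]
print(recovered)
```

   Existing HTTP clients can follow such a link without knowing anything
   about the new scheme, which is why a gateway is easier to deploy
   than new browser support.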











2.4 Are there security considerations?

   Above and beyond the security considerations of the base mechanism a
   scheme builds upon, one must think of things that can happen in the
   normal course of URL usage.

   In particular:

   o  Does the user need to be warned when a resource is accessed
      without an explicit request (GET for the source of an IMG tag, for
      instance)?  This has implications for the design of a proxy
      gateway, of course.

   o  Is it possible to fake URLs of this type that point to different
      things in a dangerous way?

   o  Are there mechanisms for identifying the requester that can be
      used or need to be used with this mechanism (the From: field in a
      mailto: URL, or the Kerberos login required for AFS access in the
      AFS: URL, for instance)?

   o  Does the mechanism contain passwords or other security information
      that are passed inside the referring document in the clear (as in
      the "ftp" URL, for instance)?
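
   The last point is easy to demonstrate.  In the Python sketch below, a
   password embedded in an invented "ftp" URL is available to anything
   that can read the referring document, and a small helper (a
   hypothetical name, not part of any library) shows one way software
   might drop it before displaying or logging the URL.  The helper is
   deliberately simple and does not handle IPv6 literal hosts.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_password(url):
    """Return url with any embedded password removed."""
    parts = urlsplit(url)
    if parts.password is None:
        return url

    # Rebuild the naming authority without the password.
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    netloc = f"{parts.username}@{host}" if parts.username else host

    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )


url = "ftp://alice:secret@ftp.example.org/pub/file.txt"
print(urlsplit(url).password)  # readable by anyone who sees the URL
print(strip_password(url))
```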

2.5 Does it start with UR?

   Any scheme starting with the letters "U" and "R", in particular if it
   attaches any of the meanings "uniform", "universal" or "unifying" to
   the first letter, is going to cause intense debate, and generate much
   heat (but maybe little light).

   Any such proposal should either make sure that there is a large
   consensus behind it that it will be the only scheme of its type, or
   pick another name.

2.6 Non-considerations

   Some issues that are often raised but are not relevant to new URL
   schemes include the following.











2.6.1 Are all objects accessible?

   Can all objects in the world that are validly identified by a scheme
   be accessed by any user agent (UA) implementing it?

   Sometimes the answer will be yes and sometimes no; often it will
   depend on factors (like firewalls or client configuration) not
   directly related to the scheme itself.

3. Security Considerations

   New URL schemes are required to address all security considerations
   in their definitions.

4. References

   [1] Berners-Lee, T., Fielding, R. and L. Masinter, "Uniform Resource
       Identifiers (URI): Generic Syntax", RFC 2396, August 1998.

   [2] Petke, R. and I. King, "Registration Procedures for URL Scheme
       Names", BCP 35, RFC 2717, November 1999.

   [3] Yergeau, F., "UTF-8, A Transformation Format of Unicode and ISO
       10646", RFC 2279, January 1998.



























5. Authors' Addresses

   Larry Masinter
   Xerox Corporation
   Palo Alto Research Center
   3333 Coyote Hill Road
   Palo Alto, CA 94304

   URL: http://purl.org/NET/masinter
   EMail: masinter@parc.xerox.com


   Harald Tveit Alvestrand
   Maxware, Pirsenteret
   N-7005 Trondheim
   NORWAY

   Phone: +47 73 54 57 00
   EMail: harald.alvestrand@maxware.no


   Dan Zigmond
   WebTV Networks, Inc.
   305 Lytton Avenue
   Palo Alto, CA 94301
   USA

   Phone: +1-650-614-6071
   EMail: djz@corp.webtv.net


   Rich Petke
   UUNET Technologies
   5000 Britton Road
   P. O. Box 5000
   Hilliard, OH 43026-5000

   Phone: +1-614-723-4157
   Fax: +1-614-723-8407
   EMail: rpetke@wcom.net











6. Full Copyright Statement

   Copyright (C) The Internet Society (1999).  All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works.  However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

   Funding for the RFC Editor function is currently provided by the
   Internet Society.


















