File: uri.html

package info (click to toggle)
cl-puri 20101006-1
  • links: PTS, VCS
  • area: main
  • in suites: jessie, jessie-kfreebsd
  • size: 200 kB
  • ctags: 84
  • sloc: lisp: 1,500; makefile: 13; sh: 1
file content (406 lines) | stat: -rw-r--r-- 17,074 bytes parent folder | download | duplicates (4)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
<html>

<head>
<title>URI support in Allegro CL</title>
</head>

<body>

<h1>URI support in Allegro CL</h1>

<p>This document contains the following sections:</p>
<p><a href="#uri-intro-1">1.0 Introduction</a><br>
<a href="#uri-api-1">2.0 The URI API definition</a><br>
<a href="#parsing-decoding-1">3.0 Parsing, escape decoding/encoding and the path</a><br>
<a href="#interning-uris-1">4.0 Interning URIs</a><br>
<a href="#acl-implementation-1">5.0 Allegro CL implementation notes</a><br>
<a href="#examples-1">6.0 Examples</a><br>
</p>

<p>This version of the Allegro CL URI support documentation is for distribution with the
Open Source version of the URI code. Links to Allegro CL documentation other than
URI-specific files have been supressed. To see Allegro CL documentation, see <a
href="http://www.franz.com/support/documentation/">http://www.franz.com/support/documentation/</a>,
which is the Allegro CL documentation page of the franz inc. website. Links to Allegro CL
documentation can be found on that page. </p>

<hr>

<hr>

<h2><a name="uri-intro-1">1.0 Introduction</a></h2>

<p><em>URI</em> stands for <em>Universal Resource Identifier</em>. For a description of
URIs, see RFC2396, which can be found in several places, including the IETF web site (<a
href="http://www.ietf.org/rfc/rfc2396.txt">http://www.ietf.org/rfc/rfc2396.txt</a>) and
the UCI/ICS web site (<a href="http://www.ics.uci.edu/pub/ietf/uri/rfc2396.txt">http://www.ics.uci.edu/pub/ietf/uri/rfc2396.txt</a>).
We prefer the UCI/ICS one as it has more examples. </p>

<p>URIs are a superset in functionality and syntax to URLs (Universal Resource Locators)
and URNs (Universal Resource Names). That is, RFC2396 updates and merges RFC1738 and
RFC1808 into a single syntax, called the URI. It does exclude some portions of RFC1738
that define specific syntax of individual URL schemes. </p>

<p>In URL slang, the <em>scheme</em> is usually called the `protocol', but it is called
scheme in RFC1738. A URL `host' corresponds to the URI `authority.' The URL slang
`bookmark' or `anchor' is `fragment' in URI lingo. </p>

<p>The URI facility was available as a patch to Allegro CL 5.0.1 and is included with
release 6.0. the URI facility might not be in an Allegro CL image. Evaluate <code>(require
:uri)</code> to ensure the facility is loaded (that form returns <code>nil</code> if the
URI module is already loaded). </p>

<p>Broadly, the URI facility creates a Lisp object that represents a URI, and provides
setters and accessors to fields in the URI object. The URI object can also be interned,
much like symbols in CL are. This document describes the facility and the related
operators. </p>

<p>Aside from the obvious slots which are called out in the RFC, URIs also have a property
list. With interning, this is another similarity between URIs and CL symbols. </p>

<hr>

<hr>

<h2><a name="uri-api-1">2.0 The URI API definition</a></h2>

<p>Symbols naming objects (functions, variables, etc.) in the <em>uri</em> module are
exported from the <code>net.uri</code> package. </p>

<p>URIs are represented by CLOS objects. Their slots are: </p>

<pre>
scheme 
host 
port 
path 
query
fragment 
plist 
</pre>

<p>The <code>host</code> and <code>port</code> slots together correspond to the <code>authority</code>
(see RFC2396). There is an accessor-like function, <a href="operators/uri-authority.htm"><b>uri-authority</b></a>,
that can be used to extract the authority from a URI. See the RFC2396 specifications
pointed to at the beginning of the <a href="#uri-intro-1">1.0 Introduction</a> for details
of all the slots except <code>plist</code>. The <code>plist</code> slot contains a
standard Common Lisp property list. </p>

<p>All symbols are external in the <code>net.uri</code> package, unless otherwise noted.
Brief descriptions are given in this document, with complete descriptions in the
individual pages. 

<ul>
  <li><a href="classes/uri.htm"><code>uri</code></a>: the class of URI objects. </li>
  <li><a href="classes/urn.htm"><code>urn</code></a>: the class of URN objects. </li>
  <li><a href="operators/uri-p.htm"><b>uri-p</b></a> <p><b>Arguments: </b><i>object</i></p>
    <p>Returns true if <i>object</i> is an instance of class <a href="classes/uri.htm"><code>uri</code></a>.
    </p>
  </li>
  <li><a href="operators/copy-uri.htm"><b>copy-uri</b></a> <p><b>Arguments: </b><i>uri </i>&amp;key
    <i>place scheme host port path query fragment plist </i></p>
    <p>Copies the specified URI object. See the description page for information on the
    keyword arguments. </p>
  </li>
  <li><a href="operators/uri-scheme.htm"><b>uri-scheme</b></a><br>
    <a href="operators/uri-host.htm"><b>uri-host</b></a><br>
    <a href="operators/uri-port.htm"><b>uri-port</b></a><br>
    <a href="operators/uri-path.htm"><b>uri-path</b></a><br>
    <a href="operators/uri-query.htm"><b>uri-query</b></a><br>
    <a href="operators/uri-fragment.htm"><b>uri-fragment</b></a><br>
    <a href="operators/uri-plist.htm"><b>uri-plist</b></a><br>
    <p><b>Arguments: </b><i>uri-object </i></p>
    <p>These accessors return the value of the associated slots of the <i>uri-object</i> </p>
  </li>
  <li><a href="operators/uri-authority.htm"><b>uri-authority</b></a> <p><b>Arguments: </b><i>uri-object
    </i></p>
    <p>Returns the authority of <i>uri-object</i>. The authority combines the host and port. </p>
  </li>
  <li><a href="operators/render-uri.htm"><b>render-uri</b></a> <p><b>Arguments: </b><i>uri
    stream </i></p>
    <p>Print to <i>stream</i> the printed representation of <i>uri</i>. </p>
  </li>
  <li><a href="operators/parse-uri.htm"><b>parse-uri</b></a> <p><b>Arguments: </b><i>string </i>&amp;key
    (<i>class</i> 'uri)<i> </i></p>
    <p>Parse <i>string</i> into a URI object. </p>
  </li>
  <li><a href="operators/merge-uris.htm"><b>merge-uris</b></a> <p><b>Arguments: </b><i>uri
    base-uri </i>&amp;optional <i>place </i></p>
    <p>Return an absolute URI, based on <i>uri</i>, which can be relative, and <i>base-uri</i>
    which must be absolute. </p>
  </li>
  <li><a href="operators/enough-uri.htm"><b>enough-uri</b></a> <p><b>Arguments: </b><i>uri
    base </i></p>
    <p>Converts <i>uri</i> into a relative URI using <i>base</i> as the base URI. </p>
  </li>
  <li><a href="operators/uri-parsed-path.htm"><b>uri-parsed-path</b></a> <p><b>Arguments: </b><i>uri
    </i></p>
    <p>Return the parsed representation of the path. </p>
  </li>
  <li><a href="operators/uri.htm"><b>uri</b></a> <p><b>Arguments: </b><i>object </i></p>
    <p>Defined methods: if argument is a uri object, return it; create a uri object if
    possible and return it, or error if not possible. </p>
  </li>
</ul>

<hr>

<hr>

<h2><a name="parsing-decoding-1">3.0 Parsing, escape decoding/encoding and the path</a></h2>

<p>The method <a href="operators/uri-path.htm"><b>uri-path</b></a> returns the path
portion of the URI, in string form. The method <a href="operators/uri-parsed-path.htm"><b>uri-parsed-path</b></a>
returns the path portion of the URI, in list form. This list form is discussed below,
after a discussion of decoding/encoding. </p>

<p>RFC2396 lays out a method for inserting into URIs <em>reserved characters</em>. You do
this by escaping the character. An <em>escaped</em> character is defined like this: </p>

<pre>
escaped = &quot;%&quot; hex hex 

hex = digit | &quot;A&quot; | &quot;B&quot; | &quot;C&quot; | &quot;D&quot; | &quot;E&quot; | &quot;F&quot; | &quot;a&quot; | &quot;b&quot; | &quot;c&quot; | &quot;d&quot; | &quot;e&quot; | &quot;f&quot; 
</pre>

<p>In addition, the RFC defines excluded characters: </p>

<pre>
&quot;&lt;&quot; | &quot;&gt;&quot; | &quot;#&quot; | &quot;%&quot; | &lt;&quot;&gt; | &quot;{&quot; | &quot;}&quot; | &quot;|&quot; | &quot;\&quot; | &quot;^&quot; | &quot;[&quot; | &quot;]&quot; | &quot;`&quot; 
</pre>

<p>The set of reserved characters are: </p>

<pre>
&quot;;&quot; | &quot;/&quot; | &quot;?&quot; | &quot;:&quot; | &quot;@&quot; | &quot;&amp;&quot; | &quot;=&quot; | &quot;+&quot; | &quot;$&quot; | &quot;,&quot; 
</pre>

<p>with the following exceptions: 

<ul>
  <li>within the authority component, the characters &quot;;&quot;, &quot;:&quot;,
    &quot;@&quot;, &quot;?&quot;, and &quot;/&quot; are reserved. </li>
  <li>within a path segment, the characters &quot;/&quot;, &quot;;&quot;, &quot;=&quot;, and
    &quot;?&quot; are reserved. </li>
  <li>within a query component, the characters &quot;;&quot;, &quot;/&quot;, &quot;?&quot;,
    &quot;:&quot;, &quot;@&quot;, &quot;&amp;&quot;, &quot;=&quot;, &quot;+&quot;,
    &quot;,&quot;, and &quot;$&quot; are reserved. </li>
</ul>

<p>From the RFC, there are two important rules about escaping and unescaping (encoding and
decoding): 

<ul>
  <li>decoding should only happen when the URI is parsed into component parts;</li>
  <li>encoding can only occur when a URI is made from component parts (ie, rendered for
    printing). </li>
</ul>

<p>The implication of this is that to decode the URI, it must be in a parsed state. That
is, you can't convert <font face="Courier New">%2f</font> (the escaped form of
&quot;/&quot;) until the path has been parsed into its component parts. Another important
desire is for the application viewing the component parts to see the decoded values of the
components. For example, consider: </p>

<pre>
http://www.franz.com/calculator/3%2f2 
</pre>

<p>This might be the implementation of a calculator, and how someone would execute 3/2.
Clearly, the application that implements this would want to see path components of
&quot;calculator&quot; and &quot;3/2&quot;. &quot;3%2f2&quot; would not be useful to the
calculator application. </p>

<p>For the reasons given above, a parsed version of the path is available and has the
following form: </p>

<pre>
([:absolute | :relative] component1 [component2...]) 
</pre>

<p>where components are: </p>

<pre>
element | (element param1 [param2 ...]) 
</pre>

<p>and <em>element</em> is a path element, and the param's are path element parameters.
For example, the result of </p>

<pre>
(uri-parsed-path (parse-uri &quot;foo;10/bar:x;y;z/baz.htm&quot;)) 
</pre>

<p>is </p>

<pre>
(:relative (&quot;foo&quot; &quot;10&quot;) (&quot;bar:x&quot; &quot;y&quot; &quot;z&quot;) &quot;baz.htm&quot;) 
</pre>

<p>There is a certain amount of canonicalization that occurs when parsing: 

<ul>
  <li>A path of <code>(:absolute)</code> or <code>(:absolute &quot;&quot;)</code> is
    equivalent to a <code>nil</code> path. That is, <code>http://a/</code> is parsed with a <code>nil</code>
    path and printed as <code>http://a</code>. </li>
  <li>Escaped characters that are not reserved are not escaped upon printing. For example, <code>&quot;foob%61r&quot;</code>
    is parsed into <code>&quot;foobar&quot;</code> and appears as <code>&quot;foobar&quot;</code>
    when the URI is printed. </li>
</ul>

<hr>

<hr>

<h2><a name="interning-uris-1">4.0 Interning URIs</a></h2>

<p>This section describes how to intern URIs. Interning is not mandatory. URIs can be used
perfectly well without interning them. </p>

<p>Interned URIs in Allegro are like symbols. That is, a string representing a URI, when
parsed and interned, will always yield an <strong>eq</strong> object. For example: </p>

<pre>
(eq (intern-uri &quot;http://www.franz.com&quot;) 
    (intern-uri &quot;http://www.franz.com&quot;)) 
</pre>

<p>is always true. (Two strings with identical contents may or may not be <strong>eq</strong>
in Common Lisp, note.) </p>

<p>The functions associated with interning are: 

<ul>
  <li><a href="operators/make-uri-space.htm"><b>make-uri-space</b></a> <p><b>Arguments: </b>&amp;key
    <i>size </i></p>
    <p>Make a new hash-table object to contain interned URIs. </p>
  </li>
  <li><a href="operators/uri-space.htm"><b>uri-space</b></a> <p><b>Arguments: </b></p>
    <p>Return the object into which URIs are currently being interned. </p>
  </li>
  <li><a href="operators/uri_eq.htm"><b>uri=</b></a> <p><b>Arguments: </b><i>uri1 uri2 </i></p>
    <p>Returns true if <i>uri1</i> and <i>uri2</i> are equivalent. </p>
  </li>
  <li><a href="operators/intern-uri.htm"><b>intern-uri</b></a> <p><b>Arguments: </b><i>uri-name
    </i>&amp;optional <i>uri-space </i></p>
    <p>Intern the uri object specified in the uri-space specified. Methods exist for strings
    and uri objects. </p>
  </li>
  <li><a href="operators/unintern-uri.htm"><b>unintern-uri</b></a> <p><b>Arguments: </b><i>uri
    </i>&amp;optional <i>uri-space </i></p>
    <p>Unintern the uri object specified or all uri objects (in <i>uri-space</i> if specified)
    if <i>uri</i> is <code>t</code>. </p>
  </li>
  <li><a href="operators/do-all-uris.htm"><b>do-all-uris</b></a> <p><b>Arguments: </b><i>(var </i>&amp;optional
    <i>uri-space result) </i>&amp;body <i>body </i></p>
    <p>Bind <i>var</i> to all currently defined uris (in <i>uri-space</i> if specified) and
    evaluate <i>body</i>. </p>
  </li>
</ul>

<hr>

<hr>

<h2><a name="acl-implementation-1">5.0 Allegro CL implementation notes</a></h2>

<ol>
  <li>The following are true: <br>
    <code>(uri= (parse-uri &quot;http://www.franz.com/&quot;)</code> <br>
    &nbsp; &nbsp; <code>(parse-uri &quot;http://www.franz.com&quot;))</code> <br>
    <code>(eq (intern-uri &quot;http://www.franz.com/&quot;)</code> <br>
    &nbsp;&nbsp; <code>(intern-uri &quot;http://www.franz.com&quot;))</code><br>
  </li>
  <li>The following is true: <br>
    <code>(eq (intern-uri &quot;http://www.franz.com:80/foo/bar.htm&quot;)</code> <br>
    &nbsp; &nbsp; <code>(intern-uri &quot;http://www.franz.com/foo/bar.htm&quot;))</code><br>
    (I.e. specifying the default port is the same as specifying no port at all. This is
    specific in RFC2396.) </li>
  <li>The <em>scheme</em> and <em>authority</em> are case-insensitive. In Allegro CL, the
    scheme is a keyword that appears in the normal case for the Lisp in which you are
    executing. </li>
  <li><code>#u&quot;...&quot;</code> is shorthand for <code>(parse-uri &quot;...&quot;)</code>
    but if an existing <code>#u</code> dispatch macro definition exists, it will not be
    overridden. </li>
  <li>The interaction between setting the scheme, host, port, path, query, and fragment slots
    of URI objects, in conjunction with interning URIs will have very bad and unpredictable
    results. </li>
  <li>The printable representation of URIs is cached, for efficiency. This caching is undone
    when the above slots are changed. That is, when you create a URI the printed
    representation is cached. When you change one of the above mentioned slots, the printed
    representation is cleared and calculated when the URI is next printed. For example: </li>
</ol>

<pre>
user(10): (setq u #u&quot;http://foo.bar.com/foo/bar&quot;) 
#&lt;uri http://foo.bar.com/foo/bar&gt; 
user(11): (setf (net.uri:uri-host u) &quot;foo.com&quot;) 
&quot;foo.com&quot; 
user(12): u 
#&lt;uri http://foo.com/foo/bar&gt; 
user(13): 
</pre>

<p>This allows URIs behavior to follow the principle of least surprise. </p>

<hr>

<hr>

<h2><a name="examples-1">6.0 Examples</a></h2>

<pre>
uri(10): (use-package :net.uri)
t
uri(11): (parse-uri &quot;foo&quot;)
#&lt;uri foo&gt;
uri(12): #u&quot;foo&quot;
#&lt;uri foo&gt;
uri(13): (setq base (intern-uri &quot;http://www.franz.com/foo/bar/&quot;))
#&lt;uri http://www.franz.com/foo/bar/&gt;
uri(14): (merge-uris (parse-uri &quot;foo.htm&quot;) base)
#&lt;uri http://www.franz.com/foo/bar/foo.htm&gt;
uri(15): (merge-uris (parse-uri &quot;?foo&quot;) base)
#&lt;uri http://www.franz.com/foo/bar/?foo&gt;
uri(16): (setq base (intern-uri &quot;http://www.franz.com/foo/bar/baz.htm&quot;))
#&lt;uri http://www.franz.com/foo/bar/baz.htm&gt;
uri(17): (merge-uris (parse-uri &quot;foo.htm&quot;) base)
#&lt;uri http://www.franz.com/foo/bar/foo.htm&gt;
uri(18): (merge-uris #u&quot;?foo&quot; base)
#&lt;uri http://www.franz.com/foo/bar/?foo&gt;
uri(19): (describe #u&quot;http://www.franz.com&quot;)
#&lt;uri http://www.franz.com&gt; is an instance of #&lt;standard-class net.uri:uri&gt;:
 The following slots have :instance allocation:
  scheme        :http
  host          &quot;www.franz.com&quot;
  port          nil
  path          nil
  query         nil
  fragment      nil
  plist         nil
  escaped       nil
  string        &quot;http://www.franz.com&quot;
  parsed-path   nil
  hashcode      nil
uri(20): (describe #u&quot;http://www.franz.com/&quot;)
#&lt;uri http://www.franz.com&gt; is an instance of #&lt;standard-class net.uri:uri&gt;:
 The following slots have :instance allocation:
  scheme        :http
  host          &quot;www.franz.com&quot;
  port          nil
  path          nil
  query         nil
  fragment      nil
  plist         nil
  escaped       nil
  string        &quot;http://www.franz.com&quot;
  parsed-path   nil
  hashcode      nil
uri(21): #u&quot;foobar#baz%23xxx&quot;
#&lt;uri foobar#baz#xxx&gt;
</pre>

<p><small>Copyright (c) 1998-2001, Franz Inc. Berkeley, CA., USA. All rights reserved.
Created 2001.8.16.</small></p>
</body>
</html>