'\"
'\" Generated from file 'xsxp\&.man' by tcllib/doctools with format 'nroff'
'\" 2006 Darren New\&. All Rights Reserved\&.
'\"
.TH "xsxp" n 1\&.1 tcllib "Amazon S3 Web Service Utilities"
.\" The -*- nroff -*- definitions below are for supplemental macros used
.\" in Tcl/Tk manual entries.
.\"
.\" .AP type name in/out ?indent?
.\"	Start paragraph describing an argument to a library procedure.
.\"	type is type of argument (int, etc.), in/out is either "in", "out",
.\"	or "in/out" to describe whether procedure reads or modifies arg,
.\"	and indent is equivalent to second arg of .IP (shouldn't ever be
.\"	needed;  use .AS below instead)
.\"
.\" .AS ?type? ?name?
.\"	Give maximum sizes of arguments for setting tab stops.  Type and
.\"	name are examples of largest possible arguments that will be passed
.\"	to .AP later.  If args are omitted, default tab stops are used.
.\"
.\" .BS
.\"	Start box enclosure.  From here until next .BE, everything will be
.\"	enclosed in one large box.
.\"
.\" .BE
.\"	End of box enclosure.
.\"
.\" .CS
.\"	Begin code excerpt.
.\"
.\" .CE
.\"	End code excerpt.
.\"
.\" .VS ?version? ?br?
.\"	Begin vertical sidebar, for use in marking newly-changed parts
.\"	of man pages.  The first argument is ignored and used for recording
.\"	the version when the .VS was added, so that the sidebars can be
.\"	found and removed when they reach a certain age.  If another argument
.\"	is present, then a line break is forced before starting the sidebar.
.\"
.\" .VE
.\"	End of vertical sidebar.
.\"
.\" .DS
.\"	Begin an indented unfilled display.
.\"
.\" .DE
.\"	End of indented unfilled display.
.\"
.\" .SO ?manpage?
.\"	Start of list of standard options for a Tk widget. The manpage
.\"	argument defines where to look up the standard options; if
.\"	omitted, defaults to "options". The options follow on successive
.\"	lines, in three columns separated by tabs.
.\"
.\" .SE
.\"	End of list of standard options for a Tk widget.
.\"
.\" .OP cmdName dbName dbClass
.\"	Start of description of a specific option.  cmdName gives the
.\"	option's name as specified in the class command, dbName gives
.\"	the option's name in the option database, and dbClass gives
.\"	the option's class in the option database.
.\"
.\" .UL arg1 arg2
.\"	Print arg1 underlined, then print arg2 normally.
.\"
.\" .QW arg1 ?arg2?
.\"	Print arg1 in quotes, then arg2 normally (for trailing punctuation).
.\"
.\" .PQ arg1 ?arg2?
.\"	Print an open parenthesis, arg1 in quotes, then arg2 normally
.\"	(for trailing punctuation) and then a closing parenthesis.
.\"
.\"	# Set up traps and other miscellaneous stuff for Tcl/Tk man pages.
.if t .wh -1.3i ^B
.nr ^l \n(.l
.ad b
.\"	# Start an argument description
.de AP
.ie !"\\$4"" .TP \\$4
.el \{\
.   ie !"\\$2"" .TP \\n()Cu
.   el          .TP 15
.\}
.ta \\n()Au \\n()Bu
.ie !"\\$3"" \{\
\&\\$1 \\fI\\$2\\fP (\\$3)
.\".b
.\}
.el \{\
.br
.ie !"\\$2"" \{\
\&\\$1	\\fI\\$2\\fP
.\}
.el \{\
\&\\fI\\$1\\fP
.\}
.\}
..
.\"	# define tabbing values for .AP
.de AS
.nr )A 10n
.if !"\\$1"" .nr )A \\w'\\$1'u+3n
.nr )B \\n()Au+15n
.\"
.if !"\\$2"" .nr )B \\w'\\$2'u+\\n()Au+3n
.nr )C \\n()Bu+\\w'(in/out)'u+2n
..
.AS Tcl_Interp Tcl_CreateInterp in/out
.\"	# BS - start boxed text
.\"	# ^y = starting y location
.\"	# ^b = 1
.de BS
.br
.mk ^y
.nr ^b 1u
.if n .nf
.if n .ti 0
.if n \l'\\n(.lu\(ul'
.if n .fi
..
.\"	# BE - end boxed text (draw box now)
.de BE
.nf
.ti 0
.mk ^t
.ie n \l'\\n(^lu\(ul'
.el \{\
.\"	Draw four-sided box normally, but don't draw top of
.\"	box if the box started on an earlier page.
.ie !\\n(^b-1 \{\
\h'-1.5n'\L'|\\n(^yu-1v'\l'\\n(^lu+3n\(ul'\L'\\n(^tu+1v-\\n(^yu'\l'|0u-1.5n\(ul'
.\}
.el \}\
\h'-1.5n'\L'|\\n(^yu-1v'\h'\\n(^lu+3n'\L'\\n(^tu+1v-\\n(^yu'\l'|0u-1.5n\(ul'
.\}
.\}
.fi
.br
.nr ^b 0
..
.\"	# VS - start vertical sidebar
.\"	# ^Y = starting y location
.\"	# ^v = 1 (for troff;  for nroff this doesn't matter)
.de VS
.if !"\\$2"" .br
.mk ^Y
.ie n 'mc \s12\(br\s0
.el .nr ^v 1u
..
.\"	# VE - end of vertical sidebar
.de VE
.ie n 'mc
.el \{\
.ev 2
.nf
.ti 0
.mk ^t
\h'|\\n(^lu+3n'\L'|\\n(^Yu-1v\(bv'\v'\\n(^tu+1v-\\n(^Yu'\h'-|\\n(^lu+3n'
.sp -1
.fi
.ev
.\}
.nr ^v 0
..
.\"	# Special macro to handle page bottom:  finish off current
.\"	# box/sidebar if in box/sidebar mode, then invoked standard
.\"	# page bottom macro.
.de ^B
.ev 2
'ti 0
'nf
.mk ^t
.if \\n(^b \{\
.\"	Draw three-sided box if this is the box's first page,
.\"	draw two sides but no top otherwise.
.ie !\\n(^b-1 \h'-1.5n'\L'|\\n(^yu-1v'\l'\\n(^lu+3n\(ul'\L'\\n(^tu+1v-\\n(^yu'\h'|0u'\c
.el \h'-1.5n'\L'|\\n(^yu-1v'\h'\\n(^lu+3n'\L'\\n(^tu+1v-\\n(^yu'\h'|0u'\c
.\}
.if \\n(^v \{\
.nr ^x \\n(^tu+1v-\\n(^Yu
\kx\h'-\\nxu'\h'|\\n(^lu+3n'\ky\L'-\\n(^xu'\v'\\n(^xu'\h'|0u'\c
.\}
.bp
'fi
.ev
.if \\n(^b \{\
.mk ^y
.nr ^b 2
.\}
.if \\n(^v \{\
.mk ^Y
.\}
..
.\"	# DS - begin display
.de DS
.RS
.nf
.sp
..
.\"	# DE - end display
.de DE
.fi
.RE
.sp
..
.\"	# SO - start of list of standard options
.de SO
'ie '\\$1'' .ds So \\fBoptions\\fR
'el .ds So \\fB\\$1\\fR
.SH "STANDARD OPTIONS"
.LP
.nf
.ta 5.5c 11c
.ft B
..
.\"	# SE - end of list of standard options
.de SE
.fi
.ft R
.LP
See the \\*(So manual entry for details on the standard options.
..
.\"	# OP - start of full description for a single option
.de OP
.LP
.nf
.ta 4c
Command-Line Name:	\\fB\\$1\\fR
Database Name:	\\fB\\$2\\fR
Database Class:	\\fB\\$3\\fR
.fi
.IP
..
.\"	# CS - begin code excerpt
.de CS
.RS
.nf
.ta .25i .5i .75i 1i
..
.\"	# CE - end code excerpt
.de CE
.fi
.RE
..
.\"	# UL - underline word
.de UL
\\$1\l'|0\(ul'\\$2
..
.\"	# QW - apply quotation marks to word
.de QW
.ie '\\*(lq'"' ``\\$1''\\$2
.\"" fix emacs highlighting
.el \\*(lq\\$1\\*(rq\\$2
..
.\"	# PQ - apply parens and quotation marks to word
.de PQ
.ie '\\*(lq'"' (``\\$1''\\$2)\\$3
.\"" fix emacs highlighting
.el (\\*(lq\\$1\\*(rq\\$2)\\$3
..
.\"	# QR - quoted range
.de QR
.ie '\\*(lq'"' ``\\$1''\\-``\\$2''\\$3
.\"" fix emacs highlighting
.el \\*(lq\\$1\\*(rq\\-\\*(lq\\$2\\*(rq\\$3
..
.\"	# MT - "empty" string
.de MT
.QW ""
..
.BS
.SH NAME
xsxp \- eXtremely Simple Xml Parser
.SH SYNOPSIS
package require \fBTcl 8\&.5 9\fR
.sp
package require \fBxsxp 1\&.1\fR
.sp
package require \fBxml\fR
.sp
\fBxsxp::parse\fR \fIxml\fR
.sp
\fBxsxp::fetch\fR \fIpxml\fR \fIpath\fR ?\fIpart\fR?
.sp
\fBxsxp::fetchall\fR \fIpxml_list\fR \fIpath\fR ?\fIpart\fR?
.sp
\fBxsxp::only\fR \fIpxml\fR \fItagname\fR
.sp
\fBxsxp::prettyprint\fR \fIpxml\fR ?\fIchan\fR?
.sp
.BE
.SH DESCRIPTION
This package provides a simple interface to parse XML into a pure-value list\&.
It also provides accessor routines to pull out specific subtags,
not unlike DOM access\&.
This package was written for and is used by Darren New's Amazon S3 access package\&.
.PP
This is pretty lame, but I needed something like this for S3,
and at the time, TclDOM would not work with the new 8\&.5 Tcl
due to version number problems\&.
.PP
In addition, this is a pure-value implementation\&. There is no
garbage to clean up in the event of a thrown error, for example\&.
This simplifies the code for sufficiently small XML documents,
which is what Amazon's S3 guarantees\&.
.PP
Copyright 2006 Darren New\&. All Rights Reserved\&.
NO WARRANTIES OF ANY TYPE ARE PROVIDED\&.
COPYING OR USE INDEMNIFIES THE AUTHOR IN ALL WAYS\&.
This software is licensed under essentially the same
terms as Tcl\&. See LICENSE\&.txt for the terms\&.
.SH COMMANDS
The package implements five rather simple procedures\&.
One parses, one is for debugging, and the rest pull various
parts of the parsed document out for processing\&.
.TP
\fBxsxp::parse\fR \fIxml\fR
This parses an XML document (using the standard xml tcllib module in a SAX sort of way) and builds a data structure which it returns if the parsing succeeded\&. The return value is referred to herein as a "pxml", or "parsed xml"\&. The list consists of two or more elements:
.RS
.IP \(bu
The first element is the name of the tag\&.
.IP \(bu
The second element is an array-get formatted list of key/value pairs\&. The keys are attribute names and the values are attribute values\&. This is an empty list if there are no attributes on the tag\&.
.IP \(bu
The third through end elements are the children of the node, if any\&. Each child is, recursively, a pxml\&.
.IP \(bu
Note that if the first element (index 0), i\&.e\&. the tag name, is "%PCDATA", then
the attributes will be empty and the third element will be the text of the element\&. In addition, if an element's contents consist only of PCDATA, it will have only one child, and all the PCDATA will be concatenated\&. In other words,
this parser works poorly for XML with elements that contain both child tags and PCDATA\&. Since Amazon S3 does not do this (and for that matter most
uses of XML where XML is a poor choice don't do this), this is probably
not a serious limitation\&.
.RE
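.sp
As a minimal sketch of this layout, the snippet below takes a hand-written
pxml (built to match the description above, not captured parser output)
and picks it apart with plain list commands\&. The element and attribute
names are hypothetical\&.
.CS

# Hand-written pxml for <bucket region="us"><name>photos</name></bucket>
set pxml {bucket {region us} {name {} {%PCDATA {} photos}}}

set tag      [lindex $pxml 0]       ;# tag name
set attrs    [lindex $pxml 1]       ;# array-get style key/value list
set children [lrange $pxml 2 end]   ;# list of child pxmls

array set a $attrs
puts "$tag region=$a(region)"

# First child, then its %PCDATA node, then that node's text\&.
puts [lindex $children 0 2 2]
.CE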
.sp
.TP
\fBxsxp::fetch\fR \fIpxml\fR \fIpath\fR ?\fIpart\fR?
\fIpxml\fR is a parsed XML, as returned from xsxp::parse\&.
\fIpath\fR is a list of element tag names\&. Each element is the name
of a child to look up, optionally followed by a
hash ("#") and a string of digits\&. An empty list or an initial empty element
selects \fIpxml\fR\&. If no hash sign is present, the behavior is as if "#0"
had been appended to that element\&. (In addition to a list, slashes can separate subparts where convenient\&.)
.sp
An element of \fIpath\fR scans the children at the indicated level
for the n'th instance of a child whose tag matches the part of the
element before the hash sign\&. If an element is simply "#" followed
by digits, that indexed child is selected, regardless of the tags
in the children\&. Hence, an element of "#3" will always select
the fourth child of the node under consideration\&.
.sp
\fIpart\fR defaults to "%ALL"\&. It can be one of the following case-sensitive terms:
.RS
.TP
%ALL
returns the entire selected element\&.
.TP
%TAGNAME
returns lindex 0 of the selected element\&.
.TP
%ATTRIBUTES
returns lindex 1 of the selected element\&.
.TP
%CHILDREN
returns lrange 2 through end of the selected element,
resulting in a list of elements being returned\&.
.TP
%PCDATA
returns a concatenation of all the bodies of
direct children of this node whose tag is %PCDATA\&.
It throws an error if no such children are found\&. That
is, part=%PCDATA means return the textual content found
in that node but not in its child nodes\&.
.TP
%PCDATA?
is like %PCDATA, but returns an empty string if
no PCDATA is found\&.
.RE
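.sp
The difference between %PCDATA and %PCDATA? can be sketched as follows,
assuming the xsxp package and the xml package it requires are both
installed\&. The sample document is hypothetical\&.
.CS

package require xsxp

# Hypothetical sample: <b> has text, <c> is empty\&.
set pxml [xsxp::parse {<a><b>text</b><c/></a>}]

puts [xsxp::fetch $pxml b %PCDATA]

# <c> has no %PCDATA child, so %PCDATA raises an error\&.
if {[catch {xsxp::fetch $pxml c %PCDATA} msg]} {
    puts "error: $msg"
}

# %PCDATA? returns an empty string instead\&.
set text [xsxp::fetch $pxml c %PCDATA?]
puts "length: [string length $text]"
.CE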
.sp
For example, to fetch the first bold text from the fifth paragraph of the body of your HTML file,
.CS

xsxp::fetch $pxml {body p#4 b} %PCDATA
.CE
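.sp
The path forms described above can be compared side by side\&. This is a
sketch only, assuming the xsxp and xml packages are installed; the document
and its tag names are made up for illustration\&.
.CS

package require xsxp

set pxml [xsxp::parse {<list><item id="a"><name>first</name></item><item id="b"><name>second</name></item></list>}]

# Second <item> child, then its <name> child, as a list path\&.
set viaList  [xsxp::fetch $pxml {item#1 name} %PCDATA]

# The same path written with a slash separator\&.
set viaSlash [xsxp::fetch $pxml item#1/name %PCDATA]

# "#0" picks the first child by position, whatever its tag\&.
set firstTag [xsxp::fetch $pxml #0 %TAGNAME]

# "item" with no hash sign behaves like "item#0"\&.
set attrs    [xsxp::fetch $pxml item %ATTRIBUTES]

puts [list $viaList $viaSlash $firstTag $attrs]
.CE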
.sp
.TP
\fBxsxp::fetchall\fR \fIpxml_list\fR \fIpath\fR ?\fIpart\fR?
This iterates over each pxml in \fIpxml_list\fR (which must be a list
of pxmls) selecting the indicated path from it, building a new list
with the selected data, and returning that new list\&.
.sp
For example, \fIpxml_list\fR might be
the %CHILDREN of a particular element, and the \fIpath\fR and \fIpart\fR
might select from each child a sub-element in which we're interested\&.
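.sp
A sketch of that pattern, assuming the xsxp and xml packages are installed
and using a hypothetical document: the %CHILDREN of the root are collected
first, then one sub-element is selected from each child\&.
.CS

package require xsxp

set pxml [xsxp::parse {<list><item><name>first</name></item><item><name>second</name></item></list>}]

# An empty path selects the root itself; %CHILDREN yields a list of pxmls\&.
set items [xsxp::fetch $pxml {} %CHILDREN]

# Pull the text of <name> out of every item\&.
set names [xsxp::fetchall $items name %PCDATA]
puts $names
.CE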
.sp
.TP
\fBxsxp::only\fR \fIpxml\fR \fItagname\fR
This iterates over the direct children of \fIpxml\fR and selects only
those with \fItagname\fR as their tag\&. Returns a list of matching
elements\&.
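.sp
For instance, when a node mixes several kinds of children, \fBxsxp::only\fR
can filter them before further processing\&. This is a sketch under the same
assumptions (xsxp and xml installed, hypothetical tag names)\&.
.CS

package require xsxp

set pxml [xsxp::parse {<list><note>n</note><item><name>first</name></item><item><name>second</name></item></list>}]

# Keep only the direct <item> children; <note> is skipped\&.
set items [xsxp::only $pxml item]
puts [llength $items]

# The result is a list of pxmls, so it can be handed to fetchall\&.
set names [xsxp::fetchall $items name %PCDATA]
puts $names
.CE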
.sp
.TP
\fBxsxp::prettyprint\fR \fIpxml\fR ?\fIchan\fR?
This outputs to \fIchan\fR (default stdout) a pretty-printed
version of \fIpxml\fR\&.
.PP
.SH "BUGS, IDEAS, FEEDBACK"
This document, and the package it describes, will undoubtedly contain
bugs and other problems\&.
Please report such in the category \fIamazon-s3\fR of the
\fITcllib Trackers\fR [http://core\&.tcl\&.tk/tcllib/reportlist]\&.
Please also report any ideas for enhancements you may have for either
package and/or documentation\&.
.PP
When proposing code changes, please provide \fIunified diffs\fR,
i\&.e\&. the output of \fBdiff -u\fR\&.
.PP
Note further that \fIattachments\fR are strongly preferred over
inlined patches\&. Attachments can be made by going to the \fBEdit\fR
form of the ticket immediately after its creation, and then using the
left-most button in the secondary navigation bar\&.
.SH KEYWORDS
dom, parser, xml
.SH CATEGORY
Text processing
.SH COPYRIGHT
.nf
2006 Darren New\&. All Rights Reserved\&.

.fi