\chapter{Retrieving Data From An \XML{} Datasource}

This chapter shows how to retrieve \XML{} data from a standard data
source.
Such a source can be a file, an \ltext{HTTP} object or a text string.
The method described in this chapter is the simplest way to retrieve
\XML{} data.
More advanced ways are described in the next chapters.

\section{A Very Simple Example}

This section describes a very simple \XML{} application.
It parses \XML{} data from a stream and dumps it ``pretty-printed'' to
the standard output.
While its use is very limited, it shows how to set up a parser and parse an
\XML{} document.

\begin{example}
\xkeyword{import} net.n3.nanoxml.*;\xcallout{1}
\xkeyword{import} java.io.*;

\xkeyword{public class} DumpXML
\{
~~\xkeyword{public static void} main(String[] args)
~~~~\xkeyword{throws} Exception
~~\{
~~~~IXMLParser parser = XMLParserFactory.createDefaultXMLParser();\xcallout{2}
~~~~IXMLReader reader = StdXMLReader.fileReader("test.xml");\xcallout{3}
~~~~parser.setReader(reader);
~~~~IXMLElement xml = (IXMLElement) parser.parse();\xcallout{4}
~~~~XMLWriter writer = \xkeyword{new} XMLWriter(System.out);\xcallout{5}
~~~~writer.write(xml);
~~\}
\}
\end{example}

\begin{callout}
  \coitem
    The \NanoXML{} classes are located in the package 
    \packagename{net.n3.nanoxml}.
  \coitem
    This command creates an \XML{} parser.
    The actual class of the parser is dependent on the value of the system
    property \propertykey{net.n3.nanoxml.XMLParser}, which is by default
    \propertyvalue{net.n3.nanoxml.StdXMLParser}.
  \coitem
    The command creates a ``standard'' reader which reads its data from the
    file called \filename{test.xml}.

    Usually you can use \classname{StdXMLReader} to feed the \XML{} data to
    the parser.
    The default reader is able to set up HTTP connections when retrieving
    \ltext{DTDs} or entities from different machines.
    If necessary, you can supply your own reader, \acronym{e.g.} to provide
    support for \ltext{PUBLIC} identifiers.
  \coitem
    The \XML{} parser now parses the data read from \filename{test.xml}
    and creates a tree of parsed \XML{} elements.

    The structure of those elements will be described in the next section.
  \coitem
    An \classname{XMLWriter} can be used to dump a ``pretty-printed'' view
    of the parsed \XML{} data on an output stream.
    In this case, we dump the read data to the standard output
    \ltext{(System.out)}.
\end{callout}
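
The property lookup mentioned in the second callout can be sketched with plain
JDK calls.
This is not the factory code of \NanoXML{} itself: the class
\classname{ParserClassLookup} is a hypothetical helper, and only the property
key and its default value are taken from the text above.

```java
// Hypothetical helper, not part of NanoXML: it only mirrors the property
// lookup described in the callout above.
class ParserClassLookup {
    // Property key and default value as given in the text.
    static final String KEY = "net.n3.nanoxml.XMLParser";
    static final String DEFAULT_PARSER = "net.n3.nanoxml.StdXMLParser";

    // Returns the configured parser class name, or the default if the
    // system property is not set.
    static String parserClassName() {
        return System.getProperty(KEY, DEFAULT_PARSER);
    }

    public static void main(String[] args) {
        System.out.println(parserClassName());
    }
}
```

Starting the virtual machine with
\ltext{-Dnet.n3.nanoxml.XMLParser=...} would therefore be one way to select a
different parser class without changing the application code.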


\section{Analyzing The Data}

You can easily traverse the logical tree generated by the parser.
If you need to create your own object tree, you can create your custom
% TODO: make "chapter 3" a xref
builder, which is described in chapter 3.

The default \XML{} builder, \classname{StdXMLBuilder}, generates a tree of
\classname{IXMLElement} objects.
Every such object has a name and can have attributes, \ltext{\#PCDATA} content and child objects.

The following \XML{} data:

\begin{example}
$<$FOO attr1="fred" attr2="barney"$>$
~~~~$<$BAR a1="flintstone" a2="rubble"$>$
~~~~~~~~Some data.
~~~~$<$/BAR$>$
~~~~$<$QUUX/$>$
$<$/FOO$>$
\end{example}

is parsed to the following objects:

\begin{itemize}
  \item[] Element FOO:
    \begin{itemize}
      \item[] Attributes = \{ "attr1"="fred", "attr2"="barney" \}
      \item[] Children = \{ BAR, QUUX \}
      \item[] PCData = null
    \end{itemize}
  \item[] Element BAR:
    \begin{itemize}
      \item[] Attributes = \{ "a1"="flintstone", "a2"="rubble" \}
      \item[] Children = \{\}
      \item[] PCData = "Some data."
    \end{itemize}
  \item[] Element QUUX:
    \begin{itemize}
      \item[] Attributes = \{\}
      \item[] Children = \{\}
      \item[] PCData = null
    \end{itemize}
\end{itemize}
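
To make the shape of this tree concrete, the following sketch models it with
plain JDK collections.
The class \classname{SketchElement} is a hypothetical stand-in written for
this illustration only; it is not the \classname{IXMLElement} interface and
does not reproduce its methods.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration; NanoXML's real interface is
// IXMLElement, which this class does not attempt to reproduce.
class SketchElement {
    final String name;
    final Map<String, String> attributes = new LinkedHashMap<>();
    final List<SketchElement> children = new ArrayList<>();
    String content; // null when the element holds no #PCDATA

    SketchElement(String name) {
        this.name = name;
    }

    // Builds the FOO/BAR/QUUX tree listed above.
    static SketchElement sample() {
        SketchElement foo = new SketchElement("FOO");
        foo.attributes.put("attr1", "fred");
        foo.attributes.put("attr2", "barney");
        SketchElement bar = new SketchElement("BAR");
        bar.attributes.put("a1", "flintstone");
        bar.attributes.put("a2", "rubble");
        bar.content = "Some data.";
        foo.children.add(bar);
        foo.children.add(new SketchElement("QUUX"));
        return foo;
    }

    // Renders an element and its descendants, one per line, indented by depth.
    static String dump(SketchElement e, String indent) {
        StringBuilder sb = new StringBuilder();
        sb.append(indent).append(e.name).append(' ').append(e.attributes)
          .append(" content=").append(e.content).append('\n');
        for (SketchElement child : e.children) {
            sb.append(dump(child, indent + "  "));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump(sample(), ""));
    }
}
```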

You can retrieve the name of an element using \methodname{getFullName}, thus:

\begin{example}
FOO.getFullName() $\to$ "FOO"
\end{example}

You can enumerate the attribute keys using 
\methodname{enumerateAttributeNames}:

\begin{example}
Enumeration en = FOO.enumerateAttributeNames();
\xkeyword{while} (en.hasMoreElements()) \{
~~System.out.print(en.nextElement());
~~System.out.print(' ');
\}
$\to$ attr1 attr2
\end{example}

You can retrieve the value of an attribute using \methodname{getAttribute}:

\begin{example}
FOO.getAttribute("attr1", null) $\to$ "fred"
\end{example}

The child elements can be enumerated using \methodname{enumerateChildren}:

\begin{example}
Enumeration en = FOO.enumerateChildren();
\xkeyword{while} (en.hasMoreElements()) \{
~~IXMLElement child = (IXMLElement) en.nextElement();
~~System.out.print(child.getFullName() + " ");
\}
$\to$ BAR QUUX
\end{example}

If the element contains parsed character data \ltext{(\#PCDATA)} as its only
child, you can retrieve that data using \methodname{getContent}:

\begin{example}
BAR.getContent() $\to$ "Some data."
\end{example}

If an element contains both \ltext{\#PCDATA} and \XML{} elements as its
children, the character data segments will be put in untitled \XML{}
elements (whose name is \ltext{null}).

\classname{IXMLElement} contains many convenience methods for retrieving data
and traversing the \XML{} tree.


\section{Generating \XML}

You can very easily create a tree of \XML{} elements or modify an
existing one.

To create a new tree, just create an \classname{IXMLElement} object:

\begin{example}
IXMLElement elt = \xkeyword{new} XMLElement("ElementName");
\end{example}

You can add an attribute to the element by calling \methodname{setAttribute}:

\begin{example}
elt.setAttribute("key", "value");
\end{example}

You can add a child element to an element by calling \methodname{addChild}:

\begin{example}
IXMLElement child = elt.createElement("Child");
elt.addChild(child);
\end{example}

Note that the child element is created by calling \methodname{createElement}.
This ensures that the child instance is compatible with its new parent.

If an element has no children, you can add \ltext{\#PCDATA} content to it using
\methodname{setContent}:

\begin{example}
child.setContent("Some content");
\end{example}

If the element does have children, you can add \ltext{\#PCDATA} content to it
by adding an untitled element, which you create by calling
\methodname{createPCDataElement}:

\begin{example}
IXMLElement pcdata = elt.createPCDataElement();
pcdata.setContent("Blah blah");
elt.addChild(pcdata);
\end{example}

When you have created or edited the \XML{} element tree, you can write it out to
an output stream or writer using an \classname{XMLWriter}:

\begin{example}
java.io.Writer output = \ldots;
IXMLElement xmltree = \ldots;
XMLWriter xmlwriter = \xkeyword{new} XMLWriter(output);
xmlwriter.write(xmltree);
\end{example}


\section{Namespaces}

As of version 2.1, \ltext{NanoXML} has support for namespaces.
Namespaces allow you to attach a \ltext{URI} to the name of an element or an
attribute.
This \ltext{URI} allows you to make a distinction between similarly named
entities coming from different sources.
More information about namespaces can be found in the \XML{} Namespaces
recommendation, available at
\href{http://www.w3c.org/TR/REC-xml-names/}%
{http://www.w3c.org/TR/REC-xml-names/}.

Please note that a \ltext{DTD} has no support for namespaces.
It is important to understand that an \XML{} document can have only one
\ltext{DTD}.
Though the namespace \ltext{URI} is often presented as a \ltext{URL}, that
\ltext{URL} is not a system id for a \ltext{DTD}.
The only function of a namespace \ltext{URI} is to provide a globally unique
name.

As an example, consider the following \XML{} data:

\begin{example}
$<$doc:book xmlns:doc="http://nanoxml.n3.net/book"$>$
~~$<$chapter xmlns="http://nanoxml.n3.net/chapter"
~~~~~~~~~~~title="Introduction"
~~~~~~~~~~~doc:id="chapter1"/$>$
$<$/doc:book$>$
\end{example}

The top-level element uses the namespace \ltext{``http://nanoxml.n3.net/book''}.
The prefix \ltext{doc} is used as an alias for this namespace; it is bound by
the attribute \ltext{xmlns:doc}.
This prefix is defined for the \ltext{doc:book} element and its child elements.

The chapter element uses the namespace
\ltext{``http://nanoxml.n3.net/chapter''}.
Because the namespace \ltext{URI} has been defined as the value of the xmlns
attribute, the namespace is the default namespace for the chapter element.
Default namespaces are inherited by the child elements, but only for their
names.
Attributes never have a default namespace.

The chapter element has an attribute \ltext{doc:id}, which is defined in the
same namespace as \ltext{doc:book} because of the doc prefix.
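
The lookup rules described above can be summarized in a few lines of plain
Java.
This is only a sketch of the rules, not the implementation used by
\ltext{NanoXML}: the class \classname{NamespaceRules} and its
\methodname{resolve} method are hypothetical, and the prefix bindings are
passed in as an ordinary map.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the lookup rules described above; not NanoXML code.
class NamespaceRules {
    // Returns the namespace URI for a full name, or null if it has none.
    static String resolve(String fullName, Map<String, String> prefixes,
                          String defaultNamespace, boolean isAttribute) {
        int colon = fullName.indexOf(':');
        if (colon >= 0) {
            // A prefixed name uses whatever URI its prefix is bound to.
            return prefixes.get(fullName.substring(0, colon));
        }
        // Unprefixed attributes never pick up the default namespace.
        return isAttribute ? null : defaultNamespace;
    }

    public static void main(String[] args) {
        Map<String, String> prefixes = new HashMap<>();
        prefixes.put("doc", "http://nanoxml.n3.net/book");
        String chapterNs = "http://nanoxml.n3.net/chapter";

        System.out.println(resolve("doc:book", prefixes, null, false));
        System.out.println(resolve("chapter", prefixes, chapterNs, false));
        System.out.println(resolve("title", prefixes, chapterNs, true));
        System.out.println(resolve("doc:id", prefixes, chapterNs, true));
    }
}
```

The third call yields \ltext{null}: the \ltext{title} attribute carries no
prefix, so the default namespace of its element does not apply to it.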

\ltext{NanoXML 2.1} offers some variants on the standard retrieval methods to allow the application to access the namespace information.

In the following examples, we assume the variable book to contain the
\ltext{doc:book} element and the variable chapter to contain the chapter
element.

To get the full name of the element, which includes the namespace prefix, use \methodname{getFullName}:

\begin{example}
book.getFullName() $\to$ "doc:book"
chapter.getFullName() $\to$ "chapter"
\end{example}

To get the short name of the element, which excludes the namespace prefix,
use \methodname{getName}:

\begin{example}
book.getName() $\to$ "book"
chapter.getName() $\to$ "chapter"
\end{example}

For elements that have no associated namespace, \methodname{getName} and
\methodname{getFullName} are equivalent.

To get the namespace \ltext{URI} associated with the name of the element, use \methodname{getNamespace}:

\begin{example}
book.getNamespace() $\to$ "http://nanoxml.n3.net/book"
chapter.getNamespace() $\to$ "http://nanoxml.n3.net/chapter"
\end{example}

If no namespace is associated with the name of the element, this method returns \variable{null}.

You can get an attribute of an element using either its full name (which
includes its prefix) or its short name together with its namespace \ltext{URI}, so the following two instructions are equivalent:

\begin{example}
chapter.getAttribute("doc:id", null)
chapter.getAttribute("id", "http://nanoxml.n3.net/book", null)
\end{example}

Note that the title attribute of chapter has no namespace, even though the
chapter element name has a default namespace.
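
These namespace rules come from the \XML{} Namespaces recommendation, so they
can be cross-checked with any namespace-aware parser.
The sketch below deliberately uses the \ltext{DOM} parser shipped with the
\ltext{JDK} rather than \ltext{NanoXML}; the class
\classname{NamespaceCrossCheck} is a hypothetical name, and the document is
the example from this section with the whitespace between elements removed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical cross-check using the JDK's DOM parser, not NanoXML.
class NamespaceCrossCheck {
    static final String XML =
        "<doc:book xmlns:doc=\"http://nanoxml.n3.net/book\">"
      + "<chapter xmlns=\"http://nanoxml.n3.net/chapter\""
      + " title=\"Introduction\" doc:id=\"chapter1\"/>"
      + "</doc:book>";

    // Parses the example document with namespace support switched on
    // and returns its root element.
    static Element parseBook() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(XML)))
                          .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element book = parseBook();
        // The chapter element is the only child of the root element.
        Element chapter = (Element) book.getFirstChild();

        System.out.println(book.getNodeName() + " / " + book.getLocalName());
        System.out.println(chapter.getNamespaceURI());
        // Lookup by full name and by short name plus namespace URI.
        System.out.println(chapter.getAttribute("doc:id"));
        System.out.println(
            chapter.getAttributeNS("http://nanoxml.n3.net/book", "id"));
        // The unprefixed title attribute has no namespace.
        System.out.println(chapter.getAttributeNode("title").getNamespaceURI());
    }
}
```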

You can create a new element which uses a namespace this way:

\begin{example}
book = \xkeyword{new} XMLElement("doc:book", "http://nanoxml.n3.net/book");
chapter = book.createElement("chapter",
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~"http://nanoxml.n3.net/chapter");
\end{example}

You can add an attribute which uses a namespace this way:

\begin{example}
chapter.setAttribute("doc:id",
~~~~~~~~~~~~~~~~~~~~~"http://nanoxml.n3.net/book",
~~~~~~~~~~~~~~~~~~~~~chapterId);
\end{example}