/****************************************************************************
** $Id: qt/xml.doc 3.0.3 edited Oct 12 12:18 $
**
** Documentation on the xml module
**
** Copyright (C) 2000 Trolltech AS. All rights reserved.
**
** This file is part of the Qt GUI Toolkit.
**
** This file may be distributed under the terms of the Q Public License
** as defined by Trolltech AS of Norway and appearing in the file
** LICENSE.QPL included in the packaging of this file.
**
** This file may be distributed and/or modified under the terms of the
** GNU General Public License version 2 as published by the Free Software
** Foundation and appearing in the file LICENSE.GPL included in the
** packaging of this file.
**
** Licensees holding valid Qt Enterprise Edition or Qt Professional Edition
** licenses may use this file in accordance with the Qt Commercial License
** Agreement provided with the Software.
**
** This file is provided AS IS with NO WARRANTY OF ANY KIND, INCLUDING THE
** WARRANTY OF DESIGN, MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
**
** See http://www.trolltech.com/pricing.html or email sales@trolltech.com for
** information about Qt Commercial License Agreements.
** See http://www.trolltech.com/qpl/ for QPL licensing information.
** See http://www.trolltech.com/gpl/ for GPL licensing information.
**
** Contact info@trolltech.com if any conditions of this licensing are
** not clear to you.
**
**********************************************************************/
/*! \page xml.html
\title XML Module
This module is part of the \link editions.html Qt Enterprise Edition \endlink.
\tableofcontents
\target overview
\section1 Overview of the XML architecture in Qt
The XML module provides a non-validating XML parser that checks documents for
well-formedness, using the SAX2 (Simple API for XML) interface, plus an
implementation of the DOM Level 2 (Document Object Model).
SAX is an event-based standard interface for XML parsers.
The Qt interface follows the design of the SAX2 Java implementation.
Its naming scheme was adapted to fit the Qt naming conventions.
Details on SAX2 can be found at
\link http://www.megginson.com/SAX/ http://www.megginson.com/SAX/ \endlink.
Support for SAX2 filters and the reader factory is under
development. Furthermore, the Qt implementation does not include the
SAX1 compatibility classes present in the Java interface.
For an introduction to Qt's SAX2 classes see
"\link #sax2 The Qt SAX2 classes \endlink".
A code example is discussed in the "\link xml-sax-walkthrough.html tagreader
walkthrough \endlink".
DOM Level 2 is a W3C Recommendation for XML interfaces that maps the
constituents of an XML document to a tree structure. Details and the
specification of DOM Level 2 can be found at
\link http://www.w3.org/DOM/ http://www.w3.org/DOM/ \endlink.
More information about the DOM classes in Qt is provided in the
\link #dom Qt DOM classes \endlink.
Qt provides the following XML related classes:
\list
\i \l QDomAttr -- Represents one attribute of a QDomElement
\i \l QDomCDATASection -- Represents an XML CDATA section
\i \l QDomCharacterData -- Represents a generic string in the DOM
\i \l QDomComment -- Represents an XML comment
\i \l QDomDocument -- The representation of an XML document
\i \l QDomDocumentFragment -- Tree of QDomNodes which is usually not a complete QDomDocument
\i \l QDomDocumentType -- The representation of the DTD in the document tree
\i \l QDomElement -- Represents one element in the DOM tree
\i \l QDomEntity -- Represents an XML entity
\i \l QDomEntityReference -- Represents an XML entity reference
\i \l QDomImplementation -- Information about the features of the DOM implementation
\i \l QDomNamedNodeMap -- Collection of nodes that can be accessed by name
\i \l QDomNode -- The base class for all nodes of the DOM tree
\i \l QDomNodeList -- List of QDomNode objects
\i \l QDomNotation -- Represents an XML notation
\i \l QDomProcessingInstruction -- Represents an XML processing instruction
\i \l QDomText -- Represents textual data in the parsed XML document
\i \l QXmlAttributes -- XML attributes
\i \l QXmlContentHandler -- Interface to report logical content of XML data
\i \l QXmlDeclHandler -- Interface to report declaration content of XML data
\i \l QXmlDefaultHandler -- Default implementation of all XML handler classes
\i \l QXmlDTDHandler -- Interface to report DTD content of XML data
\i \l QXmlEntityResolver -- Interface to resolve external entities contained in XML data
\i \l QXmlErrorHandler -- Interface to report errors in XML data
\i \l QXmlInputSource -- The input data for the QXmlReader subclasses
\i \l QXmlLexicalHandler -- Interface to report lexical content of XML data
\i \l QXmlLocator -- Provides the XML handler classes with information about the current parsing position
\i \l QXmlNamespaceSupport -- Helper class for XML readers which want to include namespace support
\i \l QXmlParseException -- Used to report errors with the QXmlErrorHandler interface
\i \l QXmlReader -- Interface for XML readers (i.e. for SAX2 parsers)
\i \l QXmlSimpleReader -- Implementation of a simple XML reader (a SAX2 parser)
\endlist
\target sax2
\section1 The Qt SAX2 classes
\target sax2Intro
\section2 Introduction to SAX2
The SAX2 interface is an event-driven mechanism to provide the user with
document information. "Event" in this context has nothing to do with the
term "event" you probably know from windowing systems; it means that the
parser reports certain document information while parsing the document.
This reported information is referred to as an "event".
To make this less abstract, consider the following example:
\code
<quote>To make it less abstract consider the following example:</quote>
\endcode
Whilst reading (a SAX2 parser is usually referred to as "reader")
the above document three events would be triggered:
\list 1
\i A start tag occurs (\c{<quote>}).
\i Character data (i.e. text) is found.
\i An end tag is parsed (\c{</quote>}).
\endlist
Each time such an event occurs the parser reports it so that
a suitable event handling routine can be invoked.
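The event sequence described above can be illustrated with a small,
self-contained C++ sketch. It deliberately does not use the Qt classes:
\c ToyHandler, \c toyParse() and \c EventLog are hypothetical names invented
for this illustration, and the toy reader only understands plain start tags,
end tags and text.

```cpp
#include <string>
#include <vector>

// Hypothetical handler interface (not a Qt class): one callback per event.
struct ToyHandler {
    virtual ~ToyHandler() {}
    virtual void startElement(const std::string &name) = 0;
    virtual void characters(const std::string &text) = 0;
    virtual void endElement(const std::string &name) = 0;
};

// Toy reader: scans the input once and reports each event as it is found.
// Assumption: only plain tags and text, no attributes, comments or entities.
inline void toyParse(const std::string &xml, ToyHandler &handler)
{
    std::string::size_type pos = 0;
    while (pos < xml.size()) {
        if (xml[pos] == '<') {
            const std::string::size_type close = xml.find('>', pos);
            if (close == std::string::npos)
                return; // unterminated tag: this sketch simply stops
            const std::string tag = xml.substr(pos + 1, close - pos - 1);
            if (!tag.empty() && tag[0] == '/')
                handler.endElement(tag.substr(1));
            else
                handler.startElement(tag);
            pos = close + 1;
        } else {
            std::string::size_type next = xml.find('<', pos);
            if (next == std::string::npos)
                next = xml.size();
            handler.characters(xml.substr(pos, next - pos));
            pos = next;
        }
    }
}

// Records the events in the order in which they are reported.
struct EventLog : ToyHandler {
    std::vector<std::string> events;
    void startElement(const std::string &name) { events.push_back("start:" + name); }
    void characters(const std::string &text) { events.push_back("chars:" + text); }
    void endElement(const std::string &name) { events.push_back("end:" + name); }
};
```

Passing a one-element document to \c toyParse() together with an \c EventLog
leaves three entries in the log, one per event. Nothing of the document is
kept by the reader once a callback has returned.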
Whilst this is a fast and simple approach to reading XML documents,
manipulation is difficult because data is not stored; it is simply handled
and discarded serially. This is where the \link #dom DOM interface
\endlink comes in handy.
The Qt XML module provides an
abstract class, \l QXmlReader, that defines the interface for potential
SAX2 readers.
At the moment Qt ships with one reader implementation, \l
QXmlSimpleReader.
The reader reports parsing events through special handler classes. In Qt
the following ones are available:
\list
\i \l QXmlContentHandler
reports events related to the content of a document (e.g. the start tag
or characters).
\i \l QXmlDTDHandler
reports events related to the DTD (e.g. notation declarations).
\i \l QXmlErrorHandler
reports errors or warnings that occurred during parsing.
\i \l QXmlEntityResolver
reports external entities during parsing and allows the user to resolve
external entities him- or herself instead of leaving it to the reader.
\i \l QXmlDeclHandler
reports further DTD related events (e.g. attribute declarations).
Usually users are not interested in them, but under certain circumstances
this class comes in handy.
\i \l QXmlLexicalHandler
reports events related to the lexical structure of the document
(the beginning of the DTD, comments etc.). Occasionally this
might be useful.
\endlist
These classes are abstract classes describing the interface. The
\l QXmlDefaultHandler class provides a "do nothing" default implementation for
all of them. Therefore users need to reimplement only the
QXmlDefaultHandler functions they are interested in.
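The pattern of an abstract interface plus a "do nothing" default
implementation can be sketched in plain C++ as follows. The classes
\c ToyContentHandler, \c ToyDefaultHandler and \c ElementCounter are
hypothetical stand-ins and do not reproduce the signatures of the Qt handler
classes; the \c bool return value meaning "continue parsing" is an assumption
of this sketch.

```cpp
#include <string>

// Hypothetical abstract interface (not a Qt class).
struct ToyContentHandler {
    virtual ~ToyContentHandler() {}
    virtual bool startElement(const std::string &name) = 0;
    virtual bool characters(const std::string &text) = 0;
    virtual bool endElement(const std::string &name) = 0;
};

// "Do nothing" default implementation: every callback just reports success.
struct ToyDefaultHandler : ToyContentHandler {
    bool startElement(const std::string &) { return true; }
    bool characters(const std::string &) { return true; }
    bool endElement(const std::string &) { return true; }
};

// A concrete handler reimplements only the callback it cares about.
struct ElementCounter : ToyDefaultHandler {
    int count;
    ElementCounter() : count(0) {}
    bool startElement(const std::string &) { ++count; return true; }
};
```

A reader that only knows the abstract interface can drive an
\c ElementCounter without the counter having to spell out the callbacks it
ignores.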
To read input XML data a special class \l QXmlInputSource is used.
Apart from the already mentioned ones the following SAX2 support classes
provide the user with useful functionality:
\list
\i \l QXmlAttributes
is used to pass attributes in a start element event.
\i \l QXmlLocator
is used to obtain the current parsing position of an event.
\i \l QXmlNamespaceSupport
is used to easily implement \link xml.html#namespaces namespace \endlink
support for a reader.
Note that namespaces do not change the parsing
behavior. They are only reported through the handler.
\endlist
\target sax2Features
\section2 Features
The behaviour of an XML reader depends on whether it supports certain
optional features or not.
As an example a reader can have the feature
"report attributes used for \link xml.html#namespaces namespace \endlink declarations
and prefixes along with the local name of a tag".
Like every other feature this has a unique name represented by a URI:
it is called \e http://xml.org/sax/features/namespace-prefixes.
The Qt SAX2 implementation allows you to find out whether the
reader has this ability using \l QXmlReader::hasFeature().
If the return value is TRUE it is possible to
turn the relevant feature on and off.
To do this use \l QXmlReader::setFeature(). Whether a supported feature
is on or off (TRUE or FALSE) can be queried using \l QXmlReader::feature().
Consider the example
\code
<document xmlns:book = 'http://trolltech.com/fnord/book/'
xmlns = 'http://trolltech.com/fnord/' >
\endcode
A reader not supporting the
\e http://xml.org/sax/features/namespace-prefixes feature would clearly
report the element name \e document but not its attributes \e xmlns:book and \e xmlns
with their values. A
reader with the feature \e http://xml.org/sax/features/namespace-prefixes
reports the namespace attributes if \l QXmlReader::feature() is TRUE and
disregards them if the feature is FALSE.
Other features include \e http://xml.org/sax/features/namespaces (namespace
processing, implies \e http://xml.org/sax/features/namespace-prefixes) or
\e http://xml.org/sax/features/validation (the ability to report validation
errors).
Whilst SAX2 leaves it to the user to define and implement whatever
features are required, support for \e http://xml.org/sax/features/namespaces
(and thus \e http://xml.org/sax/features/namespace-prefixes) is
mandatory. Accordingly \l QXmlSimpleReader, the implementation
of \l QXmlReader that comes with the Qt XML module, supports both of them,
and therefore can do namespace processing.
Being a non-validating parser \l QXmlSimpleReader
does not support \e http://xml.org/sax/features/validation
and other features.
\target sax2Namespaces
\section2 Namespace support via features
As we have seen in the \link #sax2Features previous section \endlink
we can configure the behavior of the reader when it comes to namespace
processing. This is done by setting and unsetting the
\e http://xml.org/sax/features/namespaces and
\e http://xml.org/sax/features/namespace-prefixes features.
They influence the reporting behavior in the following way:
\list 1
\i Namespace prefixes and local parts of elements and attributes can be
reported.
\i The qualified names of elements and attributes are reported.
\i \l QXmlContentHandler::startPrefixMapping() and \l
QXmlContentHandler::endPrefixMapping() are called by the reader.
\i Attributes that declare namespaces (i.e. the attribute \e xmlns and
attributes starting with \e xmlns: ) are reported.
\endlist
Consider the following element:
\code
<author xmlns:fnord = 'http://trolltech.com/fnord/'
title="Ms"
fnord:title="Goddess"
name="Eris Kallisti"/>
\endcode
With \e http://xml.org/sax/features/namespace-prefixes set to TRUE
the reader will report four attributes; with the \e namespace-prefixes
feature set to FALSE it will report only three: the \e xmlns:fnord attribute defining
a namespace is then not reported by the reader.
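The effect of the \e namespace-prefixes feature on the attribute list can be
mimicked with a few lines of plain C++. This is only a sketch of the rule
stated above, not of the reader's implementation; \c isNamespaceDeclaration()
and \c reportedAttributes() are hypothetical helper names.

```cpp
#include <string>
#include <vector>

// True for "xmlns" and for names starting with "xmlns:".
inline bool isNamespaceDeclaration(const std::string &qName)
{
    return qName == "xmlns" || qName.compare(0, 6, "xmlns:") == 0;
}

// Returns the attribute names that would be reported for one element,
// depending on the state of the namespace-prefixes feature.
inline std::vector<std::string> reportedAttributes(
    const std::vector<std::string> &qNames, bool namespacePrefixes)
{
    std::vector<std::string> reported;
    for (std::vector<std::string>::size_type i = 0; i < qNames.size(); ++i) {
        // Namespace declarations pass through only when the feature is on.
        if (namespacePrefixes || !isNamespaceDeclaration(qNames[i]))
            reported.push_back(qNames[i]);
    }
    return reported;
}
```

For the four attributes of the \e author element, the helper returns all four
names when the flag is \c true and drops \e xmlns:fnord when it is \c false.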
The \e http://xml.org/sax/features/namespaces feature on the other hand
is responsible for reporting local names, namespace prefixes and namespace URIs.
With \e http://xml.org/sax/features/namespaces set to TRUE
the parser will report \e title as the local name of the \e fnord:title
attribute, \e fnord as the namespace prefix and \e http://trolltech.com/fnord/
as the namespace URI.
When \e http://xml.org/sax/features/namespaces is FALSE none of them are
reported.
In the current implementation the Qt XML classes follow the definition
that the prefix \e xmlns itself isn't associated with any namespace at all
(see \link http://www.w3.org/TR/1999/REC-xml-names-19990114/#ns-using
http://www.w3.org/TR/1999/REC-xml-names-19990114/#ns-using \endlink).
Therefore even with \e http://xml.org/sax/features/namespaces and
\e http://xml.org/sax/features/namespace-prefixes both set to TRUE
the reader won't return a local name, a namespace prefix or
a namespace URI for \e xmlns:fnord.
This might be changed in the future following the W3C suggestion
\link http://www.w3.org/2000/xmlns/ http://www.w3.org/2000/xmlns/ \endlink
to associate \e xmlns with the namespace \e http://www.w3.org/2000/xmlns/.
As the SAX2 standard suggests \l QXmlSimpleReader by default has
\e http://xml.org/sax/features/namespaces set to TRUE and
\e http://xml.org/sax/features/namespace-prefixes set to FALSE.
When changing this behavior using \l QXmlSimpleReader::setFeature()
note that the combination of both features set to
FALSE is illegal.
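The interplay of querying, setting and default values can be sketched with a
small stand-in class in plain C++. \c ToyFeatures is a hypothetical class,
not part of Qt. In particular, rejecting a change that would switch both
features off is a policy assumed for this sketch; the text above only states
that the combination is illegal, not how a reader reacts to it.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a reader's feature handling (not a Qt class).
class ToyFeatures {
public:
    ToyFeatures()
    {
        // Defaults as described in the text above.
        values["http://xml.org/sax/features/namespaces"] = true;
        values["http://xml.org/sax/features/namespace-prefixes"] = false;
    }

    // Is the feature known at all?
    bool hasFeature(const std::string &name) const
    {
        return values.find(name) != values.end();
    }

    // Current state; unknown features are reported as off.
    bool feature(const std::string &name) const
    {
        std::map<std::string, bool>::const_iterator it = values.find(name);
        return it != values.end() && it->second;
    }

    // Returns false and changes nothing if the feature is unknown or if the
    // change would switch both namespace features off (assumed policy).
    bool setFeature(const std::string &name, bool value)
    {
        if (!hasFeature(name))
            return false;
        std::map<std::string, bool> updated = values;
        updated[name] = value;
        if (!updated["http://xml.org/sax/features/namespaces"]
            && !updated["http://xml.org/sax/features/namespace-prefixes"])
            return false;
        values = updated;
        return true;
    }

private:
    std::map<std::string, bool> values;
};
```

With this sketch, switching \e namespaces off succeeds only after
\e namespace-prefixes has been switched on.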
For a practical demonstration of how the two features affect the
output of the reader run the \link tagreader-with-features-example.html
tagreader with features example. \endlink
\target sax2NamespacesSummary
\section3 Summary
\l QXmlSimpleReader implements the following behavior:
\table
\header \i (namespaces, namespace-prefixes)
\i Namespace prefix and local part
\i Qualified names
\i Prefix mapping
\i xmlns attributes
\row \i (TRUE, FALSE) \i Yes \i Yes* \i Yes \i No
\row \i (TRUE, TRUE) \i Yes \i Yes \i Yes \i Yes
\row \i (FALSE, TRUE) \i No* \i Yes \i No* \i Yes
\row \i (FALSE, FALSE) \i41 Illegal
\endtable
For the entries marked with a "*", SAX does not require a particular
behavior.
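The table can also be read as a small decision function. The following
sketch encodes it in plain C++; \c Reporting and \c reportingFor() are
hypothetical names and not part of the Qt API. Entries marked with a "*" are
encoded as listed in the table, even though SAX does not require them.

```cpp
// What is reported for a given pair of feature settings.
struct Reporting {
    bool legal;              // false for the illegal (FALSE, FALSE) pair
    bool prefixAndLocalPart; // namespace prefix and local part
    bool qualifiedNames;     // qualified names
    bool prefixMapping;      // startPrefixMapping()/endPrefixMapping() calls
    bool xmlnsAttributes;    // attributes that declare namespaces
};

// Encodes the summary table row by row.
inline Reporting reportingFor(bool namespaces, bool namespacePrefixes)
{
    Reporting r;
    r.legal = namespaces || namespacePrefixes;
    r.prefixAndLocalPart = r.legal && namespaces;
    r.qualifiedNames = r.legal;
    r.prefixMapping = r.legal && namespaces;
    r.xmlnsAttributes = r.legal && namespacePrefixes;
    return r;
}
```

For the illegal combination all members are \c false, so callers are
expected to check \c legal first.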
\target sax2Properties
\section2 Properties
Properties are a more general concept. They also have a unique name,
represented as a URI, but their value is \c void*. Thus nearly everything
can be used as a property value. This concept involves some danger,
though: there is no means of ensuring type safety; the user must take care
to pass the correct type. Properties are useful if a reader supports
special handler classes.
\omit example! \endomit
The URIs used for features and properties often look like URLs, e.g.
\c http://xml.org/sax/features/namespaces. This does not mean that any
data needs to be available at this address. It is simply a way to define unique names.
Everybody can define and use new SAX2 properties for his or her
readers. Property support is however not
required.
To set or query properties the following functions are provided:
\l QXmlReader::setProperty(), \l QXmlReader::property() and \l
QXmlReader::hasProperty().
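Why untyped property values call for care can be seen in a small plain C++
sketch. \c ToyProperties is a hypothetical class that only mirrors the names
of the three functions above; it does not reproduce their Qt signatures, and
any property URI used with it is made up.

```cpp
#include <map>
#include <string>

// Hypothetical property store (not a Qt class): names map to untyped pointers.
class ToyProperties {
public:
    bool hasProperty(const std::string &name) const
    {
        return values.find(name) != values.end();
    }

    // The store keeps only the pointer; the pointee must outlive its use.
    void setProperty(const std::string &name, void *value)
    {
        values[name] = value;
    }

    // Returns a null pointer for unknown properties. The caller has to cast
    // the result back to the type that was originally stored; nothing here
    // can verify that the cast is correct.
    void *property(const std::string &name) const
    {
        std::map<std::string, void *>::const_iterator it = values.find(name);
        if (it == values.end())
            return 0;
        return it->second;
    }

private:
    std::map<std::string, void *> values;
};
```

A caller that stores the address of an \c int and later casts the result of
\c property() to \c int* gets the value back; casting it to any other type
would compile just as well, which is exactly the danger described above.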
\target sax2Reading
\section2 Further reading
For a practical example on how to use the Qt SAX2 classes see the
\link xml-sax-walkthrough.html tagreader walkthrough. \endlink
More information about XML (e.g. \link xml.html#namespaces namespaces \endlink)
can be found in the \link xml.html introduction to the Qt XML module. \endlink
\target dom
\section1 The Qt DOM classes
\target domIntro
\section2 Introduction to DOM
DOM provides an interface to access and change the content and structure of
an XML file. It makes a hierarchical view of the document (tree)
available with the root element of the XML file serving as its root.
Thus -- in contrast to the SAX2 interface -- an object model of the document
is resident in memory after parsing which makes manipulation easy.
In the Qt implementation of the DOM all
nodes in the document tree are subclasses of \l QDomNode.
The document itself is represented as a \l QDomDocument object.
Here are the available node classes and their potential child classes:
\list
\i \l QDomDocument: Possible children are
\list
\i \l QDomElement (at most one)
\i \l QDomProcessingInstruction
\i \l QDomComment
\i \l QDomDocumentType
\endlist
\i \l QDomDocumentFragment: Possible children are
\list
\i \l QDomElement
\i \l QDomProcessingInstruction
\i \l QDomComment
\i \l QDomText
\i \l QDomCDATASection
\i \l QDomEntityReference
\endlist
\i \l QDomDocumentType: No children
\i \l QDomEntityReference: Possible children are
\list
\i \l QDomElement
\i \l QDomProcessingInstruction
\i \l QDomComment
\i \l QDomText
\i \l QDomCDATASection
\i \l QDomEntityReference
\endlist
\i \l QDomElement: Possible children are
\list
\i \l QDomElement
\i \l QDomText
\i \l QDomComment
\i \l QDomProcessingInstruction
\i \l QDomCDATASection
\i \l QDomEntityReference
\endlist
\i \l QDomAttr: Possible children are
\list
\i \l QDomText
\i \l QDomEntityReference
\endlist
\i \l QDomProcessingInstruction: No children
\i \l QDomComment: No children
\i \l QDomText: No children
\i \l QDomCDATASection: No children
\i \l QDomEntity: Possible children are
\list
\i \l QDomElement
\i \l QDomProcessingInstruction
\i \l QDomComment
\i \l QDomText
\i \l QDomCDATASection
\i \l QDomEntityReference
\endlist
\i \l QDomNotation: No children
\endlist
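The parent and child relationships listed above can be condensed into a
single lookup function. The sketch below is plain C++ and independent of the
Qt DOM classes; the enum \c ToyNodeType and the function \c mayContain() are
hypothetical names. The "at most one" restriction on the element child of a
document is not modeled.

```cpp
// One enumerator per node class from the list above (hypothetical enum).
enum ToyNodeType {
    ToyDocument, ToyDocumentFragment, ToyDocumentType, ToyEntityReference,
    ToyElement, ToyAttr, ToyProcessingInstruction, ToyComment, ToyText,
    ToyCDATASection, ToyEntity, ToyNotation
};

// Returns true if a node of type parent may have a child of type child.
inline bool mayContain(ToyNodeType parent, ToyNodeType child)
{
    switch (parent) {
    case ToyDocument:
        return child == ToyElement || child == ToyProcessingInstruction
            || child == ToyComment || child == ToyDocumentType;
    case ToyDocumentFragment:
    case ToyEntityReference:
    case ToyElement:
    case ToyEntity:
        return child == ToyElement || child == ToyProcessingInstruction
            || child == ToyComment || child == ToyText
            || child == ToyCDATASection || child == ToyEntityReference;
    case ToyAttr:
        return child == ToyText || child == ToyEntityReference;
    default:
        return false; // the remaining node types have no children
    }
}
```

For example, the function accepts a text node below an element but rejects
one directly below the document.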
With \l QDomNodeList and \l QDomNamedNodeMap two collection classes
are provided: \l QDomNodeList is a list of nodes
whereas \l QDomNamedNodeMap is used to handle unordered sets of nodes
(often used for attributes).
The \l QDomImplementation class allows the user to query features of the
DOM implementation.
\section2 Further reading
To get started please refer to the \l QDomDocument documentation that
describes basic usage.
More information about Qt and XML can be found in the \link xml.html
Introduction to the Qt XML module. \endlink
\target namespaces
\section1 An introduction to namespaces
Parts of the Qt XML module documentation assume that you are
familiar with XML namespaces. Here we present a brief introduction;
skip to \link #namespacesConventions Qt XML
documentation conventions \endlink if you know this material.
Namespaces are a concept introduced into XML to allow a more modular design.
With their help data processing software can easily
resolve naming conflicts in XML documents.
Consider the following example:
\code
<document>
<book>
<title>Practical XML</title>
<author title="Ms" name="Eris Kallisti"/>
<chapter>
<title>A Namespace Called fnord</title>
</chapter>
</book>
</document>
\endcode
Here we find three different uses of the name \e title. If you wish
to process this document you will encounter problems
because each of the \e titles should be displayed in a different manner --
even though they have the same name.
The solution would be to have some means of identifying the
first occurrence of \e title as the title of a book, i.e.
to use the \e title element of a
book namespace to distinguish it from for example the chapter title, e.g.:
\code
<book:title>Practical XML</book:title>
\endcode
\e book in this case is
a \e prefix denoting the namespace.
Before we can apply a
namespace to element or attribute names we must declare it.
Namespaces are URIs like \e http://trolltech.com/fnord/book/.
This does not mean that data must be available at this
address; the URI is simply used to provide a unique name.
We declare namespaces in the same way as
attributes; strictly speaking they \e are attributes.
To make for example \e http://trolltech.com/fnord/ the document's
default XML namespace \e xmlns we write
\code
xmlns="http://trolltech.com/fnord/"
\endcode
To distinguish the \e http://trolltech.com/fnord/book/ namespace
from the default, we have to supply it with a prefix:
\code
xmlns:book="http://trolltech.com/fnord/book/"
\endcode
A namespace that is declared like this can be applied
to element and attribute names by prepending the appropriate
prefix and a ":" delimiter. We have already seen this with
the \e book:title element.
Element names without a prefix belong to the default namespace.
This rule does not apply to attributes: an attribute
without a prefix does not belong to any of the declared
XML namespaces at all.
Attributes always belong to the "traditional" namespace
of the element in which they appear. A "traditional" namespace
is not an XML namespace, it simply means that all attribute names
belonging to one element must be different. Later we will see how
to assign an XML namespace to an attribute.
Due to the fact that attributes without prefixes are not in any
XML namespace there is
no collision between the attribute \e title (that belongs to the
\e author element) and for example the \e title element within a \e chapter.
Let's clarify matters with an example:
\code
<document xmlns:book = 'http://trolltech.com/fnord/book/'
xmlns = 'http://trolltech.com/fnord/' >
<book>
<book:title>Practical XML</book:title>
<book:author xmlns:fnord = 'http://trolltech.com/fnord/'
title="Ms"
fnord:title="Goddess"
name="Eris Kallisti"/>
<chapter>
<title>A Namespace Called fnord</title>
</chapter>
</book>
</document>
\endcode
Within the \e document element we have two namespaces declared.
The default namespace \e http://trolltech.com/fnord/
applies to the \e book element, the \e chapter element,
the appropriate \e title element and of course to \e document itself.
The \e book:author and \e book:title elements
belong to the namespace with the
URI \e http://trolltech.com/fnord/book/.
The two \e book:author attributes \e title and \e name have no XML namespace
assigned.
They are only members of the "traditional" namespace of the element
\e book:author, meaning that for example two \e title attributes
in \e book:author are forbidden.
In the above example we circumvent the last rule by adding a \e title
attribute from the \e http://trolltech.com/fnord/ namespace to \e book:author:
the \e fnord:title comes from the namespace with the prefix \e fnord
that is declared in the \e book:author element.
Clearly the \e fnord namespace has the same namespace URI as the
default namespace. So why didn't we simply use the
default namespace we'd already declared? The answer is quite complex:
\list
\i attributes without a prefix don't belong to any XML namespace at all,
not even to the default namespace;
\i additionally omitting the prefix would lead to a \e title-title clash;
\i writing it as \e xmlns:title would declare a new namespace with
the prefix \e title instead of applying the default \e xmlns namespace.
\endlist
With the Qt XML classes elements and attributes can be accessed in two ways: either
by referring to their qualified names consisting of the namespace prefix
and the "real" name (or \e local name) or
by the combination of local name and namespace URI.
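The namespace rules described in this section, including the special
treatment of attributes without a prefix, can be sketched in plain C++ as
follows. \c ExpandedName and \c expandName() are hypothetical names and not
part of the Qt XML classes. As a simplification, an undeclared prefix yields
an empty namespace URI here instead of an error.

```cpp
#include <map>
#include <string>

// A name as the pair (namespace URI, local name).
struct ExpandedName {
    std::string namespaceUri; // empty if the name is in no XML namespace
    std::string localName;
};

// prefixes maps each declared prefix to its namespace URI; the empty
// prefix "" stands for the default namespace.
inline ExpandedName expandName(const std::string &qName, bool isAttribute,
                               const std::map<std::string, std::string> &prefixes)
{
    ExpandedName result;
    std::string prefix;
    const std::string::size_type colon = qName.find(':');
    if (colon == std::string::npos) {
        result.localName = qName;
    } else {
        prefix = qName.substr(0, colon);
        result.localName = qName.substr(colon + 1);
    }
    // Attributes without a prefix are in no XML namespace at all; the
    // default namespace applies to element names only.
    if (prefix.empty() && isAttribute)
        return result;
    const std::map<std::string, std::string>::const_iterator it = prefixes.find(prefix);
    if (it != prefixes.end())
        result.namespaceUri = it->second;
    return result;
}
```

With the three declarations of the example document, \e chapter expands to
the default namespace, \e book:title to the book namespace, the plain
\e title attribute to no namespace, and \e fnord:title to
\e http://trolltech.com/fnord/.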
More information on XML namespaces can be found at
\l http://www.w3.org/TR/REC-xml-names/.
\target namespacesConventions
\section2 Conventions used in Qt XML documentation
The following terms are used to distinguish the parts of names within the context of
namespaces:
\list
\i The \e {qualified name}
is the name as it appears in the document. (In the above example \e
book:title is a qualified name.)
\i A \e {namespace prefix} in a qualified name
is the part to the left of the ":". (\e book is the namespace prefix in
\e book:title.)
\i The \e {local part} of a name (also referred to as the \e {local name}) appears
to the right of the ":".
(Thus \e title is the local part of \e book:title.)
\i The \e {namespace URI} ("Uniform Resource Identifier") is a unique
identifier for a namespace. It looks like a URL
(e.g. \e http://trolltech.com/fnord/ ) but does not require
data to be accessible by the given protocol at the named address.
\endlist
Elements without a ":" (like \e chapter in the example) do not have a
namespace prefix. In this case the local part and the qualified name
are identical (i.e. \e chapter).
*/