1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735
|
<appendix id="app-xml">
<?dbhtml filename="appb.html"?>
<appendixinfo>
<pubdate>$Date$</pubdate>
<releaseinfo>$Revision$</releaseinfo>
</appendixinfo>
<title>DocBook and &XML;</title>
<para>
<indexterm id="xmldocbookappb" class='startofrange'><primary>DocBook DTD</primary>
<secondary>XML</secondary></indexterm>
<indexterm id="docbookxmlappa" class="startofrange"><primary>XML</primary>
<secondary>DocBook and</secondary></indexterm>
<indexterm><primary>SGML</primary>
<secondary>XML and</secondary></indexterm>
<indexterm><primary>XML</primary>
<secondary>SGML, processing</secondary></indexterm>
&XML;, the <ulink url="http://www.w3.org/TR/REC-xml">Extensible
Markup Language</ulink>, is a simple dialect of &SGML;. In the words of the
&XML; specification, “the goal [of &XML;] is to enable generic &SGML; to be
served, received, and processed on the Web in the way that is now possible
with &HTML;.”</para>
<para>&XML; raises two issues with respect to DocBook:<itemizedlist>
<listitem><para>Are DocBook &SGML; instances valid &XML; instances?</para>
</listitem>
<listitem><para>Can the DocBook &DTD; be made into a valid &XML; &DTD;?</para>
</listitem>
</itemizedlist></para>
<para>If you have an existing &SGML; system, and your primary goal is
to serve DocBook documents over the Web as &XML;, only the first of
these issues is relevant. As the popularity of &XML; grows, we will
see more and more &XML;-aware tools that don't implement full
<acronym>ISO</acronym> 8879 &SGML;. If your goal is to author DocBook
documents with one of this new generation of tools, you will only be
able to achieve validity with an &XML; DocBook &DTD;.</para>
<para>
<indexterm><primary>OASIS</primary>
<secondary>XML DocBook version</secondary></indexterm>
Although not yet officially adopted by the <acronym>OASIS</acronym> DocBook Technical
Committee, an &XML; version of DocBook is available now and
provided on the <acronym>CD-ROM</acronym>.
</para>
<sect1>
<title>DocBook Instances as &XML;</title>
<para>
<indexterm><primary>DocBook DTD</primary>
<secondary>instances, converting to XML</secondary></indexterm>
<indexterm><primary>XML</primary>
<secondary>DocBook instances, converting to</secondary></indexterm>
Most DocBook documents can be made into well-formed &XML; documents very
easily. With few exceptions, valid DocBook &SGML; instances are also well-formed
&XML; instances. The following areas may need to be addressed.</para>
<sect2><title>System Identifiers</title>
<para>
<indexterm><primary>system identifiers</primary>
<secondary>SGML</secondary></indexterm>
<indexterm><primary>public identifiers</primary>
<secondary>SGML</secondary></indexterm>
<indexterm><primary>parameter entities</primary>
<secondary>SGML declarations</secondary></indexterm>
<indexterm><primary>declarations</primary>
<secondary>document type and parameter entity (SGML)</secondary></indexterm>
It is common for &SGML; instances to use only a public identifier in document
type and parameter entity declarations:</para>
<programlisting><!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook V3.1//EN">
<chapter><title>Chapter Title</title>
<para>
This <emphasis>paragraph</paragraph> is important.
</para>
</chapter></programlisting>
<para>&XML; requires a system identifier:
<programlisting>
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">
<chapter><title>Chapter Title</title>
<para>
This <emphasis>paragraph</paragraph> is important.
</para>
</chapter></programlisting></para>
<para>
<indexterm><primary>catalog files</primary>
<secondary>system identifiers, resolving</secondary></indexterm>
<indexterm><primary>URN</primary>
<secondary>XML system identifiers, future</secondary></indexterm>
<indexterm><primary>public identifiers</primary>
<secondary>system identifiers, overriding</secondary></indexterm>
If you're used to using catalog files to resolve system identifiers,
you may be dismayed to learn that system identifiers are required. Because most
tools favor system identifiers over public identifiers, all of the portability
that was gained by the use of catalog files seems to have been lost. In the
long run, it'll be regained by the fact that &XML; system identifiers can be
<acronym>URN</acronym>s, which will have a resolution scheme like catalogs, but what about the
short run?</para>
<para>Luckily, there are a couple of options. First, you can tell your tools to use the public identifiers even
though system identifiers are present. Simply add:</para>
<screen>OVERRIDE YES</screen>
<para>
<indexterm><primary>system identifiers</primary>
<secondary>remapping with SYSTEM catalog directive</secondary></indexterm>
to your catalog files. Alternatively, you can remap system identifers
with the <literal>SYSTEM</literal> catalog directive. If you are faced with
documents that don't use public identifiers at all, this is probably your
only option.
</para>
</sect2>
<sect2><title>Minimization</title>
<para>
<indexterm><primary>markup</primary>
<secondary>minimization</secondary>
<tertiary>SGML/XML conversion problems</tertiary></indexterm>
<indexterm><primary>minimization</primary>
<secondary>markup</secondary>
<tertiary>SGML/XML conversion problems</tertiary></indexterm>
If you have used &SGML; minimization features in your instances:</para>
<programlisting><!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook V3.1//EN">
<chapter id=<co id="xml-attrquote"/>chap1><title>Chapter Title</title>
<para>
This <emphasis>paragraph<co id="xml-endtag"/></> is important.
</para>
</chapter></programlisting>
<para>they will not be well-formed &XML; instances. In particular, &XML;<calloutlist>
<callout arearefs="xml-attrquote"><para>
<indexterm><primary>quotes</primary>
<secondary>attribute values</secondary></indexterm>
<indexterm><primary>attributes</primary>
<secondary>values</secondary>
<tertiary>quoting</tertiary></indexterm>
Requires that all attribute values
be quoted.</para>
</callout>
<callout arearefs="xml-endtag"><para>Does not allow short tag minization.
</para>
</callout></calloutlist>
&XML; also forbids tag omission, and there are
probably a half dozen or so more exotic
examples of minimization that you have used. They're all illegal. The
easiest way to remove these minimizations is probably with a tool like <command>
sgmlnorm</command> (included in the <acronym>SP</acronym> and Jade distributions, on
the <link linkend="app-cdrom"><acronym>CD-ROM</acronym></link>).</para>
<para>The result will be something like this:</para>
<programlisting><?xml version='1.0'?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">
<chapter id="chap1"><title>Chapter Title</title>
<para>
This <emphasis>paragraph</emphasis> is important.
</para>
</chapter></programlisting>
</sect2>
<sect2><title>Attribute Default Values</title>
<para>
<indexterm><primary>attributes</primary>
<secondary>default values</secondary></indexterm>
Correct processing of this document may require access to the default
attributes:</para>
<programlisting><!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook V3.1//EN">
<chapter><title>Chapter Title</title>
<para>
Write to us at:
<address<co id="xml-defattr"/>>
90 Sherman Street
Cambridge, MA 02140
</address>
</para>
</chapter></programlisting>
<calloutlist>
<callout arearefs="xml-defattr"><para><sgmltag>Address</sgmltag> expresses
that its content is line-specific with an attribute.</para>
</callout></calloutlist>
<para>Some &XML; processing environments are going to ignore the doctype declaration
in your document, even if it's present. This is relevant when your instance
uses elements that have attributes with default values. The default values
are expressed in the &DTD;, but may not be expressed in your instance. In the
case of DocBook, there are relatively few of these, and your stylesheet can
probably be constructed to do the right thing in either case. (It essentially
treats the attributes as if they had implied values.)</para>
<para>The result will be something like this:</para>
<programlisting><?xml version='1.0'?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">
<chapter><title>Chapter Title</title>
<para>
Write to us at:
<address format="linespecific">
90 Sherman Street
Cambridge, MA 02140
</address>
</para>
</chapter></programlisting>
</sect2>
<sect2><title>Character and <literal>SDATA</literal> Entities</title>
<programlisting><!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook V3.1//EN">
<chapter><title>Chapter Title</title>
<para>
This book was published by O'Reilly<co id="xml-sdata"/>&trade;.
</para>
</chapter></programlisting>
<calloutlist>
<callout arearefs="xml-sdata"><para>
<indexterm><primary>characters</primary>
<secondary>entities</secondary></indexterm>
<indexterm><primary>SDATA entities</primary></indexterm>
<indexterm><primary>entities</primary>
<secondary>characters</secondary></indexterm>
<indexterm><primary>entities</primary>
<secondary>SDATA</secondary></indexterm>
<indexterm><primary>XML</primary>
<secondary>SDATA entities, not allowing</secondary></indexterm>
<indexterm><primary>ISO standards</primary>
<secondary>entity sets</secondary>
<tertiary>SDATA entities, problems with (XML)</tertiary></indexterm>
<indexterm><primary>Unicode character set</primary>
<secondary>ISO standard entity sets and</secondary></indexterm>
The DocBook &DTD; defines all of the standard <acronym>ISO</acronym>
entities automatically, but the <acronym>ISO</acronym> definitions use
<literal>SDATA</literal>, which is not allowed in &XML;. Eventually,
<acronym>ISO</acronym> (or someone else) will release official
<acronym>ISO</acronym> standard entity sets that make reference to the
appropriate Unicode character for each entity. Until then, the &XML;
version of DocBook is
distributed with an unofficial set.</para>
<para>
<indexterm><primary>internal subset</primary>
<secondary>entity declarations</secondary></indexterm>
<indexterm><primary>external subset</primary>
<secondary>entity declarations (SGML/XML conversion)</secondary></indexterm>
If you use entities in your document, it may be wise to put declarations
for them in the internal subset of each instance, because some
&XML; browsers are going to parse the internal subset but not the external subset.
If the entity declarations are in your &DTD;, and the browser does not parse
the external subset, the browser won't know how to display the entities in
your document.</para>
</callout></calloutlist>
<para>The result will be something like this:</para>
<programlisting><?xml version='1.0'?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" [
<!ENTITY trade "&#x2122;">
<chapter><title>Chapter Title</title>
<para>
This book was published by O'Reilly&trade;.
</para>
</chapter></programlisting>
</sect2>
<sect2><title>Case-Sensitivity</title>
<programlisting><co id="xml-nmcasekey"/><!DocType Book PUBLIC "-//OASIS//DTD DocBook V3.1//EN">
<co id="xml-namecase1"/><book><title>Book Title</title>
<chapter><title>Chapter Title<co id="xml-namecase2"/></Title>
<para>
Paragraph test.
</para>
<co id="xml-wf1"/><PARA>
A second paragraph.
</PARA>
</chapter>
</book></programlisting>
<para>
<indexterm><primary>case sensitivity</primary>
<secondary>DocBook SGML declaration</secondary></indexterm>
<indexterm><primary>elements</primary>
<secondary>case sensitivity (DocBook)</secondary></indexterm>
<indexterm><primary>attributes</primary>
<secondary>case sensitivity (DocBook)</secondary></indexterm>
<indexterm><primary>XML</primary>
<secondary>case sensitivity</secondary></indexterm>
With the standard DocBook &SGML; declaration, DocBook instances are not
case-sensitive with respect to element and attribute names. &XML; is always
case-sensitive. As long as you have used the same case consistently, your
&XML; instances will be well-formed, but it may still be advantageous to do some
case-folding because it will simplify the construction of stylesheets.</para>
<calloutlist>
<callout arearefs="xml-nmcasekey"><para>Keywords in &XML; are case-sensitive,
and must be in uppercase.
<indexterm><primary>keywords</primary>
<secondary>case sensitivity, XML</secondary></indexterm>
</para>
</callout>
<callout arearefs="xml-namecase1"><para>The name declared in the document
type declaration, like all other names, is case-sensitive.
<indexterm><primary>names</primary>
<secondary>case sensitivity</secondary></indexterm>
</para>
</callout>
<callout arearefs="xml-namecase2"><para>Start and end tags must use the same
case.
<indexterm><primary>start tags</primary>
<secondary>case sensitivity</secondary></indexterm>
<indexterm><primary>end tags</primary>
<secondary>case sensitivity</secondary></indexterm>
</para>
</callout>
<callout arearefs="xml-wf1"><para>In &XML;, <sgmltag>Para</sgmltag> is not the
same as <sgmltag>PARA</sgmltag>. Note that this is a validity error (against
the &XML; version of DocBook), but it is not an &XML; well-formedness error. The use of <sgmltag>
para</sgmltag> and <sgmltag>PARA</sgmltag> as distinct names is as legitimate
as using <sgmltag>foo</sgmltag> and <sgmltag>bar</sgmltag>, as long as they
are properly nested.
<indexterm><primary>Para element</primary>
<secondary>PARA vs. (XML)</secondary></indexterm>
</para>
</callout></calloutlist>
<para>The result will be something like this:</para>
<programlisting><?xml version='1.0'?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">
<book><title>Book Title</title>
<chapter><title>Chapter Title</title>
<para>
Paragraph test.
</para>
<para>
A second paragraph.
</para>
</chapter>
</book></programlisting>
</sect2>
<sect2><title>No #CONREF Attributes</title>
<programlisting><!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook V3.1//EN">
<chapter><title>Chapter Title</title>
<indexterm id="idx-bor"><primary>Something</primary></indexterm><co
id="xml-conrefstart"/>
<para>
Paragraph test.
</para>
<indexterm startref="idx-bor"><co id="xml-conref"/>
</chapter></programlisting>
<para>
<indexterm><primary>#CONREF attributes</primary></indexterm>
<indexterm><primary>Startref attribute</primary></indexterm>
<indexterm><primary>IndexTerm element</primary></indexterm>
<indexterm><primary>OtherTerm attribute</primary></indexterm>
<indexterm><primary>GlossSee element</primary></indexterm>
<indexterm><primary>GlossSeeAlso element</primary></indexterm>
<indexterm><primary>empty tags</primary>
<secondary>#CONREF attributes</secondary></indexterm>
The <sgmltag class="attribute">StartRef</sgmltag> attribute on <sgmltag>
indexterm</sgmltag> and the <sgmltag class="attribute">OtherTerm</sgmltag>
attribute on <sgmltag>GlossSee</sgmltag> and <sgmltag>GlossSeeAlso</sgmltag>
are <literal>#CONREF</literal> attributes.</para>
<para>In &SGML; terms, this means that when these attributes are used, the content
of the tag is taken to be the same as the content of the tag pointed to by
the attribute. <calloutlist>
<callout arearefs="xml-conrefstart xml-conref"><para>If you
have used these attributes, your instance will contain both empty and non-empty
versions of these tags.</para>
</callout></calloutlist></para>
<para>Your best bet is to transform the <literal>#CONREF</literal>
version into an empty tag and let your stylesheet deal with it appropriately.
</para>
<para>The result will be something like this:</para>
<programlisting><?xml version='1.0'?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">
<chapter><title>Chapter Title</title>
<indexterm id="idx-bor"><primary>Something</primary></indexterm>
<para>
Paragraph test.
</para>
<indexterm startref="idx-bor"/>
</chapter></programlisting>
</sect2>
<sect2><title>Only Explicit CDATA-Marked Sections Are Allowed</title>
<programlisting><!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook V3.1//EN" [
<!ENTITY % draft "IGNORE">
<!ENTITY % sourcecode "CDATA">
]>
<chapter><title>Chapter Title</title>
<co id="xml-ignore"/><![ %draft; [
<para>
Draft paragraph.
</para>
]]>
<para>
The following code is totally out of context:
<programlisting>
<![ <co id="xml-cdata"/>%sourcecode; [
if (x < 3) {
y = 3;
}
]]>
</programlisting>
</chapter></programlisting>
<calloutlist>
<callout arearefs="xml-ignore xml-cdata"><para>
<indexterm><primary>parameter entities</primary>
<secondary>XML document body</secondary></indexterm>
<indexterm><primary>XML</primary>
<secondary>parameter entities</secondary></indexterm>
<indexterm><primary>internal subset</primary>
<secondary>parameter entities (XML)</secondary></indexterm>
Parameter entities are not
allowed in the body of &XML; documents (they are allowed in the internal subset).
</para>
</callout>
<callout arearefs="xml-ignore"><para>&XML; instances cannot contain <literal>
IGNORE</literal>, <literal>INCLUDE</literal>, <literal>TEMP</literal>, or <literal>
RCDATA</literal> marked sections.
<indexterm><primary>marked sections</primary>
<secondary>XML, restrictions</secondary></indexterm>
<indexterm><primary>IGNORE keyword (marked section)</primary></indexterm>
<indexterm><primary>INCLUDE keyword (marked section)</primary>
<secondary>XML, not allowing</secondary></indexterm>
<indexterm><primary>TEMP marked section (XML)</primary></indexterm>
<indexterm><primary>RCDATA</primary></indexterm>
</para>
</callout>
<callout arearefs="xml-cdata"><para><literal>CDATA</literal> marked sections
must use the “<literal>CDATA</literal>” keyword literally because
parameter entities are not allowed.
<indexterm><primary>CDATA</primary>
<secondary>marked sections</secondary></indexterm>
</para>
</callout></calloutlist>
<para>The result will be something like this:</para>
<programlisting><?xml version='1.0'?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">
<chapter><title>Chapter Title</title>
<para>
The following code is totally out of context:
<programlisting>
<![CDATA[
if (x < 3) {
y = 3;
}
]]>
</programlisting>
</chapter></programlisting>
</sect2>
<sect2><title>No SUBDOC or CDATA External Entities</title>
<programlisting><!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook V3.1//EN" [
<!ENTITY % sourcecode SYSTEM "program.c" CDATA>
]>
<chapter><title>Chapter Title</title>
<para>
The following code is totally out of context:
<programlisting>
&sourcecode;
</programlisting>
</chapter></programlisting>
<para>
<indexterm><primary>external general entities</primary>
<secondary>XML restrictions</secondary></indexterm>
<indexterm><primary>XML</primary>
<secondary>external entities, restrictions</secondary></indexterm>
<indexterm><primary>CDATA</primary>
<secondary>XML instances, restrictions</secondary></indexterm>
&XML; instances cannot use <literal>CDATA</literal> or <literal>SUBDOC
</literal> external entities. One option for integrating external <literal>
CDATA</literal> content into a document is to employ a pre-processing pass
that inserts the content inline, wrapped in a <literal>CDATA</literal> marked
section.</para>
<para>
<indexterm><primary>SUBDOC entities</primary></indexterm>
<indexterm><primary>namespaces</primary></indexterm>
<literal>SUBDOC</literal> entities may be more problematic. If you do
not require validation, it may be sufficient to simply put them inline. &XML;
namespaces may offer another possible solution.</para>
<para>The result will be something like this:</para>
<programlisting><?xml version='1.0'?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">
<chapter><title>Chapter Title</title>
<para>
The following code is totally out of context:
<programlisting>
<![CDATA[
int main () {
..
}
]]>
</programlisting>
</chapter></programlisting>
</sect2>
<sect2><title>No Data Attributes on Notations</title>
<para>They're not allowed in &XML;, so don't add any.
<indexterm><primary>data attributes, notations (XML prohibiting)</primary></indexterm>
</para>
</sect2>
<sect2><title>No Attribute Value Specifications on<?lb?>Entity Declarations</title>
<para>
<indexterm><primary>attributes</primary>
<secondary>values</secondary>
<tertiary>specifying (entity declarations)</tertiary></indexterm>
<indexterm><primary>declarations</primary>
<secondary>entities</secondary>
<tertiary>attribute values, prohibiting (XML)</tertiary></indexterm>
<indexterm><primary>entities</primary>
<secondary>declarations, attribute values (XML)</secondary></indexterm>
They're not allowed in &XML;, so don't add any.</para>
</sect2>
</sect1>
<sect1 id="s-docbookxml">
<title>The DocBook &DTD; as &XML;</title>
<indexterm><primary>DocBook DTD</primary>
<secondary>XML</secondary>
<tertiary>converting to</tertiary></indexterm>
<indexterm><primary>XML</primary>
<secondary>DocBook DTD, converting to</secondary></indexterm>
<para>Converting the DocBook &DTD; to &XML; is much more challenging
than converting the instances. It is probably not possible to
construct an &XML; &DTD; that is identical to the validation power
of DocBook. The list below identifies most of the issues that
must be addressed, and describes how the DocBook &XML; &DTD;; deals with
them:</para>
<variablelist>
<varlistentry><term>Comments are not allowed inside markup declarations</term>
<listitem>
<para>
<indexterm><primary>comments</primary>
<secondary>markup declarations (DocBook XML)</secondary></indexterm>
<indexterm><primary>declarations</primary>
<secondary>comment declarations</secondary></indexterm>
Most of them have been moved to comment declarations preceding the markup
declaration that used to contain them. A few small, inline comments that seemed
like they would be out of context if moved before the declaration were simply
deleted.</para>
</listitem>
</varlistentry>
<varlistentry><term>Name groups are not allowed in element or attribute list
declarations</term>
<listitem>
<para>
<indexterm><primary>name groups (DocBook XML)</primary></indexterm>
<indexterm><primary>elements</primary>
<secondary>declarations</secondary>
<tertiary>name groups, prohibiting</tertiary></indexterm>
<indexterm><primary>attributes</primary>
<secondary>declarations</secondary>
<tertiary>name groups, prohibiting</tertiary></indexterm>
The small number of places in which DocBook uses name groups have
been expanded.</para>
<para>There's one downside: DocBook uses <literal>%admon.class;</literal> in a name
group to define the content model, and attribute lists for elements in the
admonitions class. In DocBook XML, this convenience cannot be expressed. If additional
admonitions are added, the element and attribute list declarations will have
to be copied for them.</para>
</listitem>
</varlistentry>
<varlistentry><term>No <literal>CDATA</literal> or <literal>RCDATA</literal>
declared content</term>
<listitem>
<para>
<indexterm><primary>CDATA</primary>
<secondary>declared content, prohibiting</secondary></indexterm>
<indexterm><primary>RCDATA</primary></indexterm>
<sgmltag>Graphic</sgmltag> and <sgmltag>InlineGraphic</sgmltag> have
been made <literal>EMPTY</literal>. The content model for <sgmltag>SynopFragmentRef
</sgmltag>, the only <literal>RCDATA</literal> element in DocBook, has been
changed to <literal>(arg | group)+</literal>.</para>
</listitem>
</varlistentry>
<varlistentry><term>No exclusions or inclusions on element declarations</term>
<listitem>
<para>
<indexterm><primary>inclusions</primary>
<secondary>element declarations, prohibiting (DocBook XML)</secondary></indexterm>
<indexterm><primary>exclusions</primary>
<secondary>element declarations, prohibiting (DocBook XML)</secondary></indexterm>
They had to be removed.</para>
<para>
<indexterm><primary>exclusions</primary>
<secondary>DocBook, uses</secondary></indexterm>
In DocBook, exclusions are used to exclude the following:<itemizedlist>
<listitem><para>Ubiquitous elements (<sgmltag>indexterm</sgmltag>
and <sgmltag>BeginPage</sgmltag>) from a number of contexts in which they
should not occur (such as metadata, for example).</para>
</listitem>
<listitem><para>
<indexterm><primary>formal objects, exclusions (DocBook)</primary></indexterm>
Formal objects from <sgmltag>Highlights</sgmltag>, <sgmltag>
Example</sgmltag>s, <sgmltag>Figure</sgmltag>s and <sgmltag>LegalNotice</sgmltag>s.
</para>
</listitem>
<listitem><para>
<indexterm><primary>tables</primary>
<secondary>exclusions (DocBook)</secondary></indexterm>
<indexterm><primary>InformalTable element</primary>
<secondary>excluding from tables</secondary></indexterm>
Formal objects and <sgmltag>InformalTable</sgmltag>s
from tables.</para>
</listitem>
<listitem><para>
<indexterm><primary>footnotes, exclusions (DocBook)</primary></indexterm>
<indexterm><primary>block elements</primary>
<secondary>excluding from footnotes</secondary></indexterm>
Block elements and <sgmltag>Footnote</sgmltag>s
from <sgmltag>Footnote</sgmltag>s</para>
</listitem>
<listitem><para>Admonitions, <sgmltag>EntryTbl</sgmltag>s, and <sgmltag>
Acronym</sgmltag>s from themselves.
<indexterm><primary>admonitions</primary>
<secondary>exclusions (DocBook)</secondary></indexterm>
<indexterm><primary>acronyms (DocBook XML)</primary></indexterm>
</para>
</listitem>
</itemizedlist></para>
<para>Removing these exclusions from DocBook &XML; means that it is now valid, in
the &XML; sense, to do some things that don't make a lot of sense (like put
a <sgmltag>Footnote</sgmltag> in a <sgmltag>Footnote</sgmltag>). Be careful.
</para>
<para>
<indexterm><primary>inclusions</primary>
<secondary>DocBook, uses</secondary></indexterm>
<indexterm><primary>IndexTerm element</primary>
<secondary>inclusions, DocBook</secondary></indexterm>
<indexterm><primary>BeginPage element (DocBook inclusions)</primary></indexterm>
<indexterm><primary>parameter entities</primary>
<secondary>DbXML, ubiquitous element inclusions</secondary></indexterm>
<indexterm><primary>#PCDATA keyword</primary>
<secondary>DbXML, ubiquitous elements</secondary></indexterm>
Inclusions in DocBook are used to add the ubiquitious elements (<sgmltag>
indexterm</sgmltag> and <sgmltag>BeginPage</sgmltag>) unconditionally to a
large number of contexts. In order to make these elements available in
DocBook &XML;,
they have been added to most of the parameter entities that include <literal>
#PCDATA</literal>. If new locations are discovered where these terms are desired, DocBook &XML;
will be updated.</para>
</listitem>
</varlistentry>
<varlistentry><term>Elements with mixed content must have <literal>#PCDATA
</literal> first.</term>
<listitem>
<para>
<indexterm><primary>elements</primary>
<secondary>mixed content (DocBook XML)</secondary></indexterm>
<indexterm><primary>content models</primary>
<secondary>elements, updating (DocBook XML)</secondary></indexterm>
The content models of many elements have been updated to make them a
repeatable OR group beginning with <literal>#PCDATA</literal>.</para>
</listitem>
</varlistentry>
<varlistentry><term>Many declared attribute types (<literal>NAME</literal>, <literal>
NUMBER</literal>, <literal>NUTOKEN</literal>, and so on) are not allowed</term>
<listitem>
<para>
<indexterm><primary>attributes</primary>
<secondary>declared types, prohibiting (DocBook XML)</secondary></indexterm>
<indexterm><primary>NMTOKEN(S) attribute</primary>
<secondary>DbXML</secondary></indexterm>
<indexterm><primary>CDATA</primary>
<secondary>DbXML</secondary></indexterm>
They have all been replaced by <literal>NMTOKEN</literal> or <literal>
CDATA</literal>.</para>
</listitem>
</varlistentry>
<varlistentry><term>No <literal>#CONREF</literal> attributes allowed.</term>
<listitem>
<para>
<indexterm><primary>#CONREF attributes</primary>
<secondary>DbXML, prohibiting</secondary></indexterm>
<indexterm><primary>#IMPLIED attribute (DocBook XML)</primary></indexterm>
<indexterm><primary>GlossSee element</primary>
<secondary>DbXML</secondary></indexterm>
<indexterm><primary>GlossSeeAlso element</primary>
<secondary>DbXML</secondary></indexterm>
<indexterm><primary>IndexTerm element</primary>
<secondary>empty (DocBook XML)</secondary></indexterm>
The <literal>#CONREF</literal> attributes on <sgmltag>indexterm</sgmltag>, <sgmltag>
GlossSee</sgmltag>, and <sgmltag>GlossSeeAlso</sgmltag> were changed to <literal>
#IMPLIED</literal>. The content model of <sgmltag>indexterm</sgmltag> was
modified so that it can be empty.</para>
</listitem>
</varlistentry>
<varlistentry><term>Attribute default values must be quoted.</term>
<listitem>
<para>
<indexterm><primary>quotes</primary>
<secondary>attribute values</secondary>
<tertiary>DbXML</tertiary></indexterm>
<indexterm><primary>attributes</primary>
<secondary>values</secondary>
<tertiary>quoting</tertiary></indexterm>
Quotes were added wherever necessary.
<indexterm startref="docbookxmlappa" class="endofrange"/>
<indexterm startref="xmldocbookappb" class="endofrange"/>
</para>
</listitem>
</varlistentry>
</variablelist>
</sect1>
</appendix>
<!--
Local Variables:
mode:sgml
sgml-parent-document: ("book.sgm" "appendix")
End:
-->
|