1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358
|
<Chapter Label="ch:intro"><Heading>Introduction and Example</Heading>
The main purpose of the &GAPDoc; package is to define a file format for
documentation of &GAP;-programs and -packages (see <Cite Key="GAP4" />). The
problem is that such documentation should be readable in several output
formats. For example it should be possible to read the documentation inside
the terminal in which &GAP; is running (a text mode) and there should be a
printable version in high typesetting quality (produced by some version of
&TeX;). It is also popular to view &GAP;'s online help with a Web-browser
via an HTML-version of the documentation. Nowadays one can use &LaTeX; and
standard viewer programs to produce and view on the screen <C>dvi</C>- or
<C>pdf</C>-files with full support of internal and external hyperlinks.
Certainly there will be other interesting document formats and tools in this
direction in the future. <P/>
Our aim is to find a <Emph>format for writing</Emph> the documentation which
allows a relatively easy translation into the output formats just mentioned
and which hopefully makes it easy to translate to future output formats as
well. <P/>
To make documentation written in the &GAPDoc; format directly usable, we
also provide a set of programs, called converters, which produce text-,
hyperlinked &LaTeX;- and HTML-output versions of a &GAPDoc; document. These
programs are developed by the first named author. They run completely inside
&GAP;, i.e., no external programs are needed. You only need <C>latex</C> and
<C>pdflatex</C> to process the &LaTeX; output. These programs are described
in Chapter <Ref Chap="ch:conv"/>.
<Section Label="sec:XML"><Heading>XML</Heading>
<Index >XML</Index>
The definition of the &GAPDoc; format uses XML, the <Q>eXtendible Markup
Language</Q>. This is a standard (defined by the W3C consortium, see
<URL>https://www.w3c.org</URL>) which lays down a syntax for adding markup to
a document or to some data. It allows to define document structures via
introducing markup <E>elements</E> and certain relations between them. This
is done in a <E>document type definition</E>. The file <F>gapdoc.dtd</F>
contains such a document type definition and is the central part of the
&GAPDoc; package. <P/>
The easiest way for getting a good idea about this is probably to look at an
example. The Appendix <Ref Appendix="app:3k+1" /> contains a short but
complete &GAPDoc; document for a fictitious share package. In the next
section we will go through this document, explain basic facts about XML and
the &GAPDoc; document type, and give pointers to more details in later parts
of this documentation. <P/>
In the last Section <Ref Sect="sec:faq" /> of this introductory chapter
we try to answer some general questions about the decisions which lead to
the &GAPDoc; package.
</Section>
<Section Label="sec:3k+1expl"><Heading>A complete example</Heading>
In this section we recall the lines from the example document in
Appendix <Ref Appendix="app:3k+1" /> and give some explanations.
<Listing Type="from 3k+1.xml">
<![CDATA[<?xml version="1.0" encoding="UTF-8"?> ]]>
</Listing>
This line just tells a human reader and computer programs that the file
is a document with XML markup and that the text is encoded in the UTF-8
character set (other common encodings are ASCII or ISO-8895-X encodings).
<Listing Type="from 3k+1.xml">
<![CDATA[<!-- A complete "fake package" documentation
-->
]]></Listing>
Everything in a XML file between <Q><C><!--</C></Q> and
<Q><C>--></C></Q> is a comment and not part of the document content.
<Listing Type="from 3k+1.xml">
<![CDATA[<!DOCTYPE Book SYSTEM "gapdoc.dtd">
]]></Listing>
This line says that the document contains markup which is defined in
the system file <F>gapdoc.dtd</F> and that the markup obeys certain
rules defined in that file (the ending <F>dtd</F> means <Q>document type
definition</Q>). It further says that the actual content of the document
consists of an element with name <Q>Book</Q>. And we can really see that the
remaining part of the file is enclosed as follows:
<Listing Type="from 3k+1.xml">
<![CDATA[<Book Name="3k+1">
[...] (content omitted)
</Book>
]]></Listing>
This demonstrates the basics of the markup in XML. This part of the document
is an <Q>element</Q>. It consists of the <Q>start tag</Q> <C><![CDATA[<Book
Name="3k+1">]]></C>, the <Q>element content</Q> and the <Q>end tag</Q>
<C><![CDATA[</Book>]]></C> (end tags always start with <C></</C>). This
element also has an <Q>attribute</Q> <C>Name</C> whose <Q>value</Q> is
<C>3k+1</C>.
<P/>
If you know HTML, this will look familiar to you. But there are some
important differences: The element name <C>Book</C> and attribute name
<C>Name</C> are <E>case sensitive</E>. The value of an attribute must
<E>always</E> be enclosed in quotes. In XML <E>every</E> element has a start
and end tag (which can be combined for elements defined as <Q>empty</Q>, see
for example <C><TableOfContents/></C> below).
<P/>
If you know &LaTeX;, you are familiar with quite different
types of markup, for example: The equivalent of the <C>Book</C>
element in &LaTeX; is <C>\begin{document} ...
\end{document}</C>. The sectioning in &LaTeX; is not
done by explicit start and end markup, but implicitly via heading
commands like <C>\section</C>. Other markup is done by using
braces <C>{}</C> and putting some commands inside. And for
mathematical formulae one can use the <C>$</C> for the start
<E>and</E> the end of the markup. In XML <E>all</E> markup looks similar to
that of the <C>Book</C> element. <P/>
The content of the book starts with a title page.
<Listing Type="from 3k+1.xml">
<![CDATA[<TitlePage>
<Title>The <Package>ThreeKPlusOne</Package> Package</Title>
<Version>Version 42</Version>
<Author>Dummy Authör
<Email>3kplusone@dev.null</Email>
</Author>
<Copyright>©right; 2000 The Author. <P/>
You can do with this package what you want.<P/> Really.
</Copyright>
</TitlePage>
]]></Listing>
The content of the <C>TitlePage</C> element consists again of elements. In
Chapter <Ref Chap="DTD" /> we describe which elements are allowed
within a <C>TitlePage</C> and that their ordering is prescribed in this
case. In the (stupid) name of the author you see that a German umlaut is
used directly (in ISO-latin1 encoding).
<P/>
Contrary to &LaTeX;- or HTML-files this markup does not say anything about
the actual layout of the title page in any output version of the document.
It just adds information about the <E>meaning</E> of pieces of text. <P/>
Within the <C>Copyright</C> element there are two more things to learn about
XML markup. The <C><P/></C> is a complete element. It is a combined
start and end tag. This shortcut is allowed for elements which are defined
to be always <Q>empty</Q>, i.e., to have no content. You may have already
guessed that <C><P/></C> is used as a paragraph separator. Note that
empty lines do not separate paragraphs (contrary to &LaTeX;). <P/>
The other construct we see here is <C>&copyright;</C>. This is an
example of an <Q>entity</Q> in XML and is a macro for some substitution
text. Here we use an entity as a shortcut for a complicated expression which
makes it possible that the term <E>copyright</E> is printed as some text
like <C>(C)</C> in text terminal output and as a copyright character in
other output formats. In &GAPDoc; we predefine some entities.
Certain <Q>special characters</Q> must be typed via entities, for example
<Q><</Q>, <Q>></Q> and <Q>&</Q> to avoid a misinterpretation as
XML markup. It is possible to define
additional entities for your document inside the <C><!DOCTYPE ...></C>
declaration, see <Ref Subsect="GDent" />. <P/>
Note that elements in XML must always be properly nested, as in this
example. A construct like <C><![CDATA[<a><b>...</a></b>]]></C> is <E>not</E>
allowed.
<Listing Type="from 3k+1.xml">
<![CDATA[<TableOfContents/>
]]></Listing>
This is another example of an <Q>empty element</Q>. It just means that a
table of contents for the whole document should be included into any output
version of the document.
<P/>
After this the main text of the document follows inside certain sectioning
elements:
<Listing Type="from 3k+1.xml">
<![CDATA[<Body>
<Chapter> <Heading>The <M>3k+1</M> Problem</Heading>
<Section Label="sec:theory"> <Heading>Theory</Heading>
[...] (content omitted)
</Section>
<Section> <Heading>Program</Heading>
[...] (content omitted)
</Section>
</Chapter>
</Body>
]]></Listing>
These elements are used similarly to <Q>\chapter</Q> and
<Q>\section</Q> in &LaTeX;. But note that the explicit end tags are
necessary here.
<P/>
The sectioning commands allow to assign an optional attribute <Q>Label</Q>.
This can be used for referring to a section inside the document.
<P/>
The text of the first section starts as follows. The whitespace in the text
is unimportant and the indenting is not necessary.
<Listing Type="from 3k+1.xml">
<![CDATA[ Let <M>k \in &NN;</M> be a natural number. We consider the
sequence <M>n(i, k), i \in &NN;,</M> with <M>n(1, k) = k</M> and
else
]]></Listing>
Here we come to the interesting question how to type mathematical formulae
in a &GAPDoc; document. We did not find any alternative for writing formulae
in &TeX; syntax. (There is MATHML, but even simple formulae contain a lot
of markup, become quite unreadable and they are cumbersome to type.
Furthermore there seem to be no tools available which translate such
formulae in a nice way into &TeX; and text.) So, formulae are essentially
typed as in &LaTeX;. (Actually, it is also possible to type unicode
characters of some mathematical symbols directly, or via an entity like the
<C>&NN;</C> above.) There are three types of elements containing
formulae: <Q>M</Q>, <Q>Math</Q> and <Q>Display</Q>. The first two are for
in-text formulae and the third is for displayed formulae. Here <Q>M</Q> and
<Q>Math</Q> are equivalent, when translating a &GAPDoc; document into
&LaTeX;. But they are handled differently for terminal text (and HTML)
output. For the content of an <Q>M</Q>-element there are defined rules for a
translation into well readable terminal text. More complicated formulae are
in <Q>Math</Q> or <Q>Display</Q> elements and they are just printed as they
are typed in text output. So, to make a section well readable inside a
terminal window you should try to put as many formulae as possible into
<Q>M</Q>-elements. In our example text we used the notation <C>n(i, k)</C>
instead of <C>n_i(k)</C> because it is easier to read in text mode. See
Sections <Ref Sect="GDformulae"/> and <Ref Sect="sec:misc" /> for
more details. <P/>
A few lines further on we find two non-internal references.
<Listing Type="from 3k+1.xml">
<![CDATA[ problem, see <Cite Key="Wi98"/> or
<URL>http://mathsrv.ku-eichstaett.de/MGF/homes/wirsching/</URL>
]]></Listing>
The first within the <Q>Cite</Q>-element is the citation of a book. In
&GAPDoc; we use the widely used &BibTeX; database format for reference
lists. This does not use XML but has a well documented structure which is
easy to parse. And many people have collections of references readily
available in this format. The reference list in an output version of the
document is produced with the empty element
<Listing Type="from 3k+1.xml">
<![CDATA[<Bibliography Databases="3k+1" />
]]></Listing>
close to the end of our example file. The attribute <Q>Databases</Q>
give the name(s) of the database (<F>.bib</F>) files which contain the
references.
<P/>
Putting a Web-address into an <Q>URL</Q>-element allows one to create a
hyperlink in output formats which allow this.
<P/>
The second section of our example contains a special kind of subsection
defined in &GAPDoc;.
<Listing Type="from 3k+1.xml">
<![CDATA[ <ManSection>
<Func Name="ThreeKPlusOneSequence" Arg="k[, max]"/>
<Description>
This function computes for a natural number <A>k</A> the
beginning of the sequence <M>n(i, k)</M> defined in section
<Ref Sect="sec:theory"/>. The sequence stops at the first
<M>1</M> or at <M>n(<A>max</A>, k)</M>, if <A>max</A> is
given.
<Example>
gap> ThreeKPlusOneSequence(101);
"Sorry, not yet implemented. Wait for Version 84 of the package"
</Example>
</Description>
</ManSection>
]]></Listing>
A <Q>ManSection</Q> contains the description of some function, operation,
method, filter and so on. The <Q>Func</Q>-element describes the name of a
<E>function</E> (there are also similar elements <Q>Oper</Q>, <Q>Meth</Q>,
<Q>Filt</Q> and so on) and names for its arguments, optional arguments
enclosed in square brackets. See Section <Ref Sect="sec:mansect" /> for
more details. <P/>
In the <Q>Description</Q> we write the argument names as <Q>A</Q>-elements.
A good description of a function should usually contain an example of its
use. For this there are some verbatim-like elements in &GAPDoc;, like
<Q>Example</Q> above (here, clearly, whitespace matters which causes a
slightly strange indenting). <P/>
The text contains an internal reference to the first section via the
explicitly defined label <C>sec:theory</C>.
<P/>
The first section also contains a <Q>Ref</Q>-element which refers to the
function described here. Note that there is no explicit label for such a
reference. The pair <C><![CDATA[<Func Name="ThreeKPlusOneSequence" Arg="k[,
max]"/>]]></C> and <C><![CDATA[<Ref Func="ThreeKPlusOneSequence"/>]]></C>
does the cross referencing (and hyperlinking if possible) implicitly via the
name of the function.
<P/>
Here is one further element from our example document which we want to
explain.
<Listing Type="from 3k+1.xml">
<![CDATA[<TheIndex/>
]]></Listing>
This is again an empty element which just says that an output version of the
document should contain an index. Many entries for the index are generated
automatically because the <Q>Func</Q> and similar elements implicitly
produce such entries. It is also possible to include explicit additional
entries in the index.
</Section>
<Section Label="sec:faq"><Heading>Some questions</Heading>
<List>
<Mark>Are those XML files too ugly to read and edit?</Mark>
<Item>
Just have a look and decide yourself. The markup needs more characters
than most &TeX; or &LaTeX; markup. But the structure of the document is
easier to see. If you configure your favorite editor well, you do not need
more key strokes for typing the markup than in &LaTeX;.
</Item>
<Mark>Why do we not use &LaTeX; alone?</Mark>
<Item>
&LaTeX; is good for writing books. But &LaTeX; files are generally
difficult to parse and to process to other output formats like text
for browsing in a terminal window or HTML (or new formats which may
become popular in the future). &GAPDoc; markup is one step more
abstract than &LaTeX; insofar as it describes meaning instead of
appearance of text. The inner workings of &LaTeX; are too complicated
to learn without pain, which makes it difficult to overcome problems
that occur occasionally.
</Item>
<Mark>Why XML and not a newly defined markup language?</Mark>
<Item>
XML is a well defined standard that is more and more widely used. Lots
of people have thought about it. Years of experience with SGML went into the
design. It is easy to explain, easy to parse and lots of tools are available,
there will be more in the future.
</Item>
</List>
</Section>
</Chapter>
|