                 +--------------------------------------+
                 | Pike autodoc markup - the XML format |
                 +--------------------------------------+

======================================================================
a) Introduction
----------------------------------------------------------------------

When a piece of documentation is viewed in human-readable format, it 
has gone through the following states:

  1. Doc written in comments in source code (C or Pike).

  2. A lot of smaller XML files, one for each source code file.

  3. A big chunk of XML, describing the whole hierarchy.

  4. A repository of smaller and more manageable XML files.

  5. An HTML page rendered from one such file.
     (Or a PDF file, or whatever).

The transition from state 1 to state 2 is the extraction of 
documentation from source files. There are several (well, at
least two) markup formats, and there are occasions where it is
handy to generate documentation automatically &c. This document 
describes how a file in state 2 should be structured in order to
be handled correctly by subsequent passes and presented in a
consistent manner.

======================================================================
b) Overall structure
----------------------------------------------------------------------

Each source file adds some number of entities to the whole hierarchy.
It can contain a class or a module. It can contain an empty module, 
that has its methods and members defined in some other source file,
and so on. Suppose we have a file containing documentation for the
class Class in the module Module. The XML skeleton of the file 
would then be:

  <module name="">
      <module name="Module">
          <class name="Class">
              ... perhaps some info on inherits, members &c ...
              <doc>
                  ... the documentation of the class Module.Class ...
              </doc>
          </class>
      </module>
  </module>

The <module name=""> refers to the top module. That element, and its 
child <module name="Module">, exist only to put the <class name="Class"> 
in its correct position in the hierarchy. So we can divide the elements
in the XML file into two groups: skeletal elements and content elements. 

Each actual module/class/whatever in the Pike hierarchy maps to at most
one content element; however, it can map to any number of skeletal elements.
For example, the top module is mapped to a skeletal element in each XML
file extracted from a single source file. To get from state 2 to state 3
in the list above, all XML files are merged into one big file. All the
elements that a module or class maps to are merged into one, and if one of
those elements contains documentation (that is, it is a content element),
then that documentation becomes a child of the merged element.
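
The merge from state 2 to state 3 can be sketched as below. This is only
an illustration in Python, not the code Pike itself uses; the function
name merge_into and the two sample inputs are made up for the example,
and it assumes that <module> and <class> elements are matched on their
tag and name="..." attribute.

```python
import xml.etree.ElementTree as ET

def merge_into(target, source):
    """Merge the children of `source` into `target`, in place.

    <module> and <class> children are matched on (tag, name) and merged
    recursively; every other child (e.g. <doc>, <docgroup>) is appended.
    """
    for child in source:
        match = None
        if child.tag in ("module", "class"):
            for existing in target:
                if (existing.tag == child.tag
                        and existing.get("name") == child.get("name")):
                    match = existing
                    break
        if match is None:
            target.append(child)
        else:
            merge_into(match, child)

# Two per-file trees (state 2). Both carry the same skeletal elements.
file_a = ET.fromstring(
    '<module name=""><module name="Module">'
    '<class name="Class"><doc>Class docs</doc></class>'
    '</module></module>')
file_b = ET.fromstring(
    '<module name=""><module name="Module">'
    '<class name="Other"/>'
    '</module></module>')

# The top modules are the same entity, so one is merged into the other.
merge_into(file_a, file_b)
merged = file_a
```

After the merge there is a single <module name="Module"> holding both
classes, and the <doc> of Module.Class is still a child of its class.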

======================================================================
c) Grouping
----------------------------------------------------------------------

Classes and modules always appear as <class> and <module> elements.
Methods, variables, constants &c, however, can be grouped in the 
source code:

  //! Two variables:
  int a;
  int b;
  
Even a single variable is considered as a group with one member. 
Continuing the example in the previous section, suppose that Module.Class
has two member variables, a and b, that are documented as a group:

  <module name="">
    <module name="Module">
      <class name="Class">
          ... perhaps some info on inherits, members &c ...

        <docgroup homogen-type="variable">
          <variable name="a"><type><int/></type></variable>
          <variable name="b"><type><int/></type></variable>
          <doc> 
            ... documentation for Module.Class.a and Module.Class.b ...
          </doc>
        </docgroup>

        <doc>
           ... the documentation of the class Module.Class ...
        </doc>
      </class>
    </module>
  </module>

If all the children of a <docgroup> are of the same type, e.g. all are
<method> elements, then the <docgroup> has the attribute homogen-type
(="method" in that case, ="variable" in the example above). If all the
children have identical name="..." attributes, then the <docgroup> gets a
homogen-name="..." attribute as well.

The <docgroup> has a <doc> child containing the documentation for the other
children of the <docgroup>. An entity that cannot be grouped (class, module,
enum) has a <doc> child of its own instead.
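
The rule for the homogen-type and homogen-name attributes can be written
down as a small function. The following is a sketch in Python under the
assumptions stated in the text above; make_docgroup is a hypothetical
helper, not part of the Pike extractor, and the placement of <doc> after
the entities simply follows the example in this section.

```python
import xml.etree.ElementTree as ET

def make_docgroup(entities, doc):
    """Wrap entity elements and their shared <doc> in a <docgroup>."""
    group = ET.Element("docgroup")
    tags = {e.tag for e in entities}
    names = {e.get("name") for e in entities}
    # homogen-type: only when every entity has the same element type.
    if len(tags) == 1:
        group.set("homogen-type", tags.pop())
    # homogen-name: only when every entity has the same name.
    if len(names) == 1 and None not in names:
        group.set("homogen-name", names.pop())
    group.extend(entities)
    group.append(doc)
    return group

# Two variables documented together: same type, different names.
group = make_docgroup(
    [ET.Element("variable", name="a"), ET.Element("variable", name="b")],
    ET.Element("doc"))

# A group with a single member gets both attributes.
single = make_docgroup([ET.Element("method", name="foo")], ET.Element("doc"))
```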

======================================================================
d) Pike entities
----------------------------------------------------------------------

Pike entities (classes, modules, methods, variables, constants, &c) have some
things in common, and many parts of the XML format are the same for all of
these entities. Every entity is represented by an XML element, namely one
of:

  <class>
  <constant>
  <enum>
  <inherit>
  <method>
  <modifier>    
  <module>
  <typedef>
  <variable> 

The names speak for themselves, except for <modifier>, which is used for
modifier ranges:

  //! Some variables:
  protected final {
    int x, y; 

    string n;  
  }

A Pike entity may also have the following properties:

  Name - Given as a name="..." attribute:
    <variable name="i"> ... </variable>

  Modifiers - Given as a child element <modifiers>:
    <variable name="i">
      <modifiers>
        <optional/><static/><private/>
      </modifiers>
      ...
    </variable>
  If there are no modifiers before the declaration of the entity, the
  <modifiers> element can be omitted.

  Source position - Given as a child element <source-position>:
    <variable name="i">
      <source-position file="/home/rolf/hejhopp.pike" first-line="12"/>
      <modifiers>
        <optional/><static/><private/>
      </modifiers>
      ...
    </variable>
  The source position is the place in the code tree where the entity is 
  declared or defined. For a method, the attribute last-line="..." can be
  added to <source-position> to give the range of lines that the method 
  body spans in the source code.
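
A consumer of the format could read these common properties as sketched
below. This is an illustrative Python fragment, not an existing tool;
common_properties and the keys of the returned dictionary are names
chosen for this example.

```python
import xml.etree.ElementTree as ET

def common_properties(entity):
    """Collect the name, modifiers and source position of an entity element."""
    modifiers = entity.find("modifiers")
    position = entity.find("source-position")
    first_line = position.get("first-line") if position is not None else None
    return {
        "name": entity.get("name"),
        # An omitted <modifiers> element means "no modifiers".
        "modifiers": [m.tag for m in modifiers] if modifiers is not None else [],
        "file": position.get("file") if position is not None else None,
        "first_line": int(first_line) if first_line is not None else None,
    }

variable = ET.fromstring("""
<variable name="i">
  <source-position file="/home/rolf/hejhopp.pike" first-line="12"/>
  <modifiers><optional/><static/><private/></modifiers>
</variable>
""")
props = common_properties(variable)
```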

And then there are some things that are specific to each of the types of
entities:

<class>
   All inherits of the class are given as child elements <inherit>. If there
   is doc for the inherits, the <inherit> is repeated inside the appropriate 
   <docgroup>:

     class Bosse { 
       inherit "arne.pike" : Arne; 
       inherit Benny;   

       //! Documented inherit
       inherit Sven;
     }

     <class name="Bosse">
       <inherit name="Arne"><source-position ... />
                            <classname>"arne.pike"</classname></inherit>
       <inherit><source-position ... />
                <classname>Benny</classname></inherit>
       <inherit><source-position ... />
                <classname>Sven</classname></inherit>
       <docgroup homogen-type="inherit">
         <doc>
           <text><p>Documented inherit</p></text>
         </doc>
         <inherit><source-position ... />
                  <classname>Sven</classname></inherit>
       </docgroup>
       ...
     </class>
 
<constant>
   Only has a name. The element is empty (or has a <source-position> child.)

<enum>
   Works as a container. Has a <doc> child element with the documentation of
   the enum itself, and <docgroup> elements with a <constant> for each enum
   constant. So:

     enum E
     //! enum E
     {
       //! Three constants:
       a, b, c,
     
       //! One more:
       d
     }

   becomes:

     <enum name="E">
         <doc><text><p>enum E</p></text></doc>
         <docgroup homogen-type="constant">
             <doc><text><p>Three constants:</p></text></doc>
             <constant name="a"/>
             <constant name="b"/>
             <constant name="c"/>
         </docgroup>
         <docgroup homogen-name="d" homogen-type="constant">
             <doc><text><p>One more:</p></text></doc>
             <constant name="d"/>
         </docgroup>
     </enum>
     
   Both the <enum> element and the <constant> elements could have 
   <source-position> children, of course.

<inherit> 
   The name="..." attribute gives the name after the colon, if any. The name
   of the inherited class is given in a <classname> child. If a file name is
   used, the class name is the file name surrounded by quotes (see <class>).

<method>
   The arguments are given inside an <arguments> child. Each argument is 
   given as an <argument name="..."> element. Each <argument> has a <type>
   child, with the type of the argument. The return type of the method is 
   given inside a <returntype> container:

     int a(int x, int y);
  
     <method name="a">
       <arguments>
         <argument name="x"><type><int/></type></argument>
         <argument name="y"><type><int/></type></argument>
       </arguments>
       <returntype><int/></returntype>
     </method>
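
   As an illustration of how the pieces fit together, the following
   Python sketch rebuilds the declaration from the XML above. The
   helpers simple_type and signature are hypothetical, and only the
   element name of each type is used, so qualified types such as
   array(int) would lose their detail.

```python
import xml.etree.ElementTree as ET

def simple_type(container):
    """Name of the type inside a <type> or <returntype> element."""
    return container[0].tag

def signature(method):
    """Rebuild a declaration like 'int a(int x, int y)' from a <method>."""
    args = ", ".join(
        "%s %s" % (simple_type(arg.find("type")), arg.get("name"))
        for arg in method.findall("arguments/argument"))
    return "%s %s(%s)" % (
        simple_type(method.find("returntype")), method.get("name"), args)

method = ET.fromstring("""
<method name="a">
  <arguments>
    <argument name="x"><type><int/></type></argument>
    <argument name="y"><type><int/></type></argument>
  </arguments>
  <returntype><int/></returntype>
</method>
""")
declaration = signature(method)
```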
                 
<modifier>
   Works as a container ... ???

<module>
   Works just like <class>.

<typedef>
   The type is given in a <type> child:

     typedef float Boat;
     
     <typedef name="Boat"><type><float/></type></typedef>

<variable>
   The type of the variable is given in a <type> child:
     
     int x;

     <variable name="x"><type><int/></type></variable>

======================================================================
e) Pike types
----------------------------------------------------------------------

Above we have seen the types int and float represented as <int/> and <float/>.
Some of the types are complex, some are simple. The simpler types are just of
the form <foo/>:

  <float/>
  <mixed/>
  <program/>
  <void/>

The same goes for mapping, array, function, object, multiset, &c that have 
no narrowing type qualification: <mapping/>, <array/>, <function/> ...

The complex types are represented as follows:

array
   If the type of the elements of the array is specified it is given in a
   <valuetype> child element:

     array(int) 
 
     <array><valuetype><int/></valuetype></array>

function
   The types of the arguments and the return type are given (the order 
   of the <argtype> elements is significant, of course):

     function(int, string: mixed)

     <function>
       <argtype><int/></argtype>
       <argtype><string/></argtype>
       <returntype><mixed/></returntype>
     </function>
   
int
   An int type can have a min and/or max value. The values can be numbers or
   identifiers:

     int(0..MAX)
   
     <int><min>0</min><max>MAX</max></int>

string
   A string type can have a numerical min and/or max character value. 
   The values can be numbers or identifiers:

     string(0..255)

     <string><min>0</min><max>255</max></string>

mapping
   The types of the indices and values are given:

     mapping(int:int)

     <mapping>
       <indextype><int/></indextype>
       <valuetype><int/></valuetype>
     </mapping>

multiset
   The type of the indices is given:
  
     multiset(string)

     <multiset>
       <indextype><string/></indextype>
     </multiset>

object 
   If the program/class is specified, it is given as the text child of 
   the <object> element:

     object(Foo.Bar.Ippa)

     <object>Foo.Bar.Ippa</object>

Then there are two special type constructions. A disjunct type is written
with the <or> element:

  string|int

  <or><string/><int/></or>

An argument to a method can be of the varargs type:

  function(string, mixed ... : void)

  <function>
    <argtype><string/></argtype>
    <argtype><varargs><mixed/></varargs></argtype>
    <returntype><void/></returntype>
  </function>
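
The type representations in this section are regular enough to be turned
back into Pike type syntax by a short recursive function. The Python
sketch below is only an illustration of that mapping; pike_type and
_inner are names invented for the example, and it assumes that a
<function> element with children always has a <returntype>.

```python
import xml.etree.ElementTree as ET

def _inner(el, child_tag):
    """Render the type wrapped in the named child, or None if it is absent."""
    child = el.find(child_tag)
    return pike_type(child[0]) if child is not None else None

def pike_type(el):
    """Render a type element (e.g. <int/>, <array>...) as a Pike type string."""
    tag = el.tag
    if tag == "or":
        return "|".join(pike_type(c) for c in el)
    if tag == "varargs":
        return pike_type(el[0]) + " ..."
    if tag in ("int", "string"):
        low, high = el.findtext("min"), el.findtext("max")
        if low is None and high is None:
            return tag
        return "%s(%s..%s)" % (tag, low or "", high or "")
    if tag == "object":
        name = (el.text or "").strip()
        return "object(%s)" % name if name else "object"
    if tag == "array":
        value = _inner(el, "valuetype")
        return "array(%s)" % value if value else "array"
    if tag == "multiset":
        index = _inner(el, "indextype")
        return "multiset(%s)" % index if index else "multiset"
    if tag == "mapping":
        index, value = _inner(el, "indextype"), _inner(el, "valuetype")
        return "mapping(%s:%s)" % (index, value) if index and value else "mapping"
    if tag == "function" and len(el):
        # Assumption: <returntype> is present whenever there are children.
        args = [pike_type(a[0]) for a in el.findall("argtype")]
        return "function(%s: %s)" % (", ".join(args), _inner(el, "returntype"))
    # Simple types and unqualified complex types: <float/>, <mapping/>, ...
    return tag

rendered = pike_type(ET.fromstring(
    "<function><argtype><int/></argtype><argtype><string/></argtype>"
    "<returntype><mixed/></returntype></function>"))
```

With the input shown, rendered holds the type from the function example
earlier in this section.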

======================================================================
f) Other XML tags
----------------------------------------------------------------------

p
   Paragraph.

i
   Italic.

b
   Bold.

tt
   Teletype (monospaced) text.

pre
   Preformatted text.

code
   Program code.

image
   An image object. Contains the original file path to the image. Has the
   optional attributes width, height and file, where file is the path to
   the normalized-filename file.

======================================================================
g) XML generated from the doc markup
----------------------------------------------------------------------

The documentation for an entity is put in a <doc> element. The <doc> element
is either a child of the element representing the entity (in the case of
<class>, <module>, <enum>, or <modifier>) or a child of the <docgroup> that
contains the element representing the entity.

The doc markup has two main types of keywords: those that create a container,
and those that create a new subsection within a container, implicitly closing
the previous subsection. Consider e.g.:

  //! @mapping
  //!   @member int "ip"
  //!     The IP# of the host.
  //!   @member string "address"
  //!     The name of the host.
  //!   @member float "latitude"
  //!   @member float "longitude"
  //!     The coordinates of its physical location.
  //! @endmapping

Here @mapping and @endmapping create a container, and each @member starts a
new subsection. The last two @member keywords are grouped together and thus
form ONE new subsection. Each subsection is a <group>, and the
<group> has one or more <member> children, and a <text> child that contains
the text that describes the <member>s:

  <mapping> 
      <group>
          <member><type><int/></type><index>"ip"</index></member>
          <text>
            <p>The IP# of the host.</p>
          </text>
      </group>
      <group>
          <member><type><string/></type><index>"address"</index></member>
          <text>
              <p>The name of the host.</p>
          </text>
      </group>
      <group>
          <member><type><float/></type><index>"latitude"</index></member>
          <member><type><float/></type><index>"longitude"</index></member>
          <text>
              <p>The coordinates of its physical location.</p>
          </text>
      </group>
  </mapping>

A <text> element can contain not only text but also a nested container,
say @mapping - @endmapping. In that case, the <mapping> element is placed,
in document order, as a sibling of the <p> elements that contain the text:

  //! @mapping
  //!   @member mapping "nested-mapping"
  //!     A mapping inside the mapping:
  //!     @mapping
  //!       @member string "zip-code"
  //!         The zip code.
  //!     @endmapping
  //!     And some more text ... 
  //! @endmapping
    
  becomes:

  <mapping>
    <group>
      <member><type><mapping/></type><index>"nested-mapping"</index></member>
      <text>
        <p>A mapping inside the mapping:</p>
        <mapping>
          <group>
            <member><type><string/></type><index>"zip-code"</index></member>
            <text>
              <p>The zip code.</p>
            </text>
          </group>
        </mapping>
        <p>And some more text ...</p>
      </text>
      </group>
  </mapping>
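
A renderer has to walk this <group> structure to present a mapping. The
Python sketch below shows one way to flatten a <mapping> into rows of
(members, description); mapping_rows is a hypothetical helper, and
joining the paragraphs with a space is a simplification made for the
example. Only <p> elements that are direct children of the <text> are
collected, so the text of a nested <mapping> is left out of the row.

```python
import xml.etree.ElementTree as ET

def mapping_rows(mapping):
    """Return one (members, description) pair per <group> of a <mapping>."""
    rows = []
    for group in mapping.findall("group"):
        # Each member is reduced to (type element name, index text).
        members = [(m.find("type")[0].tag, m.findtext("index"))
                   for m in group.findall("member")]
        paragraphs = [(p.text or "").strip() for p in group.findall("text/p")]
        rows.append((members, " ".join(paragraphs)))
    return rows

doc = ET.fromstring("""
<mapping>
  <group>
    <member><type><int/></type><index>"ip"</index></member>
    <text><p>The IP# of the host.</p></text>
  </group>
  <group>
    <member><type><float/></type><index>"latitude"</index></member>
    <member><type><float/></type><index>"longitude"</index></member>
    <text><p>The coordinates of its physical location.</p></text>
  </group>
</mapping>
""")
rows = mapping_rows(doc)
```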

Inside the <p> elements, there may also be some more "layout-ish" tags like
<b>, <code>, <tt>, <i>, needed to make the text more readable. Those tags are
expressed as e.g. @i{ ... @} in the doc markup. There is no <br>, however. A
paragraph break is made by ending the <p> and beginning a new one. A </p><p>
is inserted for each sequence of blank lines in the doc markup:

  //! First paragraph.
  //! 
  //! Second paragraph.
  //! 
  //! 

  becomes:

  <p>First paragraph.</p><p>Second paragraph.</p>

Note that leading and trailing whitespace is trimmed from the text, and
empty <p> elements are never generated.
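
The paragraph rule can be sketched as follows. This Python fragment is
an illustration only and not the actual extractor; to_paragraphs is a
made-up name, and joining the lines of one paragraph with a single space
is an assumption of the example.

```python
from xml.sax.saxutils import escape

def to_paragraphs(comment_lines):
    """Turn '//!' comment lines into a run of <p> elements."""
    paragraphs, current = [], []
    for line in comment_lines:
        text = line.strip()
        if text.startswith("//!"):
            text = text[3:].strip()
        if text:
            current.append(text)
        elif current:
            # A blank line ends the paragraph; repeated blank lines
            # add nothing, so no empty <p> is ever produced.
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return "".join("<p>%s</p>" % escape(p) for p in paragraphs)

xml = to_paragraphs([
    "//! First paragraph.",
    "//! ",
    "//! Second paragraph.",
    "//! ",
    "//! ",
])
```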

In the example above the keyword `@mapping' translated into <mapping>, whereas
the keyword `@member string "zip-code"' translated into:
  <member><type><string/></type><index>"zip-code"</index></member>

The translation of keyword->XML is done differently for each keyword. How it
is done can be seen in lib/modules/Tools.pmod/AutoDoc.pmod/DocParser.pmod. Most
keywords just interpret the arguments as a space-separated list, and put their
values in attributes to the element. In some cases (such as @member) though, 
some more intricate parsing must be done, and the arguments may be complex 
(like Pike types) and are represented as child elements of the element. 

======================================================================
h) Top level sections of different Pike entities.
----------------------------------------------------------------------

In every doc comment there is an implicit "top container", and subsections can
be opened in it. E.g.:

  //! A method.
  //! @param x
  //!   The horizontal coordinate.
  //! @param y 
  //!   The vertical coordinate.
  //! @returns
  //!   Nothing :)
  void foo(int x, int y)

becomes:

  <docgroup homogen-name="foo" homogen-type="method">
      <doc>
          <text><p>A method.</p></text>
          <group>
              <param name="x"/>
              <text><p>The horizontal coordinate.</p></text>
          </group>
          <group>
              <param name="y"/>
              <text><p>The vertical coordinate.</p></text>
          </group>
          <group>
              <returns/>
              <text><p>Nothing :)</p></text>
          </group>
      </doc>
      <method name="foo">
         ......
      </method>
  </docgroup>
  
Which "top container" subsections are allowed depends on what type of entity is
documented:

ALL      -  <bugs/>
            <deprecated> ... </deprecated>
            <example/>
            <note/>
            <seealso/>

<method> -  <param name="..."/>
            <returns/>
            <throws/>
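
The table above lends itself to a simple consistency check. The Python
sketch below is hypothetical (unexpected_sections and the two constant
names are invented here) and encodes only the subsections listed in this
section; it reads the entity type from the homogen-type attribute of the
<docgroup>.

```python
import xml.etree.ElementTree as ET

ALLOWED_EVERYWHERE = {"bugs", "deprecated", "example", "note", "seealso"}
ALLOWED_BY_TYPE = {"method": {"param", "returns", "throws"}}

def unexpected_sections(docgroup):
    """List subsection keywords that the entity type does not allow."""
    entity_type = docgroup.get("homogen-type")
    allowed = ALLOWED_EVERYWHERE | ALLOWED_BY_TYPE.get(entity_type, set())
    bad = []
    for group in docgroup.findall("doc/group"):
        for child in group:
            # <text> carries the description; everything else is a keyword.
            if child.tag != "text" and child.tag not in allowed:
                bad.append(child.tag)
    return bad

method_group = ET.fromstring("""
<docgroup homogen-name="foo" homogen-type="method">
  <doc>
    <text><p>A method.</p></text>
    <group><param name="x"/><text><p>The horizontal coordinate.</p></text></group>
    <group><returns/><text><p>Nothing :)</p></text></group>
  </doc>
  <method name="foo"/>
</docgroup>
""")
variable_group = ET.fromstring("""
<docgroup homogen-name="v" homogen-type="variable">
  <doc>
    <group><param name="x"/><text><p>Misplaced.</p></text></group>
    <group><note/><text><p>Fine.</p></text></group>
  </doc>
  <variable name="v"/>
</docgroup>
""")
method_problems = unexpected_sections(method_group)
variable_problems = unexpected_sections(variable_group)
```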