1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606
|
<html>
<body>
<h1 align='right'><a name='BASICS'><img src="2.gif" align="right"
hspace="10" width="100" height="100" alt="2"></a>Getting Started
with Mini-XML</h1>
<p>This chapter describes how to write programs that use Mini-XML to
access data in an XML file. Mini-XML provides the following
functionality:</p>
<ul>
<li>Functions for creating and managing XML documents
in memory.</li>
<li>Reading of UTF-8 and UTF-16 encoded XML files and
strings.</li>
<li>Writing of UTF-8 encoded XML files and strings.</li>
<li>Support for arbitrary element names, attributes, and
attribute values with no preset limits, just available
memory.</li>
<li>Support for integer, real, opaque ("CDATA"), and text
data types in "leaf" nodes.</li>
<li>"Find", "index", and "walk" functions for easily
accessing data in an XML document.</li>
</ul>
<p>Mini-XML doesn't do validation or other types of processing
on the data based upon schema files or other sources of
definition information, nor does it support character entities
other than those required by the XML specification.</p>
<h2>The Basics</h2>
<p>Mini-XML provides a single header file which you include:</p>
<pre>
#include <mxml.h>
</pre>
<p>The Mini-XML library is included with your program using the
<kbd>-lmxml</kbd> option:</p>
<pre>
<kbd>gcc -o myprogram myprogram.c -lmxml ENTER</kbd>
</pre>
<p>If you have the <tt>pkg-config(1)</tt> software installed,
you can use it to determine the proper compiler and linker options
for your installation:</p>
<pre>
<kbd>pkg-config --cflags mxml ENTER</kbd>
<kbd>pkg-config --libs mxml ENTER</kbd>
</pre>
<h2>Nodes</h2>
<p>Every piece of information in an XML file is stored in memory in "nodes".
Nodes are defined by the <a href='#mxml_node_t'><tt>mxml_node_t</tt></a>
structure. Each node has a typed value, optional user data, a parent node,
sibling nodes (previous and next), and potentially child nodes.</p>
<p>For example, if you have an XML file like the following:</p>
<pre>
<?xml version="1.0" encoding="utf-8"?>
<data>
<node>val1</node>
<node>val2</node>
<node>val3</node>
<group>
<node>val4</node>
<node>val5</node>
<node>val6</node>
</group>
<node>val7</node>
<node>val8</node>
</data>
</pre>
<p>the node tree for the file would look like the following in memory:</p>
<pre>
?xml version="1.0" encoding="utf-8"?
|
data
|
node - node - node - group - node - node
| | | | | |
val1 val2 val3 | val7 val8
|
node - node - node
| | |
val4 val5 val6
</pre>
<p>where "-" is a pointer to the sibling node and "|" is a pointer
to the first child or parent node.</p>
<p>The <a href="#mxmlGetType"><tt>mxmlGetType</tt></a> function gets the type of
a node, one of <tt>MXML_CUSTOM</tt>, <tt>MXML_ELEMENT</tt>,
<tt>MXML_INTEGER</tt>, <tt>MXML_OPAQUE</tt>, <tt>MXML_REAL</tt>, or
<tt>MXML_TEXT</tt>. The parent and sibling nodes are accessed using the
<a href="#mxmlGetParent"><tt>mxmlGetParent</tt></a>,
<a href="#mxmlGetNext"><tt>mxmlGetNext</tt></a>, and
<a href="#mxmlGetPrevious"><tt>mxmlGetPrevious</tt></a> functions. The
<a href="#mxmlGetUserData"><tt>mxmlGetUserData</tt></a> function gets any user
data associated with the node.</p>
<h3>CDATA Nodes</h3>
<p>CDATA (<tt>MXML_ELEMENT</tt>) nodes are created using the
<a href="#mxmlNewCDATA"><tt>mxmlNewCDATA</tt></a> function. The
<a href="#mxmlGetCDATA"><tt>mxmlGetCDATA</tt></a> function retrieves the
CDATA string pointer for a node.</p>
<blockquote><b>Note:</b>
<p>CDATA nodes are currently stored in memory as special elements. This will
be changed in a future major release of Mini-XML.</p>
</blockquote>
<h3>Custom Nodes</h3>
<p>Custom (<tt>MXML_CUSTOM</tt>) nodes are created using the
<a href="#mxmlNewCustom"><tt>mxmlNewCustom</tt></a> function or using a custom
load callback specified using the
<a href="#mxmlSetCustomHandlers"><tt>mxmlSetCustomHandlers</tt></a> function.
The <a href="#mxmlGetCustom"><tt>mxmlGetCustom</tt></a> function retrieves the
custom value pointer for a node.</p>
<h3>Comment Nodes</h3>
<p>Comment (<tt>MXML_ELEMENT</tt>) nodes are created using the
<a href="#mxmlNewElement"><tt>mxmlNewElement</tt></a> function. The
<a href="#mxmlGetElement"><tt>mxmlGetElement</tt></a> function retrieves the
comment string pointer for a node, including the surrounding "!--" and "--"
characters.</p>
<blockquote><b>Note:</b>
<p>Comment nodes are currently stored in memory as special elements. This will
be changed in a future major release of Mini-XML.</p>
</blockquote>
<h3>Element Nodes</h3>
<p>Element (<tt>MXML_ELEMENT</tt>) nodes are created using the
<a href="#mxmlNewElement"><tt>mxmlNewElement</tt></a> function. The
<a href="#mxmlGetElement"><tt>mxmlGetElement</tt></a> function retrieves the
element name, the
<a href="#mxmlElementGetAttr"><tt>mxmlElementGetAttr</tt></a> function retrieves
the value string for a named attribute associated with the element, and the
<a href="#mxmlGetFirstChild"><tt>mxmlGetFirstChild</tt></a> and
<a href="#mxmlGetLastChild"><tt>mxmlGetLastChild</tt></a> functions retrieve the
first and last child nodes for the element, respectively.</p>
<h3>Integer Nodes</h3>
<p>Integer (<tt>MXML_INTEGER</tt>) nodes are created using the
<a href="#mxmlNewInteger"><tt>mxmlNewInteger</tt></a> function. The
<a href="#mxmlGetInteger"><tt>mxmlGetInteger</tt></a> function retrieves the
integer value for a node.</p>
<h3>Opaque Nodes</h3>
<p>Opaque (<tt>MXML_OPAQUE</tt>) nodes are created using the
<a href="#mxmlNewOpaque"><tt>mxmlNewOpaque</tt></a> function. The
<a href="#mxmlGetOpaque"><tt>mxmlGetOpaque</tt></a> function retrieves the
opaque string pointer for a node. Opaque nodes are like string nodes but
preserve all whitespace between nodes.</p>
<h3>Text Nodes</h3>
<p>Text (<tt>MXML_TEXT</tt>) nodes are created using the
<a href="#mxmlNewText"><tt>mxmlNewText</tt></a> and
<a href="#mxmlNewTextf"><tt>mxmlNewTextf</tt></a> functions. Each text node
consists of a text string and (leading) whitespace value - the
<a href="#mxmlGetText"><tt>mxmlGetText</tt></a> function retrieves the
text string pointer and whitespace value for a node.</p>
<!-- NEED 12 -->
<h3>Processing Instruction Nodes</h3>
<p>Processing instruction (<tt>MXML_ELEMENT</tt>) nodes are created using the
<a href="#mxmlNewElement"><tt>mxmlNewElement</tt></a> function. The
<a href="#mxmlGetElement"><tt>mxmlGetElement</tt></a> function retrieves the
processing instruction string for a node, including the surrounding "?"
characters.</p>
<blockquote><b>Note:</b>
<p>Processing instruction nodes are currently stored in memory as special
elements. This will be changed in a future major release of Mini-XML.</p>
</blockquote>
<h3>Real Number Nodes</h3>
<p>Real number (<tt>MXML_REAL</tt>) nodes are created using the
<a href="#mxmlNewReal"><tt>mxmlNewReal</tt></a> function. The
<a href="#mxmlGetReal"><tt>mxmlGetReal</tt></a> function retrieves the
CDATA string pointer for a node.</p>
<!-- NEED 15 -->
<h3>XML Declaration Nodes</h3>
<p>XML declaration (<tt>MXML_ELEMENT</tt>) nodes are created using the
<a href="#mxmlNewXML"><tt>mxmlNewXML</tt></a> function. The
<a href="#mxmlGetElement"><tt>mxmlGetElement</tt></a> function retrieves the
XML declaration string for a node, including the surrounding "?" characters.</p>
<blockquote><b>Note:</b>
<p>XML declaration nodes are currently stored in memory as special elements.
This will be changed in a future major release of Mini-XML.</p>
</blockquote>
<!-- NEW PAGE -->
<h2>Creating XML Documents</h2>
<p>You can create and update XML documents in memory using the
various <tt>mxmlNew</tt> functions. The following code will
create the XML document described in the previous section:</p>
<pre>
mxml_node_t *xml; /* <?xml ... ?> */
mxml_node_t *data; /* <data> */
mxml_node_t *node; /* <node> */
mxml_node_t *group; /* <group> */
xml = mxmlNewXML("1.0");
data = mxmlNewElement(xml, "data");
node = mxmlNewElement(data, "node");
mxmlNewText(node, 0, "val1");
node = mxmlNewElement(data, "node");
mxmlNewText(node, 0, "val2");
node = mxmlNewElement(data, "node");
mxmlNewText(node, 0, "val3");
group = mxmlNewElement(data, "group");
node = mxmlNewElement(group, "node");
mxmlNewText(node, 0, "val4");
node = mxmlNewElement(group, "node");
mxmlNewText(node, 0, "val5");
node = mxmlNewElement(group, "node");
mxmlNewText(node, 0, "val6");
node = mxmlNewElement(data, "node");
mxmlNewText(node, 0, "val7");
node = mxmlNewElement(data, "node");
mxmlNewText(node, 0, "val8");
</pre>
<!-- NEED 6 -->
<p>We start by creating the declaration node common to all XML files using the
<a href="#mxmlNewXML"><tt>mxmlNewXML</tt></a> function:</p>
<pre>
xml = mxmlNewXML("1.0");
</pre>
<p>We then create the <tt><data></tt> node used for this document using
the <a href="#mxmlNewElement"><tt>mxmlNewElement</tt></a> function. The first
argument specifies the parent node (<tt>xml</tt>) while the second specifies the
element name (<tt>data</tt>):</p>
<pre>
data = mxmlNewElement(xml, "data");
</pre>
<p>Each <tt><node>...</node></tt> in the file is created using the
<tt>mxmlNewElement</tt> and <a href="#mxmlNewText"><tt>mxmlNewText</tt></a>
functions. The first argument of <tt>mxmlNewText</tt> specifies the parent node
(<tt>node</tt>). The second argument specifies whether whitespace appears before
the text - 0 or false in this case. The last argument specifies the actual text
to add:</p>
<pre>
node = mxmlNewElement(data, "node");
mxmlNewText(node, 0, "val1");
</pre>
<p>The resulting in-memory XML document can then be saved or processed just like
one loaded from disk or a string.</p>
<!-- NEED 15 -->
<h2>Loading XML</h2>
<p>You load an XML file using the <a
href='#mxmlLoadFile'><tt>mxmlLoadFile</tt></a>
function:</p>
<pre>
FILE *fp;
mxml_node_t *tree;
fp = fopen("filename.xml", "r");
tree = mxmlLoadFile(NULL, fp,
MXML_TEXT_CALLBACK);
fclose(fp);
</pre>
<p>The first argument specifies an existing XML parent node, if
any. Normally you will pass <tt>NULL</tt> for this argument
unless you are combining multiple XML sources. The XML file must
contain a complete XML document including the <tt>?xml</tt>
element if the parent node is <tt>NULL</tt>.</p>
<p>The second argument specifies the stdio file to read from, as
opened by <tt>fopen()</tt> or <tt>popen()</tt>. You can also use
<tt>stdin</tt> if you are implementing an XML filter
program.</p>
<p>The third argument specifies a callback function which returns
the value type of the immediate children for a new element node:
<tt>MXML_CUSTOM</tt>, <tt>MXML_IGNORE</tt>,
<tt>MXML_INTEGER</tt>, <tt>MXML_OPAQUE</tt>, <tt>MXML_REAL</tt>,
or <tt>MXML_TEXT</tt>. Load callbacks are described in detail in
<a href='#LOAD_CALLBACKS'>Chapter 3</a>. The example code uses
the <tt>MXML_TEXT_CALLBACK</tt> constant which specifies that all
data nodes in the document contain whitespace-separated text
values. Other standard callbacks include
<tt>MXML_IGNORE_CALLBACK</tt>, <tt>MXML_INTEGER_CALLBACK</tt>,
<tt>MXML_OPAQUE_CALLBACK</tt>, and
<tt>MXML_REAL_CALLBACK</tt>.</p>
<p>The <a href='#mxmlLoadString'><tt>mxmlLoadString</tt></a>
function loads XML node trees from a string:</p>
<!-- NEED 10 -->
<pre>
char buffer[8192];
mxml_node_t *tree;
...
tree = mxmlLoadString(NULL, buffer,
MXML_TEXT_CALLBACK);
</pre>
<p>The first and third arguments are the same as used for
<tt>mxmlLoadFile()</tt>. The second argument specifies the
string or character buffer to load and must be a complete XML
document including the <tt>?xml</tt> element if the parent node
is <tt>NULL</tt>.</p>
<!-- NEED 15 -->
<h2>Saving XML</h2>
<p>You save an XML file using the <a
href='#mxmlSaveFile'><tt>mxmlSaveFile</tt></a> function:</p>
<pre>
FILE *fp;
mxml_node_t *tree;
fp = fopen("filename.xml", "w");
mxmlSaveFile(tree, fp, MXML_NO_CALLBACK);
fclose(fp);
</pre>
<p>The first argument is the XML node tree to save. It should
normally be a pointer to the top-level <tt>?xml</tt> node in
your XML document.</p>
<p>The second argument is the stdio file to write to, as opened
by <tt>fopen()</tt> or <tt>popen()</tt>. You can also use
<tt>stdout</tt> if you are implementing an XML filter
program.</p>
<p>The third argument is the whitespace callback to use when
saving the file. Whitespace callbacks are covered in detail in <a
href='SAVE_CALLBACKS'>Chapter 3</a>. The previous example code
uses the <tt>MXML_NO_CALLBACK</tt> constant to specify that no
special whitespace handling is required.</p>
<p>The <a
href='#mxmlSaveAllocString'><tt>mxmlSaveAllocString</tt></a>,
and <a href='#mxmlSaveString'><tt>mxmlSaveString</tt></a>
functions save XML node trees to strings:</p>
<pre>
char buffer[8192];
char *ptr;
mxml_node_t *tree;
...
mxmlSaveString(tree, buffer, sizeof(buffer),
MXML_NO_CALLBACK);
...
ptr = mxmlSaveAllocString(tree, MXML_NO_CALLBACK);
</pre>
<p>The first and last arguments are the same as used for
<tt>mxmlSaveFile()</tt>. The <tt>mxmlSaveString</tt> function
takes pointer and size arguments for saving the XML document to
a fixed-size buffer, while <tt>mxmlSaveAllocString()</tt>
returns a string buffer that was allocated using
<tt>malloc()</tt>.</p>
<!-- NEED 15 -->
<h3>Controlling Line Wrapping</h3>
<p>When saving XML documents, Mini-XML normally wraps output
lines at column 75 so that the text is readable in terminal
windows. The <a
href='#mxmlSetWrapMargin'><tt>mxmlSetWrapMargin</tt></a> function
overrides the default wrap margin:</p>
<pre>
/* Set the margin to 132 columns */
mxmlSetWrapMargin(132);
/* Disable wrapping */
mxmlSetWrapMargin(0);
</pre>
<h2>Memory Management</h2>
<p>Once you are done with the XML data, use the <a
href="#mxmlDelete"><tt>mxmlDelete</tt></a> function to recursively
free the memory that is used for a particular node or the entire
tree:</p>
<pre>
mxmlDelete(tree);
</pre>
<p>You can also use reference counting to manage memory usage. The
<a href="#mxmlRetain"><tt>mxmlRetain</tt></a> and
<a href="#mxmlRelease"><tt>mxmlRelease</tt></a> functions increment and
decrement a node's use count, respectively. When the use count goes to 0,
<tt>mxmlRelease</tt> will automatically call <tt>mxmlDelete</tt> to actually
free the memory used by the node tree. New nodes automatically start with a
use count of 1.</p>
<!-- NEW PAGE-->
<h2>Finding and Iterating Nodes</h2>
<p>The <a
href='#mxmlWalkPrev'><tt>mxmlWalkPrev</tt></a>
and <a
href='#mxmlWalkNext'><tt>mxmlWalkNext</tt></a>functions
can be used to iterate through the XML node tree:</p>
<pre>
mxml_node_t *node;
node = mxmlWalkPrev(current, tree,
MXML_DESCEND);
node = mxmlWalkNext(current, tree,
MXML_DESCEND);
</pre>
<p>In addition, you can find a named element/node using the <a
href='#mxmlFindElement'><tt>mxmlFindElement</tt></a>
function:</p>
<pre>
mxml_node_t *node;
node = mxmlFindElement(tree, tree, "name",
"attr", "value",
MXML_DESCEND);
</pre>
<p>The <tt>name</tt>, <tt>attr</tt>, and <tt>value</tt>
arguments can be passed as <tt>NULL</tt> to act as wildcards,
e.g.:</p>
<!-- NEED 4 -->
<pre>
/* Find the first "a" element */
node = mxmlFindElement(tree, tree, "a",
NULL, NULL,
MXML_DESCEND);
</pre>
<!-- NEED 5 -->
<pre>
/* Find the first "a" element with "href"
attribute */
node = mxmlFindElement(tree, tree, "a",
"href", NULL,
MXML_DESCEND);
</pre>
<!-- NEED 6 -->
<pre>
/* Find the first "a" element with "href"
to a URL */
node = mxmlFindElement(tree, tree, "a",
"href",
"http://www.easysw.com/",
MXML_DESCEND);
</pre>
<!-- NEED 5 -->
<pre>
/* Find the first element with a "src"
attribute */
node = mxmlFindElement(tree, tree, NULL,
"src", NULL,
MXML_DESCEND);
</pre>
<!-- NEED 5 -->
<pre>
/* Find the first element with a "src"
= "foo.jpg" */
node = mxmlFindElement(tree, tree, NULL,
"src", "foo.jpg",
MXML_DESCEND);
</pre>
<p>You can also iterate with the same function:</p>
<pre>
mxml_node_t *node;
for (node = mxmlFindElement(tree, tree,
"name",
NULL, NULL,
MXML_DESCEND);
node != NULL;
node = mxmlFindElement(node, tree,
"name",
NULL, NULL,
MXML_DESCEND))
{
... do something ...
}
</pre>
<!-- NEED 10 -->
<p>The <tt>MXML_DESCEND</tt> argument can actually be one of
three constants:</p>
<ul>
<li><tt>MXML_NO_DESCEND</tt> means to not to look at any
child nodes in the element hierarchy, just look at
siblings at the same level or parent nodes until the top
node or top-of-tree is reached.
<p>The previous node from "group" would be the "node"
element to the left, while the next node from "group" would
be the "node" element to the right.<br><br></p></li>
<li><tt>MXML_DESCEND_FIRST</tt> means that it is OK to
descend to the first child of a node, but not to descend
further when searching. You'll normally use this when
iterating through direct children of a parent node, e.g. all
of the "node" and "group" elements under the "?xml" parent
node in the example above.
<p>This mode is only applicable to the search function; the
walk functions treat this as <tt>MXML_DESCEND</tt> since
every call is a first time.<br><br></p></li>
<li><tt>MXML_DESCEND</tt> means to keep descending until
you hit the bottom of the tree. The previous node from
"group" would be the "val3" node and the next node would
be the first node element under "group".
<p>If you were to walk from the root node "?xml" to the end
of the tree with <tt>mxmlWalkNext()</tt>, the order would
be:</p>
<p><tt>?xml data node val1 node val2 node val3 group node
val4 node val5 node val6 node val7 node val8</tt></p>
<p>If you started at "val8" and walked using
<tt>mxmlWalkPrev()</tt>, the order would be reversed,
ending at "?xml".</p></li>
</ul>
<h2>Finding Specific Nodes</h2>
<p>You can find specific nodes in the tree using the <a
href='#mxmlFindValue'><tt>mxmlFindPath</tt></a>, for example:
<pre>
mxml_node_t *value;
value = mxmlFindPath(tree, "path/to/*/foo/bar");
</pre>
<p>The second argument is a "path" to the parent node. Each component of the
path is separated by a slash (/) and represents a named element in the document
tree or a wildcard (*) path representing 0 or more intervening nodes.</p>
</body>
</html>
|