<?xml version="1.0" encoding="ISO-8859-1"?>
<?IS10744:arch name ="persons"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "">
<html>
<head>
<title>xmlarch.py: An XML architectural forms processor</title>
<meta name="Author" content="Geir Ove Grønmo"></meta>
</head>
<body>
<h1>xmlarch.py: An XML architectural forms processor</h1>
<table bgcolor="yellow">
<tr bgcolor="#FF0010"><th align="left">Version:</th><td>0.11</td></tr>
<tr><th align="left">Author:</th><td persons="author">Geir Ove Grønmo</td></tr>
<tr><th align="left">Email:</th><td>grove@infotek.no</td></tr>
<tr><th align="left">Released:</th><td>September 15th 1998</td></tr>
</table>
<h2>What's new?</h2>
<h3>Version 0.11</h3>
<small>
<p>There are no new features in this release. The module should now be placed in the xml.arch package. Unfortunately you have to move xmlarch.py yourself after installation. The demo tools have been updated to support this new package structure.</p>
<p>The problem with <code>&lt;?IS10744 arch ...?&gt;</code> not being recognized as an architecture use declaration has been fixed. Both <code>&lt;?IS10744:arch ...?&gt;</code> and <code>&lt;?IS10744 arch ...?&gt;</code> are now supported.</p>
<p>get_bridge_form() was called get_bridge_elem_form() in a couple of places. This is now fixed.</p>
</small>
<h2>What is xmlarch.py?</h2>
<p>The xmlarch module contains an <a href="http://www.ornl.gov/sgml/wg8/docs/n1920/html/clause-A.3.html">XML architectural forms</a> processor written in <a href="http://www.python.org/">Python</a>. It allows you to process XML architectural forms using any parser that uses the <a href="http://www.ifi.uio.no/~larsga/download/python/xml/saxlib.html">SAX interfaces</a>. The module allows you to process several architectures in one parse pass. Architectural document events for an architecture can even be broadcast to multiple DocumentHandlers (e.g. you can have two handlers for the RDF architecture, three for the XLink architecture and perhaps one for the HyTime architecture).</p>
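<p>The idea of broadcasting one event stream to several handlers can be sketched as follows. This is only an illustration and not how xmlarch itself is implemented: it assumes the <code>xml.sax</code> package from a current Python standard library (<code>ContentHandler</code> rather than the SAX 1.0 <code>DocumentHandler</code>), and the class name <code>Broadcaster</code> is hypothetical.</p>

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import XMLGenerator


class Broadcaster(ContentHandler):
    """Hypothetical helper: forwards each SAX event to every registered handler."""

    def __init__(self, handlers):
        super().__init__()
        self.handlers = list(handlers)

    def startDocument(self):
        for handler in self.handlers:
            handler.startDocument()

    def endDocument(self):
        for handler in self.handlers:
            handler.endDocument()

    def startElement(self, name, attrs):
        for handler in self.handlers:
            handler.startElement(name, attrs)

    def endElement(self, name):
        for handler in self.handlers:
            handler.endElement(name)

    def characters(self, content):
        for handler in self.handlers:
            handler.characters(content)


# Two independent outputs receive the same stream of events.
first, second = io.StringIO(), io.StringIO()
fanout = Broadcaster([XMLGenerator(first, encoding="utf-8"),
                      XMLGenerator(second, encoding="utf-8")])
xml.sax.parseString(b"<doc><p>Hello</p></doc>", fanout)
```

<p>In xmlarch the same effect is obtained by registering more than one handler for the same architecture name.</p>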
<p>The architecture processor implements the SAX DocumentHandler interface, which means that you can register the architecture handler (ArchDocHandler) with any SAX 1.0 compliant parser.</p>
<p>It currently does not process any meta document type definitions (meta-DTDs). When a DTD parser module is available, I will use it to process meta-DTD information.</p>
<p>Please note that validating and well-formed parsers may report different SAX events when parsing documents.</p>
<h2>What does the xmlarch module contain?</h2>
<p>xmlarch.py contains six classes: ArchDocHandler, Architecture, ArchParseState, ArchException, AttributeParser and Normalizer.</p>
<ul>
<li><b>ArchDocHandler</b> is a subclass of saxlib.DocumentHandler. This is the class used for processing an architectural document.</li>
<li><b>Architecture</b> contains information about an architecture.</li>
<li><b>ArchParseState</b> holds information about an architecture's parse state when parsing a document.</li>
<li><b>AttributeParser</b> parses architecture use declaration PIs (attribute strings).</li>
<li><b>ArchException</b> holds information about an architectural exception raised by the ArchDocHandler.</li>
<li><b>Normalizer</b> is a document handler that outputs "normalized" XML.</li>
</ul>
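<p>To illustrate what parsing an architecture use declaration involves, here is a minimal sketch. It is not the AttributeParser class itself: the function name <code>parse_arch_use_pi</code> and the regular expression are assumptions made for this example. It accepts both the <code>&lt;?IS10744:arch ...?&gt;</code> and the <code>&lt;?IS10744 arch ...?&gt;</code> form, given the target and data strings that a SAX parser reports for a processing instruction.</p>

```python
import re

# Matches name="value" or name='value' pairs in the PI data string.
_ATTR = re.compile(r"""([\w.:-]+)\s*=\s*(["'])(.*?)\2""")


def parse_arch_use_pi(target, data):
    """Return the PI's attributes as a dict, or None if the PI is not
    an architecture use declaration. (Hypothetical helper.)"""
    if target == "IS10744:arch":
        # Form 1: <?IS10744:arch name="html"?>
        attr_string = data
    elif target == "IS10744":
        # Form 2: <?IS10744 arch name="html"?>
        parts = data.split(None, 1)
        if not parts or parts[0] != "arch":
            return None
        attr_string = parts[1] if len(parts) > 1 else ""
    else:
        return None
    return {name: value for name, _quote, value in _ATTR.findall(attr_string)}


colon_form = parse_arch_use_pi("IS10744:arch", 'name="html"')
space_form = parse_arch_use_pi("IS10744", 'arch name="html"')
unrelated = parse_arch_use_pi("xml-stylesheet", 'href="style.css"')
```

<p>Both declaration forms yield the same dictionary, while unrelated processing instructions are ignored.</p>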
<h2>Using the xmlarch module</h2>
<p>Using the xmlarch module usually involves the following steps:</p>
<ul>
<li>Import the required SAX modules: saxexts, saxlib and saxutils.</li>
<li>Import the xmlarch module.</li>
<li>Create a SAX compliant parser object.</li>
<li>Create an XML architectures processor handler.</li>
<li>Register this handler with the parser.</li>
<li>Add document handlers for the architectures you want to process.</li>
<li>Register a default document handler with the architecture processor handler.</li>
<li>Parse a document.</li>
</ul>
<h2>A simple example</h2>
<table cellpadding="5" bgcolor="#00DD11">
<tr bgcolor="#FF0000"><td>Python code</td></tr>
<tr><td>
<pre><code>
# Import needed modules
from xml.sax import saxexts, saxlib, saxutils
import sys, xmlarch
# Create architecture processor handler
arch_handler = xmlarch.ArchDocHandler()
# Create parser and register architecture processor with it
parser = saxexts.XMLParserFactory.make_parser()
parser.setDocumentHandler(arch_handler)
# Add a document handler to process the html architecture
arch_handler.addArchDocumentHandler("html", xmlarch.Normalizer(sys.stdout))
# Parse (and process) the document
parser.parse("simple.xml")
</code></pre>
</td></tr></table>
<p></p>
<table cellpadding="5" bgcolor="#FFFF00">
<tr bgcolor="#FF0000"><td>XML document</td></tr>
<tr><td><pre><code>
&lt;?xml version="1.0"?&gt;
&lt;?IS10744:arch name="html"?&gt;
&lt;doc&gt;
&lt;title html="h1"&gt;My first architectual document&lt;/title&gt;
&lt;author html="address"&gt;Geir Ove Gronmo, grove@infotek.no&lt;/author&gt;
&lt;para&gt;This is the first paragraph in this document&lt;/para&gt;
&lt;para html="p"&gt;This is the second paragraph&lt;/para&gt;
&lt;/doc&gt;
</code></pre>
</td></tr></table>
<table cellpadding="5" bgcolor="#FFFF00">
<tr bgcolor="#FF0000"><td>Result</td></tr>
<tr><td><pre><code>
&lt;html&gt;
&lt;h1&gt;My first architectual document&lt;/h1&gt;
&lt;address&gt;Geir Ove Gronmo, grove@infotek.no&lt;/address&gt;
&lt;p&gt;This is the second paragraph&lt;/p&gt;
&lt;/html&gt;
</code></pre>
</td></tr></table>
<p>See also the files simple.py and simple.xml in the distribution.</p>
<p>If you try to process the persons architecture in <a href="http://www.infotek.no/~grove/software/xmlarch/xmlarch.html">this document</a> instead you get the following output:</p>
<table cellpadding="5" bgcolor="#FFFF00">
<tr bgcolor="#FF0000"><td>Result</td></tr>
<tr><td><pre><code>
&lt;persons&gt;
&lt;author&gt;Geir Ove Grønmo&lt;/author&gt;&lt;mentioned&gt;Eliot Kimber&lt;/mentioned&gt;&lt;mentioned&gt;David Megginson&lt;/mentioned&gt;&lt;mentioned&gt;Lars Marius Garshol&lt;/mentioned&gt;
&lt;/persons&gt;
</code></pre>
</td></tr></table>
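<p>To make the mapping in the simple example more concrete, the following sketch shows one way such a transformation could work. It is not the xmlarch implementation. It assumes the <code>xml.sax</code> package from a current Python standard library, the class name <code>ArchSketch</code> is hypothetical, and the rules it applies are assumptions inferred from the results shown above: the document element is renamed to the architecture name, other elements are mapped through the attribute named after the architecture, and character data is passed on only when its parent element is architectural.</p>

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import escape

SIMPLE_XML = b"""<?xml version="1.0"?>
<?IS10744:arch name="html"?>
<doc>
<title html="h1">My first architectual document</title>
<author html="address">Geir Ove Gronmo, grove@infotek.no</author>
<para>This is the first paragraph in this document</para>
<para html="p">This is the second paragraph</para>
</doc>
"""


class ArchSketch(ContentHandler):
    """Illustrative only: derives one architectural document by attribute."""

    def __init__(self, out):
        super().__init__()
        self.out = out
        self.arch = None   # architecture name taken from the use declaration
        self.stack = []    # architectural name (or None) for each open element

    def processingInstruction(self, target, data):
        # Only the <?IS10744:arch name="..."?> form is handled in this sketch.
        if target == "IS10744:arch":
            self.arch = data.split("=", 1)[1].strip().strip("\"'")

    def startElement(self, name, attrs):
        if not self.stack:
            form = self.arch              # document element: architecture name
        else:
            form = attrs.get(self.arch)   # e.g. html="h1" maps to h1
        self.stack.append(form)
        if form:
            self.out.write("<%s>" % form)

    def endElement(self, name):
        form = self.stack.pop()
        if form:
            self.out.write("</%s>" % form)

    def characters(self, content):
        # Drop data whose parent element is not architectural.
        if self.stack and self.stack[-1]:
            self.out.write(escape(content))


out = io.StringIO()
xml.sax.parseString(SIMPLE_XML, ArchSketch(out))
result = out.getvalue()
print(result)
```

<p>The output contains the same elements as the result of the simple example, although whitespace between elements may differ from what the Normalizer produces.</p>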
<h2>A more complex example</h2>
<table cellpadding="5" bgcolor="#00DD11">
<tr bgcolor="#FF0000"><td>Python code</td></tr>
<tr><td>
<pre><code>
# Import needed modules
from xml.sax import saxexts, saxlib, saxutils
import sys, xmlarch
# create architecture processor handler
arch_handler = xmlarch.ArchDocHandler()
# Create parser and register architecture processor with it
parser = saxexts.XMLParserFactory.make_parser()
parser.setDocumentHandler(arch_handler)
# Add document handlers to process the html and biblio architectures
arch_handler.addArchDocumentHandler("html", xmlarch.Normalizer(open("html.out", "w")))
arch_handler.addArchDocumentHandler("biblio", saxutils.ESISDocHandler(open("biblio1.out", "w")))
arch_handler.addArchDocumentHandler("biblio", saxutils.Canonizer(open("biblio2.out", "w")))
# Register a default document handler that just passes through any incoming events
arch_handler.setDefaultDocumentHandler(xmlarch.Normalizer(sys.stdout))
# Parse (and process) the document
parser.parse("complex.xml")
</code></pre>
</td></tr></table>
<p>Because this produces a lot of output, I have not included the XML document or the results here. Instead, see the files complex.py and complex.xml in the distribution and try it yourself.</p>
<h2>Testing the xmlarch module</h2>
<p>The distribution also contains test scripts. archtest.py can be run from the command line. It takes two arguments: the name of the architecture to process, followed by the XML document to process. The result is printed on stdout as a normalized document. You can also pass the --debug flag to write debug information to stderr.</p>
<p>Example I: <code>python archtest.py html simple.xml</code></p>
<p>Example II: <code>python archtest.py --debug biblio complex.xml</code></p>
<p>Example III: <code>python archtest.py persons http://www.infotek.no/~grove/software/xmlarch/xmlarch.html</code></p>
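<p>The command line described above could be modelled roughly like this. The sketch uses <code>argparse</code> from a current Python standard library and is only an assumption about the interface. The real archtest.py is not reproduced here, and the function name <code>build_parser</code> is hypothetical.</p>

```python
import argparse


def build_parser():
    """Hypothetical CLI definition mirroring the usage described above."""
    parser = argparse.ArgumentParser(prog="archtest.py")
    parser.add_argument("--debug", action="store_true",
                        help="write debug information to stderr")
    parser.add_argument("architecture",
                        help="name of the architecture to process")
    parser.add_argument("document",
                        help="XML document (file name or URL) to process")
    return parser


# Corresponds to Example II above.
args = build_parser().parse_args(["--debug", "biblio", "complex.xml"])
print(args.architecture, args.document, args.debug)
```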
<h2>Download</h2>
<p>You can get it <a href="xmlarch.zip">here</a>.</p>
<h2>Related information</h2>
<ul>
<li><a href="http://www.ornl.gov/sgml/wg8/docs/n1920/html/clause-A.3.html">Architectural Form Definition Requirements [AFDR]</a></li>
<li><a href="http://www.isogen.com/papers/archintro.html"><i persons="mentioned">Eliot Kimber</i>'s "A Tutorial to SGML Architectures"</a></li>
<li><a href="http://www.sil.org/sgml/topics.html#archForms">The SGML/XML Web Page: Architectural Forms and SGML/XML Architectures</a></li>
<li><a href="http://www.megginson.com/XAF/index.html"><i persons="mentioned">David Megginson</i>'s XAF package for Java</a></li>
<li><a href="http://www.megginson.com/SAX/index.html">SAX: The Simple API for XML</a></li>
<li><a href="http://www.ifi.uio.no/~larsga/download/python/xml/saxlib.html"><i persons="mentioned">Lars Marius Garshol</i>'s SAX for Python</a></li>
</ul>
<h2>Feedback -- bug reports, features and improvements</h2>
<p>
I would very much welcome any feedback on any issue regarding this piece of software. Feedback should be sent to <a href="mailto:grove@infotek.no">grove@infotek.no</a>.
</p>
<hr></hr>
<address>
Last modified by Geir O. Grønmo, grove@infotek.no
</address>
</body>
</html>