There are six tools. They are all simple filters, reading information from standard input in one format and writing the same information to standard output in a different format.
| Tool name | Input | Output |
|---|---|---|
| xml2 | XML | Flat |
| html2 | HTML | Flat |
| csv2 | CSV | Flat |
| 2xml | Flat | XML |
| 2html | Flat | HTML |
| 2csv | Flat | CSV |
The ``Flat'' format is specific to these tools. It is a syntax for representing structured markup in a way that makes it easy to process with line-oriented tools. The same format is used for HTML, XML, and CSV; in fact, you can think of html2 as converting HTML to XHTML and running xml2 on the result; likewise 2html and 2xml.
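
To make the "line-oriented" claim concrete, here is a minimal Python sketch of consuming Flat output one line at a time. The helper names `parse_flat` and `values_at` are made up for this illustration and are not part of the tools. The sketch assumes that the first `=` on a line separates the path from the value, which holds for the examples in this document.

```python
def parse_flat(line):
    """Split one Flat line into (path, value); value is None for a bare path."""
    path, sep, value = line.partition("=")
    return path, (value if sep else None)


def values_at(lines, wanted):
    """Return the values of every line whose path is exactly `wanted`."""
    return [value for path, value in map(parse_flat, lines) if path == wanted]


# Sample Flat lines, taken from the person example in this document.
FLAT = [
    "/person/name=Juan Doé",
    "/person/address=123 Camino Real",
    "/person/address/city=El Dorado",
    "/person/important",
]

print(values_at(FLAT, "/person/address/city"))
```

Because every line carries its full path, a plain text filter can select data without tracking nesting, which is what makes tools like grep and sed a good fit.
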
CSV (comma-separated values) files are less expressive than XML or HTML (CSV has no hierarchy), so xml2 | 2csv is a lossy conversion.
To use these tools effectively, it's important to understand the ``Flat'' format. Unfortunately, I'm lazy and sloppy; rather than provide a precise definition of the relationship between XML and ``Flat'', I will simply give you a pile of examples and hope you can generalize correctly. (Good luck!)
| XML | Flat equivalent |
|---|---|
| `<thing/>` | `/thing` |
| `<thing><subthing/></thing>` | `/thing/subthing` |
| `<thing>stuff</thing>` | `/thing=stuff` |
| `<thing>`<br>`<subthing>substuff</subthing>`<br>`stuff`<br>`</thing>` | `/thing/subthing=substuff`<br>`/thing=stuff` |
| `<person>`<br>`<name>Juan Doé</name>`<br>`<occupation>Zillionaire</occupation>`<br>`<pet>Dogcow</pet>`<br>`<address>`<br>`123 Camino Real`<br>`<city>El Dorado</city>`<br>`<state>AZ</state>`<br>`<zip>12345</zip>`<br>`</address>`<br>`<important/>`<br>`</person>` | `/person/name=Juan Doé`<br>`/person/occupation=Zillionaire`<br>`/person/pet=Dogcow`<br>`/person/address=123 Camino Real`<br>`/person/address/city=El Dorado`<br>`/person/address/state=AZ`<br>`/person/address/zip=12345`<br>`/person/important` |
| `<collection>`<br>`<group>`<br>`<thing>stuff</thing>`<br>`<thing>stuff</thing>`<br>`</group>`<br>`</collection>` | `/collection/group/thing=stuff`<br>`/collection/group/thing`<br>`/collection/group/thing=stuff` |
| `<collection>`<br>`<group>`<br>`<thing>stuff</thing>`<br>`</group>`<br>`<group>`<br>`<thing>stuff</thing>`<br>`</group>`<br>`</collection>` | `/collection/group/thing=stuff`<br>`/collection/group`<br>`/collection/group/thing=stuff` |
| `<thing>`<br>`stuff`<br><br>`more stuff`<br>`&lt;other stuff&gt;`<br>`</thing>` | `/thing=stuff`<br>`/thing=`<br>`/thing=more stuff`<br>`/thing=<other stuff>` |
| `<thing flag="value">stuff</thing>` | `/thing/@flag=value`<br>`/thing=stuff` |
| `<?processing instruction?>`<br>`<thing/>` | `/?processing=instruction`<br>`/thing` |
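
For readers who generalize better from code than from examples, the following Python sketch reproduces most of the rows in the table above. It is not how xml2 is implemented. The rules are inferred from the examples alone, and the names `text_lines`, `flatten`, and `xml_to_flat` are hypothetical. Two simplifications are assumptions of mine: a bare path is emitted whenever an element has the same name as the sibling element just before it, and processing instructions and comments are ignored (Python's default parser drops them).

```python
import xml.etree.ElementTree as ET


def text_lines(path, text):
    """Yield one `path=line` entry per line of non-blank character data."""
    if text is None or not text.strip():
        return
    for line in text.strip().splitlines():
        yield f"{path}={line.strip()}"


def flatten(elem, path=""):
    """Yield Flat lines for `elem` in document order (simplified rules)."""
    here = f"{path}/{elem.tag}"
    emitted = False
    for name, value in elem.attrib.items():
        emitted = True
        yield f"{here}/@{name}={value}"
    for line in text_lines(here, elem.text):
        emitted = True
        yield line
    previous_tag = None
    for child in elem:
        # Assumed rule: a bare path separates same-named adjacent siblings.
        if child.tag == previous_tag:
            yield f"{here}/{child.tag}"
        previous_tag = child.tag
        emitted = True
        yield from flatten(child, here)
        # Text after a child's end tag belongs to the parent element.
        yield from text_lines(here, child.tail)
    if not emitted:
        # An element with no attributes, text, or children is just its path.
        yield here


def xml_to_flat(xml_text):
    """Parse an XML string and return its Flat lines as a list."""
    return list(flatten(ET.fromstring(xml_text)))


for line in xml_to_flat('<thing flag="value">stuff</thing>'):
    print(line)
```

Running the attribute example prints `/thing/@flag=value` followed by `/thing=stuff`, matching the table. Inputs outside the table's examples may well differ from what the real xml2 produces.
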
(TO DO: Add equivalent examples for CSV files.)