#!/usr/bin/env python
"""Parses a simple but verbose XML representation of a phylogenetic tree, with
elements <clade>, <name>, <param> and <value>. XML attributes are not used,
so the syntax of parameter names is not restricted at all.

Newick
------
((a,b:3),c);

XML
---
<clade>
  <clade>
    <clade>
      <name>a</name>
    </clade>
    <clade>
      <name>b</name>
      <param><name>length</name><value>3.0</value></param>
    </clade>
  </clade>
  <clade>
    <name>c</name>
  </clade>
</clade>

Parameters are inherited by contained clades unless overridden.
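
The inheritance rule can be illustrated on its own. The following is only a
minimal sketch, not this module's implementation: it assumes
xml.etree.ElementTree instead of the SAX handler defined below, and
effective_params is a hypothetical helper introduced for illustration. Unlike
the SAX handler, it collects all of a clade's params before visiting its
child clades, regardless of document order.

```python
import xml.etree.ElementTree as ET

# Outer clade sets length=1.0; clade b overrides it, clade a inherits it.
XML = '''
<clade>
  <param><name>length</name><value>1.0</value></param>
  <clade>
    <name>a</name>
  </clade>
  <clade>
    <name>b</name>
    <param><name>length</name><value>3.0</value></param>
  </clade>
</clade>
'''


def effective_params(clade, inherited=None):
    # Hypothetical helper: map each named clade to its effective params.
    params = dict(inherited or {})  # start from the enclosing clade's params
    for param in clade.findall('param'):
        value = param.findtext('value').strip()
        key = param.findtext('name').strip()
        params[key] = None if value == 'None' else float(value)
    result = {}
    name = clade.findtext('name')  # direct child <name> only
    if name is not None:
        result[name.strip()] = params
    for child in clade.findall('clade'):
        result.update(effective_params(child, params))
    return result


print(effective_params(ET.fromstring(XML)))
```

Here clade a ends up with length 1.0 from the enclosing clade, while clade b
keeps its own value of 3.0.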
"""
import xml.sax

__author__ = "Peter Maxwell"
__copyright__ = "Copyright 2007-2012, The Cogent Project"
__credits__ = ["Peter Maxwell", "Gavin Huttley"]
__license__ = "GPL"
__version__ = "1.5.3"
__maintainer__ = "Peter Maxwell"
__email__ = "pm67nz@gmail.com"
__status__ = "Production"

class TreeHandler(xml.sax.ContentHandler):
    """SAX handler that calls tree_builder(clades, name, params) per clade."""

    def __init__(self, tree_builder):
        self.build_edge = tree_builder

    def startDocument(self):
        # Each stack entry saves (data, in_clade, current) of an enclosing element.
        self.stack = [({}, None, None)]
        self.data = {'clades': [], 'params': {}}
        self.in_clade = False
        self.current = None

    def startElement(self, name, attrs):
        self.parent = self.data
        self.stack.append((self.data, self.in_clade, self.current))
        self.current = ""
        if name == "clade":
            # A clade starts from a copy of the enclosing params (inheritance).
            self.data = {'params': self.data['params'].copy(),
                         'clades': [], 'name': None}
            self.in_clade = True
        else:
            self.data = {}
            self.in_clade = False

    def characters(self, text):
        # Text may arrive in several chunks, so accumulate it.
        self.current += str(text)

    def endElement(self, name):
        # Dispatch to process_<name>, passing the collected child data as keywords.
        getattr(self, 'process_%s' % name)(self.current, **self.data)
        (self.data, self.in_clade, self.current) = self.stack.pop()
        self.parent = self.stack[-1][0]

    def endDocument(self):
        pass

    def process_clade(self, text, name, params, clades):
        edge = self.build_edge(clades, name, params)
        self.parent['clades'].append(edge)

    def process_param(self, text, name, value):
        self.parent['params'][name] = value

    def process_name(self, text):
        self.parent['name'] = text.strip()

    def process_value(self, text):
        # The literal text "None" stands for a missing value.
        if text == "None":
            self.parent['value'] = None
        else:
            self.parent['value'] = float(text)


def parse_string(text, tree_builder):
    handler = TreeHandler(tree_builder)
    xml.sax.parseString(text, handler)
    trees = handler.data['clades']
    # The document must contain exactly one top-level clade.
    assert len(trees) == 1, trees
    return trees[0]