<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<meta name="language" content="fr">
<meta name="author" content="Logilab">
<meta name="organization" content="Logilab S.A.">
<meta name="generator" content="Logilab Powerful Stylesheets (v3)">
<title>XmlDiff API</title>
<meta name="keywords" content="logilab">
<link rel="stylesheet" href="http://www.logilab.fr/lglb-publi-content.css" type="text/css">
<link rel="stylesheet" href="http://www.logilab.fr/lglb-publi-structure.css" type="text/css">
</head>
<body>
<table class="header" cellspacing="0"><tbody><tr>
<td class="logo"><a href="http://www.logilab.fr/"><img src="http://www.logilab.fr/images/logilab.png" alt="Logilab" height="75"></a></td>
<td class="text"><div class="header-title">XmlDiff API</div></td>
</tr></tbody></table>
<div class="header-sep"></div>
<table class="main" cellspacing="0"><tbody><tr>
<td class="left-margin"></td>
<td class="body">
<div class="component-title-block"><div class="component-title">XmlDiff API</div></div>


<div class="sect1-title">
<a name="id2321560"></a>1. Contents</div>

<ul class="list">
<li class="listitem">
<div class="para">
mydifflib.py</div>
</li>
<li class="listitem">
<div class="para">
input.py</div>
</li>
<li class="listitem">
<div class="para">
fmes.py</div>
</li>
<li class="listitem">
<div class="para">
ezs.py ** DEPRECATED **</div>
</li>
<li class="listitem">
<div class="para">
format.py</div>
</li>
</ul>

<div class="para">To use this package as a library, you need the Python modules
described below.</div>
<div class="sect1-title">
<a name="id2321641"></a>2. mydifflib.py</div>

<div class="para">Provides functions for computing the longest common subsequence (LCS).</div>
<div class="variablelist">
<div class="varlistentry">
<div class="varterm">
<span class="varname">lcs2(X, Y, equal):</span>
</div>
<div class="varlistitem">
<div class="para">Applies the greedy LCS/SES algorithm to the sequences X and Y
(any Python sequences).
equal is a comparison function that takes an element of X and an element of Y;
it must return 0 (or any Python false value) if they differ, and 1 (or any
Python true value) if they are identical.
Returns a list of matched pairs as tuples.</div>
</div>
</div>
<div class="varlistentry">
<div class="varterm">
<span class="varname">lcsl(X, Y, equal):</span>
</div>
<div class="varlistitem">
<div class="para">Same as lcs2, but returns the length of the LCS.</div>
</div>
</div>
<div class="varlistentry">
<div class="varterm">
<span class="varname">quick_ratio(a,b):</span>
</div>
<div class="varlistitem">
<div class="para">Optimized version of quick_ratio from the standard difflib.py
(without junk handling and without the class).
Returns an upper bound on ratio() relatively quickly.</div>
</div>
</div>
</div>
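<div class="para">To illustrate the calling convention described above (two sequences plus an equal callback, returning matched pairs), here is a minimal self-contained sketch. It is not the greedy LCS/SES algorithm used by mydifflib.py: it uses plain dynamic programming, and the names lcs_pairs and lcs_length are hypothetical. Returning pairs of elements (rather than pairs of indices) is also an assumption.</div>

```python
def lcs_pairs(X, Y, equal):
    """Return one longest common subsequence of X and Y as (x, y) pairs.

    Illustrative dynamic-programming version, NOT the library's greedy code.
    """
    n, m = len(X), len(Y)
    # table[i][j] holds the LCS length of the suffixes X[i:] and Y[j:]
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if equal(X[i], Y[j]):
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    # Walk the table from the top-left corner to collect the matched pairs.
    pairs = []
    i = j = 0
    while i != n and j != m:
        if equal(X[i], Y[j]):
            pairs.append((X[i], Y[j]))
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return pairs


def lcs_length(X, Y, equal):
    """Length-only variant, in the spirit of lcsl."""
    return len(lcs_pairs(X, Y, equal))


def same_letter(a, b):
    """Example equal callback: case-insensitive comparison."""
    return a.lower() == b.lower()


matched = lcs_pairs("abcd", "AxCd", same_letter)
```

<div class="para">Because the comparison is delegated to the callback, the same routine can match characters, strings or tree nodes, as long as equal returns a true value for elements considered identical.</div>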

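<div class="para">The upper-bound property of quick_ratio can be observed with the standard library version that mydifflib.py is derived from. The sketch below uses difflib.SequenceMatcher rather than the xmldiff function, and the sample strings are arbitrary.</div>

```python
import difflib

old_text = "the quick brown fox"
new_text = "the quick brown dog"

matcher = difflib.SequenceMatcher(None, old_text, new_text)

# quick_ratio() only compares character counts, so it is cheap to compute
# and never smaller than the exact similarity returned by ratio().
upper_bound = matcher.quick_ratio()
exact = matcher.ratio()
```

<div class="para">A caller can therefore skip the more expensive exact computation whenever the upper bound is already below its similarity threshold.</div>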
<div class="sect1-title">
<a name="id2289679"></a>3. input.py</div>

<div class="para">Provides functions for converting a DOM tree or an XML file into the
internal tree processed by the xmldiff functions.</div>
<div class="variablelist">
<div class="varlistentry">
<div class="varterm">
<span class="varname">tree_from_stream(stream, norm_sp=1, ext_ges=0, ext_pes=0, include_comment=1, encoding='UTF-8'):</span>
</div>
<div class="varlistitem">
<div class="para">Creates and returns an internal tree from an XML stream (an open file
or a StringIO object).
If norm_sp = 1, spaces and newlines are normalized.
If ext_ges = 1, all external general (text) entities are included.
If ext_pes = 1, all external parameter entities are included, including the external DTD subset.
If include_comment = 1, comment nodes are included.
encoding specifies the encoding to use.</div>
</div>
</div>
<div class="varlistentry">
<div class="varterm">
<span class="varname">tree_from_dom(root):</span>
</div>
<div class="varlistitem">
<div class="para">Creates and returns an internal tree from a DOM subtree.</div>
</div>
</div>
</div>
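<div class="para">The two functions above take different kinds of input: a stream or a DOM subtree. The sketch below only prepares both kinds of object with the Python standard library; it does not call xmldiff. Whether tree_from_dom accepts nodes built by xml.dom.minidom is an assumption, and the element names are made up for the example.</div>

```python
import io
from xml.dom import minidom

# Build a tiny document with the standard library DOM API.
document = minidom.Document()
root = document.createElement("article")
title = document.createElement("title")
title.appendChild(document.createTextNode("XmlDiff API"))
root.appendChild(title)
document.appendChild(root)

# A DOM subtree: the kind of object tree_from_dom(root) is documented to take
# (assumption: a minidom element is an acceptable DOM node here).
dom_root = document.documentElement

# A stream: the kind of object tree_from_stream(stream) is documented to take.
stream = io.StringIO(dom_root.toxml())
```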

<div class="sect1-title">
<a name="id2289732"></a>4. fmes.py</div>

<div class="para">Fast Match / Edit Script algorithm (not guaranteed to find the minimum
edit cost, but able to handle large documents).</div>
<div class="para">Warning: the process_trees(tree1, tree2) method described below has a
side effect. After the call, the old tree (tree1) is equal to the new tree (tree2).</div>
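<div class="para">If the old tree is still needed after the diff, one way to cope with the side effect described above is to copy it first. The sketch below is only an illustration: process_in_place is a hypothetical stand-in for the real diff, the nested lists are a made-up tree representation, and it is an assumption that copy.deepcopy is suitable for the real internal trees.</div>

```python
import copy


def process_in_place(old_tree, new_tree):
    """Hypothetical stand-in: rewrites old_tree until it equals new_tree."""
    old_tree[:] = copy.deepcopy(new_tree)
    return ["update"]  # placeholder for an actions list


old_tree = ["doc", ["title", "draft"]]
new_tree = ["doc", ["title", "final"]]

# Keep a private copy of the old tree before running the diff.
old_snapshot = copy.deepcopy(old_tree)
actions = process_in_place(old_tree, new_tree)

# old_tree has been modified, but old_snapshot still holds the original.
```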
<div class="variablelist">
<div class="varlistentry">
<div class="varterm">
<span class="varname">class FmesCorrector(self, formatter, f=0.6, t=0.5):</span>
</div>
<div class="varlistitem">
<div class="para">Class implementing the fmes algorithm.
formatter is a class instance that handles the formatting of the edit
script (see format.py).
f and t are algorithm parameters, with 0 &lt; f &lt; 1 and 0.5 &lt; t &lt; 1;
in xmldiff, f = 0.59 and t = 0.5.</div>
</div>
</div>
<div class="varlistentry">
<div class="varterm">
<span class="varname">FmesCorrector.process_trees(self, tree1, tree2):</span>
</div>
<div class="varlistitem">
<div class="para">Runs the diff between the internal trees tree1 (old XML tree) and
tree2 (new XML tree).
Returns a list of actions.</div>
</div>
</div>
</div>

<div class="sect1-title">
<a name="id2289792"></a>5. ezs.py ** DEPRECATED **</div>

<div class="para">Extended Zhang and Shasha algorithm (provides the minimum edit cost,
but is too complex to be used with large documents).</div>
<div class="variablelist">
<div class="varlistentry">
<div class="varterm">
<span class="varname">class EzsCorrector(self):</span>
</div>
<div class="varlistitem">
<div class="para">Class implementing the ezs algorithm.</div>
</div>
</div>
<div class="varlistentry">
<div class="varterm">
<span class="varname">EzsCorrector.process_trees(self, tree1, tree2):</span>
</div>
<div class="varlistitem">
<div class="para">Runs the diff between the internal trees tree1 (old XML tree) and
tree2 (new XML tree).
Returns a list of actions.</div>
</div>
</div>
</div>

<div class="sect1-title">
<a name="id2289840"></a>6. format.py</div>

<div class="para">Provides classes for converting the output of the xmldiff algorithms
to a DOM tree, or for printing it in the native format or in the XML XUpdate
format. The formatter interface is the following:</div>
<div class="variablelist">
<div class="varlistentry">
<div class="varterm">
<span class="varname">class AbstractFormatter:</span>
</div>
<div class="varlistitem">
<div class="para">Abstract class designed to be overridden by concrete
formatters.</div>
</div>
</div>
<div class="varlistentry">
<div class="varterm">
<span class="varname">AbstractFormatter.init(self):</span>
</div>
<div class="varlistitem">
<div class="para">Method called before the beginning of the tree-to-tree
correction.</div>
</div>
</div>
<div class="varlistentry">
<div class="varterm">
<span class="varname">AbstractFormatter.add_action(self, action):</span>
</div>
<div class="varlistitem">
<div class="para">Method called when an action is added to the edit script.</div>
</div>
</div>
<div class="varlistentry">
<div class="varterm">
<span class="varname">AbstractFormatter.format_action(self, action):</span>
</div>
<div class="varlistitem">
<div class="para">Method called by end() to format each action in the edit
script.
Concrete formatters should override at least this method.</div>
</div>
</div>
<div class="varlistentry">
<div class="varterm">
<span class="varname">AbstractFormatter.end(self):</span>
</div>
<div class="varlistitem">
<div class="para">Method called at the end of the tree-to-tree correction.</div>
</div>
</div>
</div>
<div class="para">The concrete classes are InternalPrinter, XUpdatePrinter and
DOMXUpdateFormatter.</div>
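<div class="para">The formatter interface described above can be sketched as follows. The AbstractFormatter defined here is a stand-in that mirrors the documented method names, not the class shipped in format.py; CollectingFormatter and the shape of the action tuples are hypothetical and only serve the illustration.</div>

```python
class AbstractFormatter(object):
    """Stand-in mirroring the documented interface (not the real class)."""

    def init(self):
        # Called before the correction starts: reset the edit script.
        self.edit_script = []

    def add_action(self, action):
        # Called each time an action is added to the edit script.
        self.edit_script.append(action)

    def format_action(self, action):
        # Concrete formatters are expected to override at least this method.
        raise NotImplementedError

    def end(self):
        # Called at the end of the correction: format every recorded action.
        for action in self.edit_script:
            self.format_action(action)


class CollectingFormatter(AbstractFormatter):
    """Hypothetical concrete formatter keeping one text line per action."""

    def init(self):
        AbstractFormatter.init(self)
        self.lines = []

    def format_action(self, action):
        self.lines.append(" ".join(str(part) for part in action))


formatter = CollectingFormatter()
formatter.init()
formatter.add_action(("update", "/doc/title", "final"))  # made-up action
formatter.add_action(("remove", "/doc/note"))            # made-up action
formatter.end()
```

<div class="para">Splitting the work between add_action and end lets a formatter see the whole edit script before producing any output, which is why only format_action needs to be overridden in simple cases.</div>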
<div class="para">See xmldiff.py for a usage example.</div>

</td>
</tr></tbody></table>
<div class="footer">All rights reserved, Logilab S.A. - 10, Rue Louis Vicat - F-75015 PARIS.</div>
</body>
</html>