File: package.xml

<?xml version="1.0" encoding="ISO-8859-1" ?>
<!DOCTYPE package SYSTEM "http://pear.php.net/dtd/package-1.0">
<package version="1.0">
  <name>XML_HTMLSax3</name>
  <summary>A SAX parser for HTML and other badly formed XML documents</summary>
  <description>XML_HTMLSax3 is a SAX-based XML parser for badly formed XML documents, such as HTML.
  The original code base was developed by Alexander Zhukov and published at http://sourceforge.net/projects/phpshelve/. Alexander kindly gave permission to modify the code and license for inclusion in PEAR.

  PEAR::XML_HTMLSax3 provides an API very similar to the native PHP XML extension (http://www.php.net/xml), so handlers written for one can easily be adapted to the other. The key difference is that HTMLSax will not break on badly formed XML, which allows it to be used for parsing HTML documents. Otherwise, HTMLSax supports all the handlers available from Expat except the namespace and external entity handlers. It also provides methods for handling XML escapes as well as JSP/ASP opening and closing tags.

  Version 1.x introduced an API similar to the native SAX extension but used a slow character-by-character approach to parsing.

  Version 2.x has had its internals completely overhauled to use a lexer, delivering performance *approaching* that of the native XML extension, as well as a radically improved, modular design that makes adding further functionality easy.

  Version 3.x is about fine-tuning the API and behaviour, and providing a mechanism to distinguish HTML "quirks" from badly formed HTML (the latter functionality is not yet implemented).

  A big thanks to Jeff Moore (lead developer of WACT: http://wact.sourceforge.net), who is largely responsible for the new design, and to other members of Sitepoint's Advanced PHP forums for their input: http://www.sitepointforums.com/showthread.php?threadid=121246.

  Thanks also to Marcus Baker (lead developer of SimpleTest: http://www.lastcraft.com/simple_test.php) for sorting out the unit tests.</description>
  <maintainers>
    <maintainer>
      <user>hfuecks</user>
      <name>Harry Fuecks</name>
      <email>hfuecks@phppatterns.com</email>
      <role>lead</role>
    </maintainer>
  </maintainers>
  <release>
    <version>3.0.0RC1</version>
    <date>2004-06-02</date>
    <license>PHP</license>
    <state>beta</state>
    <notes>* To follow PEAR version naming rules, you now include XML/HTMLSax3.php, and the main class is called XML_HTMLSax3
* Now able to parse Word-generated HTML - fixed a bug in the parsing of XML escape sequences
* API break (minor): no longer extends PEAR
* API break (minor): attributes with no value (like option selected) are now populated with NULL instead of TRUE
* API break (minor): replaced XML_OPTION_FULL_ESCAPES with XML_OPTION_STRIP_ESCAPES - by default you now get back the complete escape sequence
* Added some more examples
</notes>
    <deps>
      <dep type="php" rel="ge" version="4.0.5"/>
      <dep type="ext" rel="has" optional="yes">pcre</dep>
    </deps>
    <filelist>
      <file role="php" baseinstalldir="XML" name="HTMLSax3.php"/>
      <file role="php" baseinstalldir="XML" name="HTMLSax3/States.php"/>
      <file role="php" baseinstalldir="XML" name="HTMLSax3/Decorators.php"/>
      <file role="doc" baseinstalldir="XML" name="docs/Readme"/>
      <file role="doc" baseinstalldir="XML" name="docs/examples/SimpleExample.php"/>
      <file role="doc" baseinstalldir="XML" name="docs/examples/HTMLtoXHTML.php"/>
      <file role="doc" baseinstalldir="XML" name="docs/examples/ExpatvsHtmlSax.php"/>
      <file role="doc" baseinstalldir="XML" name="docs/examples/example.html"/>
      <file role="doc" baseinstalldir="XML" name="docs/examples/WordDoc.php"/>
      <file role="doc" baseinstalldir="XML" name="docs/examples/worddoc.htm"/>
      <file role="doc" baseinstalldir="XML" name="docs/examples/SimpleTemplate.php"/>
      <file role="doc" baseinstalldir="XML" name="docs/examples/simpletemplate.tpl"/>
      <file role="test" baseinstalldir="XML" name="tests/index.php"/>
      <file role="test" baseinstalldir="XML" name="tests/unit_tests.php"/>
      <file role="test" baseinstalldir="XML" name="tests/xml_htmlsax_test.php"/>
    </filelist>
  </release>
  <changelog>
    <release>
      <version>2.1.2</version>
      <date>2003-12-05</date>
      <license>PHP</license>
      <state>stable</state>
      <notes>* Bug fixed (thanks Jeff) where badly formed attributes resulted in an infinite loop
* Added an additional boolean argument to open and close handler calls to spot empty tags like br/ - should not break existing APIs
* Added XML_OPTION_FULL_ESCAPES which (when = 1) passes through the complete content in an XML escape, allowing comment / cdata reconstruction</notes>
    </release>
    <release>
      <version>2.1.1</version>
      <date>2003-10-08</date>
      <license>PHP</license>
      <state>stable</state>
      <notes>* Reporting of the byte index with get_current_position() is now more accurate on opening tags (thanks to Alexander Orlov at x-code.com)
* All parser options are now available to PHP versions lt 4.3.x, using an implementation of html_entity_decode written in PHP

</notes>
    </release>
    <release>
      <version>2.1.0</version>
      <date>2003-09-10</date>
      <license>PHP</license>
      <state>stable</state>
      <notes>* Well (unit) tested with SimpleTest

</notes>
    </release>
    <release>
      <version>2.0.2</version>
      <date>2003-08-11</date>
      <license>PHP</license>
      <state>alpha</state>
      <notes>* API is backwards compatible apart from the renaming of parser options
* Performance dramatically increased. Not much slower than Expat
* Better handling of XML comments and CDATA
* Option to trigger additional data handler calls for linefeeds and tabs
* Option to trigger additional data handler calls for XML entities and parse them if required.
* Added public get_current_position() and get_length() methods

</notes>
    </release>
    <release>
      <version>1.1</version>
      <date>2003-06-26</date>
      <license>PHP</license>
      <state>stable</state>
      <notes>* Bug fixes to Attribute_Parser to cope with newline, tag, forward slash and whitespace issues.
</notes>
    </release>
    <release>
      <version>1.0</version>
      <date>2003-06-08</date>
      <state>stable</state>
      <notes>* Modifications to file structure to place Attributes_Parser.php
  and State_Machine.php in subdirectory HTMLSax
* XML_HTMLSax.php includes Attributes_Parser.php and State_Machine.php
  using require_once()

</notes>
    </release>
    <release>
      <version>0.9.0rc2</version>
      <date>2003-05-18</date>
      <state>beta</state>
      <notes>* First release under PEAR
* Changed package name to XML_HTMLSax
* Added patch from John Luxford to parse single quoted attributes
* Modified State_Machine to be a simple variable store



</notes>
    </release>
    <release>
      <version>0.9.0rc1</version>
      <date>2003-05-09</date>
      <state>beta</state>
      <notes>A summary of the main differences between this version
      of HTML_Sax and HTMLSax2002082201:
      * Instead of extending HTMLSax with your own &quot;handlers&quot; class,
        you now use the set_object() method to pass an instance of the
        class to HTMLSax.
      * Class method callbacks are specified using the following methods:
        - set_element_handler('startHandler','endHandler') for &lt;tag&gt; and &lt;/tag&gt;
        - set_data_handler('dataHandler') for the contents of an element
        - set_pi_handler('piHandler') for &lt;?php ?&gt;, &lt;?xml ?&gt; etc.
        - set_escape_handler() for anything beginning with &lt;!
        - set_jasp_handler() for &lt;% %&gt; tags
      * Attributes with no value are created and set to true
      * Comments are handled and may contain entities: &lt; &gt;
      * The callback handlers will all be passed an instance of HTMLSax,
        in the same way as with the native PHP XML Expat extension
      * Setting of parser options is handled specifically by the set_option()
        method. Available options are:
        - skipWhiteSpace: instructs the parser to ignore whitespace characters
        - trimDataNodes: trims whitespace inside character data
        - breakOnNewLine: newline characters found in character data are treated
          as new events triggering another data callback
        - caseFolding: converts element names to uppercase

</notes>
    </release>
  </changelog>
</package>