==========
html5lib
==========
html5lib is a pure-Python library for parsing HTML. It is designed to
conform to the HTML 5 specification, which has formalized the error-handling
algorithms of popular web browsers.
Installation
============
The best way to install html5lib is with pip::

    pip install html5lib

You can also use the traditional setup script from a source checkout::

    python setup.py install

Tests
=====
You may wish to check that your installation succeeded by running the
test suite. The tests are run using the nosetests tool::

    python setup.py nosetests

Usage
=====
Simple usage follows this pattern::

    import html5lib

    # Open in binary mode so html5lib can detect the character encoding itself.
    with open("mydocument.html", "rb") as f:
        document = html5lib.parse(f)

More documentation is available in the docstrings or from
http://code.google.com/p/html5lib/wiki/UserDocumentation
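
As a slightly fuller illustration of the pattern above, the sketch below
parses a deliberately malformed string and reads values back out of the
resulting tree. It assumes a recent html5lib release in which
``html5lib.parse`` accepts the ``treebuilder`` and ``namespaceHTMLElements``
keyword arguments and returns an ``xml.etree.ElementTree`` element for the
``etree`` tree builder. The sample markup and the variable names are
illustrative only.

```python
import html5lib

# Deliberately malformed input: no <html>, <head> or <body> tags,
# and neither <p> element is closed.
markup = "<title>Demo</title><p>First<p>Second"

# Assumption: the installed html5lib accepts these keyword arguments.
# namespaceHTMLElements=False keeps tag names plain ("p" rather than
# "{http://www.w3.org/1999/xhtml}p"), which makes lookups shorter.
root = html5lib.parse(markup, treebuilder="etree", namespaceHTMLElements=False)

# The parser supplies the missing structure, as a browser would.
title = root.findtext("head/title")
paragraphs = [p.text for p in root.iter("p")]

print(root.tag)
print(title)
print(paragraphs)
```

The missing ``html``, ``head`` and ``body`` elements are created by the
parser, and the second ``<p>`` implicitly closes the first, so the two
paragraphs end up as siblings rather than nested.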
Bugs
====
Please report any bugs on the issue tracker:
http://code.google.com/p/html5lib/issues/list
Get Involved
============
Contributions to code or documentation are actively encouraged. Submit
patches to the issue tracker or discuss changes on IRC in the #whatwg
channel on freenode.net.