The example programs included with the HTML Parser distribution are listed below, with some details.
Note: On Unix systems, if you extracted the distribution zip file with the Java jar command or an older unzip utility, the executable flag on the files in the bin directory will not have been preserved. You can fix this by issuing the following command:
chmod u+x bin/*
Program | Description | Class
Parser | Parse a web page and print the tags in a simple loop. | org.htmlparser.Parser.main(String[] args)
Lexer | Print the low level nodes of a web page. | org.htmlparser.lexer.Lexer
Filter Builder | Interactively generate source code to extract web site contents. | org.htmlparser.parserapplications.filterbuilder.FilterBuilder
Link Extractor | Extract links/mail addresses from a web page. | org.htmlparser.parserapplications.LinkExtractor
String Extractor | Extract text from a web page. | org.htmlparser.parserapplications.StringExtractor
Site Capturer | Save a web site locally. | org.htmlparser.parserapplications.SiteCapturer
Thumbelina | View images behind thumbnails. | org.htmlparser.lexerapplications.thumbelina.Thumbelina
BeanyBaby | Parser Java Bean demo. | org.htmlparser.beans.BeanyBaby
Translate | Codec between numeric character references, character entity references and Unicode. | org.htmlparser.util.Translate
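To make the last entry more concrete, here is a minimal sketch of what decoding numeric character references and character entity references involves. It uses only the Java standard library and is not the code of org.htmlparser.util.Translate. The class name CharRefDecoder, its decode method and the five-entry entity table are assumptions made for illustration; the real set of HTML entities is much larger.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of HTML Parser: decodes a few character references. */
final class CharRefDecoder {
    // Matches &#123; (decimal), &#x7B; (hexadecimal) or &name; (named) references.
    private static final Pattern REF =
        Pattern.compile("&(?:#([0-9]+)|#[xX]([0-9a-fA-F]+)|([A-Za-z][A-Za-z0-9]*));");

    // Deliberately tiny table (an assumption for this sketch, not the full entity set).
    private static final Map<String, String> NAMED = Map.of(
        "amp", "&", "lt", "<", "gt", ">", "quot", "\"", "nbsp", "\u00A0");

    /** Returns text with recognized references replaced; unknown ones are left as-is. */
    static String decode(String text) {
        Matcher m = REF.matcher(text);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String replacement = m.group(); // default: keep the reference untouched
            try {
                if (m.group(1) != null) {
                    replacement = fromCodePoint(Integer.parseInt(m.group(1)), replacement);
                } else if (m.group(2) != null) {
                    replacement = fromCodePoint(Integer.parseInt(m.group(2), 16), replacement);
                } else {
                    replacement = NAMED.getOrDefault(m.group(3), replacement);
                }
            } catch (NumberFormatException e) {
                // Number does not fit in an int: keep the original text.
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Converts a code point to a string, or returns the fallback if it is out of range.
    private static String fromCodePoint(int cp, String fallback) {
        return Character.isValidCodePoint(cp) ? new String(Character.toChars(cp)) : fallback;
    }

    public static void main(String[] args) {
        System.out.println(decode("Fish &amp; chips &#169; &#x263A;"));
    }
}
```

References the sketch does not recognize are passed through unchanged rather than dropped, so text that merely contains an ampersand is not damaged. For real use, the Translate class shipped with the distribution is the better choice.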