File: README

libhtml-parser-perl 3.69-2

Most of these scripts take a single file argument; if that file contains some
HTML, you should get some output. The 'h*sub' scripts take two arguments: the
first is a Perl expression and the second is an HTML file. In any case, every
script has an explanatory comment at the top.
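
As a rough sketch of the two-argument form, the commands below run the
'h*sub' scripts against a small throwaway file. The sample HTML, the
foo.example host and both Perl expressions are made up for illustration. It
is also an assumption here that the expression sees each piece of text or
each link in $_; check the comment at the top of each script before relying
on it.

```shell
# Create a tiny HTML sample in a temporary file (contents are illustrative).
sample=$(mktemp)
printf '%s\n' '<p>See <a href="http://foo.example/index.html">foo</a></p>' > "$sample"

# First argument: a Perl expression; second argument: the HTML file.
# Assumption: the expression operates on $_ (see each script's own comment).
if [ -x ./hrefsub ]; then
    ./hrefsub 's/foo/bar/g' "$sample"
fi
if [ -x ./htextsub ]; then
    ./htextsub '$_ = uc' "$sample"
fi
```

The single quotes stop the shell from expanding $_ before Perl sees it.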

For example, try running:

lynx -dump -source -raw http://www.debian.org > /tmp/a.txt
./hanchors /tmp/a.txt

Of course, if http://www.debian.org is not your favourite web site, you can
make the appropriate substitution.
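
If lynx or a network connection is not to hand, a local file works just as
well. The following is only a sketch: the sample HTML is invented for
illustration, and the loop assumes the scripts are executable in the current
directory, skipping any that are not.

```shell
# Write a small HTML document to a temporary file (contents are illustrative).
sample=$(mktemp)
cat > "$sample" <<'EOF'
<html>
<head><title>Sample page</title></head>
<body>
<p>Visit <a href="http://www.debian.org/">Debian</a>.</p>
</body>
</html>
EOF

# Run a few of the single-argument scripts on the sample, if they are present.
for script in hanchors htitle htext; do
    if [ -x "./$script" ]; then
        echo "== $script =="
        "./$script" "$sample"
    fi
done
```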

hanchors        - List all anchors in the HTML
hlc             - Correct any upper case tags to lower case
hstrip          - Remove deprecated scripting and styling tags and attributes
htextsub        - Apply arbitrary perl expression to all text within HTML
hrefsub         - Apply arbitrary perl expression to all hrefs within HTML
htitle          - Print title of the HTML document
hdump           - Output event information whilst parsing HTML document
hform           - Print analysis of form controls present in HTML
htext           - Print all the text from the HTML