File: README

libhtml-parser-perl 3.69-2

Most of these scripts take a file containing HTML as their argument and
print some output. The 'h*sub' scripts (htextsub and hrefsub) take two
arguments: the first is a Perl expression and the second is an HTML file.
In any case, every file has an explanatory comment.

For example try running:

lynx -dump -source -raw http://www.debian.org > /tmp/a.txt
./hanchors /tmp/a.txt

Of course, if http://www.debian.org is not your favourite web site, you can
substitute any other URL.
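
If lynx or a network connection is not available, a small hand-written file
should work just as well. The following is only a sketch: the file name
/tmp/sample.html and its contents are made up for illustration, and it assumes
the scripts are executable in the current directory, as in the commands above.

```shell
# Hypothetical sample file; any HTML document will do.
sample=/tmp/sample.html

cat > "$sample" <<'EOF'
<HTML>
<HEAD><TITLE>Sample page</TITLE></HEAD>
<BODY>
<P>Visit <A HREF="http://www.debian.org/">Debian</A>.</P>
</BODY>
</HTML>
EOF

# Run a few of the single-argument scripts, skipping any that are
# not present and executable in the current directory.
for script in ./hanchors ./htitle ./htext; do
    if [ -x "$script" ]; then
        "$script" "$sample"
    fi
done
```

The upper-case tags in the sample are deliberate, so that the same file is
also a reasonable input for hlc.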

hanchors        - List all anchors in the HTML
hlc             - Correct any upper case tags to lower case
hstrip          - Remove deprecated scripting and styling tags and attributes
htextsub        - Apply arbitrary perl expression to all text within HTML
hrefsub         - Apply arbitrary perl expression to all hrefs within HTML
htitle          - Print title of the HTML document
hdump           - Output event information whilst parsing HTML document
hform           - Print analysis of form controls present in HTML
htext           - Print all the text from the HTML
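
As a sketch of how the two-argument 'h*sub' scripts might be invoked, the
commands below pass the Perl expression first and the HTML file second, as
described above. That the expression edits $_ (in the manner of perl -pe) is
an assumption here, as are the sample file and both substitutions; check the
comment at the top of each script for the exact behaviour.

```shell
# Hypothetical input file, made up for illustration.
page=/tmp/hsub-sample.html
printf '%s\n' '<p>See <a href="http://www.debian.org/">Debian</a></p>' > "$page"

# Assumed: the Perl expression operates on $_.
# Single quotes stop the shell from expanding anything inside it.
text_expr='s/Debian/DEBIAN/g'
href_expr='s/^http:/https:/'

# First argument: Perl expression. Second argument: HTML file.
if [ -x ./htextsub ]; then
    ./htextsub "$text_expr" "$page"
fi
if [ -x ./hrefsub ]; then
    ./hrefsub "$href_expr" "$page"
fi
```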