html2text
THE ASCIINATOR
html2text is a Python script that converts a page of HTML into clean,
easy-to-read plain ASCII text. Better yet, that ASCII also happens to
be valid [1]Markdown (a text-to-HTML format).
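
To make the idea concrete, here is a minimal sketch of HTML-to-Markdown conversion built on Python's standard html.parser module. It is a toy written for illustration, not html2text's own code: the names TinyMarkdown and to_markdown are hypothetical, and it handles only headings, paragraphs, emphasis, and links, with link targets collected into a numbered reference list.

```python
import re
from html.parser import HTMLParser


class TinyMarkdown(HTMLParser):
    """Toy HTML-to-Markdown converter (hypothetical; not html2text's code)."""

    def __init__(self):
        super().__init__()
        self.out = []    # pieces of output text, joined at the end
        self.links = []  # link targets, referenced by 1-based position

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            # <h2> becomes "## ", and so on.
            self.out.append("\n\n" + "#" * int(tag[1]) + " ")
        elif tag == "p":
            self.out.append("\n\n")
        elif tag in ("em", "i"):
            self.out.append("_")
        elif tag in ("strong", "b"):
            self.out.append("**")
        elif tag == "a":
            self.links.append(dict(attrs).get("href") or "")
            self.out.append("[")

    def handle_endtag(self, tag):
        if tag in ("em", "i"):
            self.out.append("_")
        elif tag in ("strong", "b"):
            self.out.append("**")
        elif tag == "a":
            # Refer to the most recently opened link (no nesting support).
            self.out.append("][%d]" % len(self.links))

    def handle_data(self, data):
        # Collapse runs of whitespace, roughly as a browser would.
        self.out.append(re.sub(r"\s+", " ", data))


def to_markdown(html):
    """Return Markdown-style text followed by a numbered link list."""
    parser = TinyMarkdown()
    parser.feed(html)
    parser.close()
    body = "".join(parser.out).strip()
    refs = ["   [%d]: %s" % (n, url) for n, url in enumerate(parser.links, 1)]
    return body + ("\n\n" + "\n".join(refs) if refs else "")


print(to_markdown('<h1>Title</h1><p>Read <a href="http://example.com/">this</a> <em>now</em>.</p>'))
```

The real script covers far more (lists, tables, text wrapping, character references, and so on); the sketch only shows the general shape of the conversion.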
Also known as: html to text, htm to txt, htm2txt, ...
Try
Enter the address of the web page you'd like to convert.
URL: ____________________ Convert
Example sites: [2]aaronsw.com, [3]daringfireball.net.
Bookmarklet: [4]2text
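
The example links and the bookmarklet all build the same kind of address: the page to convert is appended to the service URL after "?url=". A small sketch of that construction follows; the function name conversion_url is hypothetical, and the code only builds the string without contacting the service.

```python
# Base address of the conversion page, as used by the example links.
SERVICE = "http://www.aaronsw.com/2002/html2text/"


def conversion_url(page_url):
    """Build a conversion link the same way the bookmarklet does.

    The page address is appended unchanged, mirroring the bookmarklet.
    Addresses that carry their own query string might need
    percent-encoding; that is an assumption, not something the page states.
    """
    return SERVICE + "?url=" + page_url


print(conversion_url("http://daringfireball.net/"))
```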
Buy
html2text is available under the GNU GPL 3.0.
Download the latest: [5]html2text.py
History
2010-02-03: [6]2.38. package properly (tx Michael Jenny, Vincent
Fretin)
2009-09-14: [7]2.37. don't use stdout by default (tx Greg Brown)
warning: may not be backwards-compatible in some odd use cases
2009-08-10: [8]2.36. relative url resolution (tx Kevin North)
2008-11-20: [9]2.35. undo last change (tx Sumit Rangwala)
2008-10-09: [10]2.34. elim extra \ns (tx Keith Bussell)
2008-09-19: [11]2.33. add support for abbr (tx Nathan Youngman)
2008-07-31: [12]2.32. fix parsing bug with fastcompany (tx Elias Soong)
2008-07-23: [13]2.31. fix unicode support (tx John Chapman)
2008-05-26: [14]2.3. prelim JS support, various fixes, improved
performance (tx Johannes Fitz)
2008-05-13: [15]2.292. add SKIP_INTERNAL_LINKS (tx Christian Siefkes)
2008-04-25: [16]2.291. add shbang, fix wrapping (tx Christian Siefkes)
2007-11-01: [17]2.29. fix degenerate sites (cough 9rules) that don't
close head tags; fix crash when feedparser wasn't available (tx Johann
Burkard)
2007-04-12: [18]2.28. fix tables (tx Pete Savage)
2007-04-09: [19]2.27. fix line breaks (tx Danny O'Brien)
2007-02-23: [20]2.26. input unicode better (tx John Cavanaugh for the
push)
2006-10-13: [21]2.25. output unicode better (tx s s)
2006-02-22: [22]2.24. preliminary support for dt/dd
????-??-??: [23]2.23. fix for python2.1
2004-08-27: [24]2.21. old bug with extra closing list tags (tx
Jonathan)
2004-08-26: [25]2.2. text wrapping (tx++ Joey Schulze!), suppress dupe
links (tx Ricardo Reyes), python2.1 support.
2004-08-23: [26]2.12. added hr (tx merlin)
2004-06-30: [27]2.11. python2.1 codec support.
2004-06-27: [28]2.1. better module, unicode support. expand ndash.
2004-03-27: [29]2.01a. fix bug w/ charrefs in links. tx Ian G.
2004-03-19: [30]2.0a. complete rewrite, supports Markdown
2003-03-16: [31]1.0. port to Python
2000-06-19: html2text.tcl (with Lars Pind)
[32]Aaron Swartz ([33]me@aaronsw.com)
References
1. http://daringfireball.net/projects/markdown/
2. http://www.aaronsw.com/2002/html2text/?url=http://www.aaronsw.com/
3. http://www.aaronsw.com/2002/html2text/?url=http://daringfireball.net/
4. javascript:location.href='http://www.aaronsw.com/2002/html2text/?url='+document.location.href;
5. http://www.aaronsw.com/2002/html2text/html2text.py
6. http://www.aaronsw.com/2002/html2text/html2text-2.38.py
7. http://www.aaronsw.com/2002/html2text/html2text-2.37.py
8. http://www.aaronsw.com/2002/html2text/html2text-2.36.py
9. http://www.aaronsw.com/2002/html2text/html2text-2.35.py
10. http://www.aaronsw.com/2002/html2text/html2text-2.34.py
11. http://www.aaronsw.com/2002/html2text/html2text-2.33.py
12. http://www.aaronsw.com/2002/html2text/html2text-2.32.py
13. http://www.aaronsw.com/2002/html2text/html2text-2.31.py
14. http://www.aaronsw.com/2002/html2text/html2text-2.3.py
15. http://www.aaronsw.com/2002/html2text/html2text-2.292.py
16. http://www.aaronsw.com/2002/html2text/html2text-2.291.py
17. http://www.aaronsw.com/2002/html2text/html2text-2.29.py
18. http://www.aaronsw.com/2002/html2text/html2text-2.28.py
19. http://www.aaronsw.com/2002/html2text/html2text-2.27.py
20. http://www.aaronsw.com/2002/html2text/html2text-2.26.py
21. http://www.aaronsw.com/2002/html2text/html2text-2.25.py
22. http://www.aaronsw.com/2002/html2text/html2text-2.24.py
23. http://www.aaronsw.com/2002/html2text/html2text-2.23.py
24. http://www.aaronsw.com/2002/html2text/html2text-2.21.py
25. http://www.aaronsw.com/2002/html2text/html2text-2.2.py
26. http://www.aaronsw.com/2002/html2text/html2text-2.12.py
27. http://www.aaronsw.com/2002/html2text/html2text-2.11.py
28. http://www.aaronsw.com/2002/html2text/html2text-2.1.py
29. http://www.aaronsw.com/2002/html2text/html2text-2.01a.py
30. http://www.aaronsw.com/2002/html2text/html2text-2.0a.py
31. http://www.aaronsw.com/2002/html2text/html2text-1.0.py
32. http://www.aaronsw.com/
33. mailto:me@aaronsw.com