### CREDITS ##########################################################################################
# Copyright (c) 2008 Tom De Smedt.
# See LICENSE.txt for details.
__author__ = "Tom De Smedt"
__version__ = "1.9.4.6"
__copyright__ = "Copyright (c) 2008 Tom De Smedt"
__license__ = "GPL"
### NODEBOX WEB LIBRARY #############################################################################
# The NodeBox Web library offers a collection of services to retrieve content from the internet.
# You can use the library to query Yahoo! for links, images, news and spelling suggestions,
# to read RSS and Atom newsfeeds, to retrieve articles from Wikipedia, to collect quality images
# from morgueFile, to get color themes from kuler, to browse through HTML documents, to clean up HTML,
# to validate URLs, to create GIF images from math equations using mimeTeX, and to get ironic word
# definitions from Urban Dictionary.
# The NodeBox Web library works with a caching mechanism that stores things you download from the web,
# so they can be retrieved faster the next time. Many of the services also work asynchronously.
# This means you can use the library in an animation that keeps on running while new content is downloaded
# in the background.
# The library bundles Leonard Richardson's BeautifulSoup to parse HTML,
# Mark Pilgrim's Universal Feed Parser for newsfeeds, a connection to John Forkosh's mimeTeX server,
# Leif K-Brooks' entity replace algorithm, and Bob Ippolito's simplejson.
# Thanks to Serafeim Zanikolas for maintaining Debian compatibility, and to Stuart Axon for various patches.
######################################################################################################
import os
import cache
import url
import html
import page
import simplejson
packages = [
    "yahoo", "google",
    "newsfeed",
    "wikipedia",
    "morguefile", "flickr",
    "kuler", "colr",
    "mimetex", # deprecated
    "mathtex",
    "urbandictionary",
]
for p in packages:
    try: exec("import %s" % p)
    except ImportError:
        pass
def set_proxy(host, type="https"):
    url.set_proxy(host, type)

set_proxy(None)
def is_url(url_, wait=10):
    return url.is_url(url_, wait)

def download(url_, wait=60, cache=None, type=".html"):
    return url.retrieve(url_, wait, False, cache, type).data
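def save(url_, path="", wait=60):
    # Accept objects that carry their location in a .url attribute.
    if hasattr(url_, "url"):
        url_ = url_.url
    # If the path does not appear to end in a 3- or 4-character file extension,
    # treat it as a folder and append the filename parsed from the URL.
    if len(path) < 5 or "." not in path[-5:-3]:
        file = url.parse(str(url_)).filename
        path = os.path.join(path, file)
    # Write in binary mode so that images are not corrupted on Windows.
    f = open(path, "wb")
    f.write(download(url_, wait))
    f.close()
    return path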
def clear_cache():
    page.clear_cache()
    for p in packages:
        try: exec("%s.clear_cache()" % p)
        except NameError:
            pass
# 1.9.4.6
# cache.py uses hashlib instead of md5 on Python 2.6+
# On Windows, cached files are stored under Documents and Settings\UserName\.nodebox-web-cache.
# Cache files are stored in binary mode to avoid newline issues.
# Fixed support for Morguefile.
# 1.9.4.5
# cache.py closes files after reading and writing.
# This is necessary in Jython.
# 1.9.4.4
# mathTeX deprecates mimeTeX.
# 1.9.4.3
# Flickr accepts Unicode queries.
# 1.9.4.1
# Added set_proxy() command.
# Added Serafeim Zanikolas' patches & examples for Debian.
# Added Serafeim Zanikolas' html=False attribute to WikipediaPage.
# 1.9.4
# Added simplejson for improved unicode support.
# Added google.py module.
# Improvements to html.py.
# Morguefile images can be filtered by size.
# Flickr images can be filtered by size.
# Flickr images can be filtered by interestingness/relevance/date/tags.
# Fixed Flickr unicode bug.
# Wikipedia unicode improvements.
# url.URLAccumulator._done() will only load data if no URLError was raised.
# url.parse() has a new .filename attribute (equals .page).
# Handy web.save() command downloads data and saves it to a given path.
# hex_to_rgb() improvement for hex strings shorter than 6 characters.
# Upgraded to BeautifulSoup 3.0.7a