What's IMDbPY?
==============
NOTE: see also the recommendations in the "DISCLAIMER.txt" file.
IMDbPY is a Python package useful for retrieving and managing the data
of the IMDb movie database.
IMDbPY is mainly a tool intended for programmers and developers, but
some example scripts are included.
If you're a poor, simple, clueless user, read the "README.users" file. :-)
Seriously: take a look at the provided example scripts even if you're
a Really Mighty Programmer(tm); they should clearly show how to use IMDbPY.
Other IMDbPY-based programs can be downloaded from:
http://imdbpy.sourceforge.net/?page=programs
If you want to develop a program/script/package/framework using the
IMDbPY package, see the "README.package" file, for instructions about
how to use this package.
If you're installing IMDbPY on a smart phone, PDA or hand-held system,
read the "README.mobile" file.
If you're crazy enough and/or you've realized that your higher
inspiration in life is to help the development of IMDbPY, begin reading
the "README.devel" file. ;-)
INSTALLATION
============
All you need to do is run, as the root user, the command:
    # python setup.py install
IMDbPY can also be installed through easy_install or pip,
with, respectively, these commands (as root):
    easy_install IMDbPY
    pip install IMDbPY
With easy_install and pip, the dependencies are satisfied
automatically. Third-party packages may be downloaded and, unless
otherwise specified (see below), C extensions are compiled (this means
that you need the python-dev package installed).
If, for some reason, it doesn't work, you can copy the "./imdb"
directory into the local site-packages directory of the python
major version you're using, but remember that this will neither
satisfy the required dependencies nor compile the optional C module,
so use this as your very last resort.
To find out which major version of python you have installed, run:
    $ python -V
It should print a string like "Python 2.6.1"; in this example
the major version is "2.6".
Now copy the "./imdb" directory:
    # cp -r ./imdb /usr/local/lib/python{MAJORVERSION}/site-packages/
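The major version and the site-packages directory can also be read
from within Python itself. The following is only a minimal sketch,
assuming Python 2.7 or later (the sysconfig module is not available
in older releases); the variable names are illustrative, and the
directory it reports may differ from the /usr/local path shown above,
depending on how python was installed.

```python
import sys
import sysconfig

# "2.6" for Python 2.6.1, "2.7" for Python 2.7.18, and so on.
major_version = "%d.%d" % sys.version_info[:2]

# Directory where pure-Python packages are installed for this interpreter.
site_packages = sysconfig.get_paths()["purelib"]

print("major version: " + major_version)
print("site-packages: " + site_packages)
```

If you go for the manual copy, the directory printed here is a
reasonable candidate destination for the "./imdb" directory.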
The setup.py script contains some configuration options that could
be useful if you're installing IMDbPY on a system with very
little hard disk space (like a handheld device) or where
you don't have a complete development environment available;
read the "README.mobile" file.
If you want to insert the content of the plain text data files
into a SQL database, read the "README.sqldb" file.
The whole list of command line options of the setup.py script is:
    --without-lxml        exclude lxml (lxml speeds up "http" considerably,
                          so try to fix any problem with it before
                          excluding it).
    --without-cutils      don't compile the C module (the C module
                          speeds up 'sql').
    --without-sql         no access to SQL databases.
If you're installing 'sql', setup.py tries to install BOTH SQLObject
and SQLAlchemy. In fact, having one of them is enough.
You can exclude the unwanted one with:
    --without-sqlobject   exclude SQLObject
    --without-sqlalchemy  exclude SQLAlchemy
If you specify both, --without-sql is implied.
Mercurial VERSION
=================
The best thing is always to use a package for your distribution,
or use easy_install or pip to install the latest release, but it
goes without saying that sometimes you need the very latest version
(keep in mind that the IMDb site is a moving target...).
In this case, you can always use the Mercurial version, available here:
http://imdbpy.sourceforge.net/?page=download#hg
HELP
====
Refer to the web site http://imdbpy.sf.net/ and subscribe to the
mailing list: http://imdbpy.sf.net/?page=help#ml
NOTES FOR PACKAGERS
===================
If you plan to package IMDbPY for your distribution/operating system,
keep in mind that, while IMDbPY can work out-of-the-box, some external
packages may be required for certain functionality:
- python-lxml: the 'http' data access system will be much faster, if
  it's installed.
- SQLObject or SQLAlchemy: one of these is REQUIRED if you want to use
  the 'sql' data access system.
All of them should probably be "recommended" (or at least "suggested")
dependencies.
To compile the C module, you also need the python-dev package.
As of IMDbPY 4.0, the installer is based on setuptools.
RECENT IMPORTANT CHANGES
========================
Since release 2.4, IMDbPY internally manages all information about
movies and people using unicode strings. Please read the README.utf8 file.
Since release 3.3, IMDbPY supports IMDb's character pages; see the
README.currentRole file for more information.
Since release 3.6, IMDbPY supports IMDb's company pages; see the
README.companies file for more information.
Since release 3.7, IMDbPY has moved its main parsers from a SAX-based
approach to a DOM/XPath-based one; see the README.newparsers file
for more information.
Since release 3.8, IMDbPY supports both SQLObject and SQLAlchemy; see
README.sqldb for more information.
Since release 3.9, IMDbPY supports dumping the plain text data files to
CSV files; see README.sqldb for more information.
Since release 4.0 it's possible to search for keywords (get keywords
similar to a given one and get a list of movies for a specified keyword).
See README.keywords for more information.
Moreover, it's possible to get information out of Movie, Person, Character
and Company instances as XML (getting a single key or the representation
of a whole object).
See README.info2xml for more information.
Another new feature is the ability to get top250 and bottom100 lists;
see the "TOP250 / BOTTOM100 LISTS" section of the README.package file
for more information.
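As a rough illustration of the top250 feature, the sketch below
assumes that the object returned by imdb.IMDb() offers a
get_top250_movies() method returning a list of Movie objects, each
exposing a 'title' key; README.package remains the authoritative
description. The helper name top_titles is hypothetical and not part
of IMDbPY.

```python
def top_titles(ia, count=10):
    """Return the titles of the first `count` entries of the top 250 list.

    `ia` is assumed to be the object returned by imdb.IMDb().
    """
    top250 = ia.get_top250_movies()   # assumed: a list of Movie objects
    return [movie['title'] for movie in top250[:count]]
```

With IMDbPY installed and network access available, it could be
called as top_titles(IMDb(), 5).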
Since release 4.1 a DTD for the XML output is available (see
imdbpyXY.dtd). Other important features are locale (i18n) support (see
README.locale) and support for the new style of movie titles used by IMDb
(now in the "The Title" style, and no longer as "Title, The").
FEATURES
========
So far you can search for a movie with a given title, a person
with a given name, a character you've seen in a movie or a company, and
retrieve information for a given movie, person, character or company.
The supported data access systems are 'http' (i.e.: the data are fetched
from IMDb's web server, http://akas.imdb.com) and 'sql', meaning that the
data are taken from a SQL database, populated (using the imdbpy2sql.py
script) with data taken from the plain text data files; see
http://www.imdb.com/interfaces/ for more information.
For mobile systems there's the 'mobile' data access system, useful
for PDA, hand-held devices and smart phones.
Another data access system is 'httpThin', which is equivalent to 'http'
but fetches less data, and so it is (or at least it tries to be)
suitable for systems with limited bandwidth but normal CPU power.
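The search-then-retrieve flow described in this section can be
sketched as follows. This is only a minimal example, not a definitive
recipe: the helper name first_match_title is hypothetical, and it
assumes that the object returned by imdb.IMDb() provides
search_movie() and update(), and that Movie objects expose a 'title'
key, as described in README.package.

```python
def first_match_title(ia, title):
    """Return the title of the first movie matching `title`, or None.

    `ia` is assumed to be the object returned by imdb.IMDb().
    """
    results = ia.search_movie(title)   # assumed: a list of Movie objects
    if not results:
        return None
    movie = results[0]
    ia.update(movie)                   # retrieve more information on the movie
    return movie['title']
```

With IMDbPY installed and network access available, it could be
called as first_match_title(IMDb(), 'The Matrix'). Passing
accessSystem='mobile' or accessSystem='httpThin' to IMDb() should
select the lighter data access systems mentioned above.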
FEATURES OF THE HTTP DATA ACCESS SYSTEM
=======================================
* Returns almost all the available information about a movie, person or
  character.
* The use of the "akas" server provides access to a lot of
  AKA titles in many languages, so it's really useful if English is
  not your native language.
* By default includes adult titles (and people who have worked
  only/mostly in adult movies) in the results of a title/name search; this
  behavior can be changed with the do_adult_search() method; please
  read the "README.adult" file.
* You can set/use a proxy to access the web; if set, the HTTP_PROXY
  environment variable will be automatically used, otherwise you can set a
  proxy with the set_proxy() method of the class returned by the
  imdb.IMDb function; obviously this method is available only for the http
  data access system, since it's defined in the IMDbHTTPAccessSystem class
  of the parser.http package.
  Example:
    from imdb import IMDb
    i = IMDb(accessSystem='http') # the accessSystem argument is not really
                                  # needed, since "http" is the default.
    i.set_proxy('http://localhost:8080/')
  You can force a direct connection to the net by setting the proxy
  to a null value (i.e.: i.set_proxy('')).
FEATURES OF THE SQL DATA ACCESS SYSTEM
======================================
* Returns all the information available in the plain text data files.
* Every database supported by SQLObject and SQLAlchemy is available.
FEATURES OF THE MOBILE DATA ACCESS SYSTEM
=========================================
* Very lightweight, returns almost all the needed information.
* Accessory data sets (like 'goofs', 'trivia' and so on) are always
  available (since it is a subclass of the 'http' data access system).