.. _api:
Developer Interface
===================
.. module:: internetarchive
Configuration
-------------
Certain functions of the internetarchive library require your archive.org credentials (e.g. uploading, modifying metadata, and searching).
Credentials and other settings can be provided via a dictionary when instantiating an :class:`ArchiveSession` or :class:`Item` object, or in a config file.
The easiest way to create a config file is with the `configure <internetarchive.html#internetarchive.configure>`_ function::
>>> from internetarchive import configure
>>> configure('user@example.com', 'password')
Config files are stored in either ``$HOME/.ia`` or ``$HOME/.config/ia.ini`` by default. You can also specify your own path::
>>> from internetarchive import configure
>>> configure('user@example.com', 'password', config_file='/home/jake/.config/ia-alternate.ini')
Custom config files can be specified when instantiating an :class:`ArchiveSession` object::
>>> from internetarchive import get_session
>>> s = get_session(config_file='/home/jake/.config/ia-alternate.ini')
Or an :class:`Item` object::
>>> from internetarchive import get_item
>>> item = get_item('nasa', config_file='/home/jake/.config/ia-alternate.ini')
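To check which of the default config locations mentioned above is present on a given machine, a small stand-alone sketch may help. The helper name ``existing_default_configs`` is hypothetical and not part of the library. It only reports which files exist and makes no claim about which one the library would prefer.

```python
from pathlib import Path


def existing_default_configs(home=None):
    """Return the default config paths named in this section that exist as files.

    Hypothetical helper, not part of internetarchive.
    """
    home = Path(home) if home is not None else Path.home()
    # The two default locations listed in this section.
    candidates = [home / ".ia", home / ".config" / "ia.ini"]
    return [path for path in candidates if path.is_file()]
```

Calling ``existing_default_configs()`` with no argument inspects the current user's home directory.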
IA-S3 Configuration
~~~~~~~~~~~~~~~~~~~
Your IA-S3 keys are required for uploading and modifying metadata.
You can retrieve your IA-S3 keys at https://archive.org/account/s3.php.
They can be specified in your config file like so::
[s3]
access = mYaccEsSkEY
secret = mYs3cREtKEy
Or, using the :class:`ArchiveSession` object::
>>> from internetarchive import get_session
>>> c = {'s3': {'access': 'mYaccEsSkEY', 'secret': 'mYs3cREtKEy'}}
>>> s = get_session(config=c)
>>> s.access_key
'mYaccEsSkEY'
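To keep keys out of source code, one option is to build the ``config`` dictionary from environment variables. The sketch below assumes two hypothetical variable names, ``IA_ACCESS_KEY`` and ``IA_SECRET_KEY``, which the library does not define. Only the dictionary layout is taken from the example above.

```python
import os


def s3_config_from_env(env=None):
    """Build the {'s3': {...}} config dict from environment variables.

    IA_ACCESS_KEY and IA_SECRET_KEY are hypothetical names chosen for this sketch.
    """
    env = os.environ if env is None else env
    try:
        return {"s3": {"access": env["IA_ACCESS_KEY"], "secret": env["IA_SECRET_KEY"]}}
    except KeyError as exc:
        # Report which variable is missing instead of a bare KeyError.
        raise RuntimeError(f"missing environment variable: {exc.args[0]}") from None
```

The result can then be passed on as ``get_session(config=s3_config_from_env())``.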
Cookie Configuration
~~~~~~~~~~~~~~~~~~~~
Your archive.org logged-in cookies are required for downloading access-restricted files that you have permission to access, and for retrieving information about archive.org catalog tasks.
Your cookies can be specified like so::
[cookies]
logged-in-user = user%40example.com
logged-in-sig = <redacted>
Or, using the :class:`ArchiveSession` object::
>>> from internetarchive import get_session
>>> c = {'cookies': {'logged-in-user': 'user%40example.com', 'logged-in-sig': 'foo'}}
>>> s = get_session(config=c)
>>> s.cookies['logged-in-user']
'user%40example.com'
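The ``logged-in-user`` value in the examples above is the percent-encoded form of ``user@example.com``. The following sketch shows that relationship using only the standard library. Whether a cookie value copied from a browser needs any re-encoding depends on how it was obtained, so treat this as an illustration rather than a required step.

```python
from urllib.parse import quote, unquote

email = "user@example.com"

# Percent-encode every reserved character, including "@".
encoded = quote(email, safe="")
print(encoded)

# Decoding restores the original address.
print(unquote(encoded))
```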
Logging Configuration
~~~~~~~~~~~~~~~~~~~~~
You can specify logging levels and the location of your log file like so::
[logging]
level = INFO
file = /tmp/ia.log
Or, using the :class:`ArchiveSession` object::
>>> from internetarchive import get_session
>>> c = {'logging': {'level': 'INFO', 'file': '/tmp/ia.log'}}
>>> s = get_session(config=c)
By default, logging is turned off.
Other Configuration
~~~~~~~~~~~~~~~~~~~
By default all requests are HTTPS.
You can change this setting in your config file in the ``general`` section::
[general]
secure = False
Or, using the :class:`ArchiveSession` object::
>>> from internetarchive import get_session
>>> s = get_session(config={'general': {'secure': False}})
In the example above, all requests will be made via HTTP.
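The sections shown above can live together in one config file. As a minimal sketch, the code below writes such a file with Python's ``configparser`` and reads it back. The helper names ``write_ia_config`` and ``read_ia_config`` are hypothetical, and this is not necessarily how the library itself reads or writes config files. Interpolation is disabled because values such as ``user%40example.com`` contain ``%``.

```python
import configparser


def write_ia_config(path, config):
    """Write a nested dict ({section: {option: value}}) as an INI file.

    Hypothetical helper for illustration only.
    """
    parser = configparser.ConfigParser(interpolation=None)
    for section, options in config.items():
        # configparser stores strings, so convert values such as False.
        parser[section] = {key: str(value) for key, value in options.items()}
    with open(path, "w") as fh:
        parser.write(fh)


def read_ia_config(path):
    """Read the INI file back into a ConfigParser object."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path)
    return parser
```

With a parser returned by ``read_ia_config``, ``parser.getboolean('general', 'secure')`` yields a real boolean for the ``secure`` option.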
ArchiveSession Objects
----------------------
The ArchiveSession object is subclassed from :class:`requests.Session`.
It collects together your credentials and config.
.. autofunction:: get_session
Item Objects
------------
:class:`Item` objects represent `Internet Archive items <//archive.org/services/docs/api/items.html>`_.
From the :class:`Item` object you can create new items, upload files to existing items, read and write metadata, and download or delete files.
.. autofunction:: get_item
Uploading
~~~~~~~~~
Uploading to an item can be done using :meth:`Item.upload`::
>>> item = get_item('my_item')
>>> r = item.upload('/home/user/foo.txt')
Or :func:`internetarchive.upload`::
>>> from internetarchive import upload
>>> r = upload('my_item', '/home/user/foo.txt')
The item will automatically be created if it does not exist.
Refer to `archive.org Identifiers <//archive.org/services/docs/api/metadata-schema/index.html#archive-org-identifiers>`_ for more information on creating valid archive.org identifiers.
Setting Remote Filenames
^^^^^^^^^^^^^^^^^^^^^^^^
Remote filenames can be defined using a dictionary::
>>> from io import BytesIO
>>> fh = BytesIO()
>>> fh.write(b'foo bar')
>>> item.upload({'my-remote-filename.txt': fh})
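When several in-memory payloads need explicit remote names, the dictionary can be built programmatically. This sketch uses a hypothetical helper, ``build_upload_map``. Only the ``{remote_filename: file_object}`` shape is taken from the example above.

```python
from io import BytesIO


def build_upload_map(contents):
    """Map remote filenames to BytesIO objects positioned at the start.

    `contents` is a dict of {remote_filename: bytes}. Hypothetical helper.
    """
    return {name: BytesIO(data) for name, data in contents.items()}


files = build_upload_map({"readme.txt": b"foo bar", "data.bin": b"\x00\x01"})
```

The resulting ``files`` dictionary has the same shape as the one passed to ``item.upload`` above.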
.. autofunction:: upload
Metadata
~~~~~~~~
.. autofunction:: modify_metadata
The default target to write to is ``metadata``.
To write to another target, such as ``files``, specify it with the ``target`` parameter.
For example, to add a metadata field to a file named ``foo.txt`` in an item whose identifier is ``my_identifier``::
>>> from internetarchive import get_files, modify_metadata
>>> r = modify_metadata('my_identifier', metadata={'title': 'My File'}, target='files/foo.txt')
>>> f = list(get_files('my_identifier', 'foo.txt'))[0]
>>> f.title
'My File'
You can also create new targets if they don’t exist::
>>> r = modify_metadata('my_identifier', metadata={'foo': 'bar'}, target='extra_metadata')
>>> from internetarchive import get_item
>>> item = get_item('my_identifier')
>>> item.item_metadata['extra_metadata']
{'foo': 'bar'}
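File-level writes use a ``target`` of the form ``files/<filename>``, as in the ``files/foo.txt`` example above. A tiny sketch with a hypothetical helper, ``file_target``, keeps that string construction in one place. The helper is not part of the library.

```python
def file_target(filename):
    """Return the metadata target string for a file inside an item.

    Hypothetical helper; it only formats the 'files/<filename>' string.
    """
    if not filename:
        raise ValueError("filename must not be empty")
    return f"files/{filename}"
```

It could then be used as ``modify_metadata('my_identifier', metadata={'title': 'My File'}, target=file_target('foo.txt'))``.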
Downloading
~~~~~~~~~~~
.. autofunction:: download
Deleting
~~~~~~~~
.. autofunction:: delete
File Objects
~~~~~~~~~~~~
.. autofunction:: get_files
Searching Items
---------------
.. autofunction:: search_items
Internet Archive Tasks
----------------------
.. autofunction:: get_tasks