.. _api:

Developer Interface
===================

.. module:: internetarchive

Configuration
-------------

Certain functions of the internetarchive library require your archive.org credentials (e.g. uploading, modifying metadata, searching).
Your credentials and other configurations can be provided via a dictionary when instantiating an :class:`ArchiveSession` or :class:`Item` object, or in a config file.

The easiest way to create a config file is with the `configure <internetarchive.html#internetarchive.configure>`_ function::

    >>> from internetarchive import configure
    >>> configure('user@example.com', 'password')

Config files are stored in either ``$HOME/.ia`` or ``$HOME/.config/ia.ini`` by default. You can also specify your own path::


    >>> from internetarchive import configure
    >>> configure('user@example.com', 'password', config_file='/home/jake/.config/ia-alternate.ini')

Custom config files can be specified when instantiating an :class:`ArchiveSession` object::

    >>> from internetarchive import get_session
    >>> s = get_session(config_file='/home/jake/.config/ia-alternate.ini')

Or an :class:`Item` object::

    >>> from internetarchive import get_item
    >>> item = get_item('nasa', config_file='/home/jake/.config/ia-alternate.ini')

IA-S3 Configuration
~~~~~~~~~~~~~~~~~~~

Your IA-S3 keys are required for uploading and modifying metadata.
You can retrieve your IA-S3 keys at https://archive.org/account/s3.php.

They can be specified in your config file like so::

    [s3]
    access = mYaccEsSkEY
    secret = mYs3cREtKEy

Or, using the :class:`ArchiveSession` object::

    >>> from internetarchive import get_session
    >>> c = {'s3': {'access': 'mYaccEsSkEY', 'secret': 'mYs3cREtKEy'}}
    >>> s = get_session(config=c)
    >>> s.access_key
    'mYaccEsSkEY'
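
The ini layout and the dictionary form describe the same settings. As a minimal sketch of that correspondence (not the library's own config loader), the standard ``configparser`` module can turn ini text into a nested dictionary of the shape shown above; ``ini_to_config`` is a hypothetical helper name::

    import configparser


    def ini_to_config(text):
        """Parse ini-style text into a {section: {option: value}} dictionary."""
        # RawConfigParser skips '%' interpolation, so values are kept verbatim.
        parser = configparser.RawConfigParser()
        parser.read_string(text)
        return {name: dict(parser.items(name)) for name in parser.sections()}


    ini_text = "[s3]\naccess = mYaccEsSkEY\nsecret = mYs3cREtKEy\n"
    config = ini_to_config(ini_text)
    # config has the same shape as the dictionary passed to get_session(config=...)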

Cookie Configuration
~~~~~~~~~~~~~~~~~~~~

Your archive.org logged-in cookies are required for downloading access-restricted files that you have permission to access, and for retrieving information about archive.org catalog tasks.

Your cookies can be specified in your config file like so::

    [cookies]
    logged-in-user = user%40example.com
    logged-in-sig = <redacted>

Or, using the :class:`ArchiveSession` object::

    >>> from internetarchive import get_session
    >>> c = {'cookies': {'logged-in-user': 'user%40example.com', 'logged-in-sig': 'foo'}}
    >>> s = get_session(config=c)
    >>> s.cookies['logged-in-user']
    'user%40example.com'
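
The ``logged-in-user`` value in both examples looks like the percent-encoded form of the account email address (``@`` written as ``%40``). Assuming that is the expected encoding, a small sketch using the standard ``urllib.parse`` module can build the ``cookies`` dictionary from a plain address; ``cookie_config`` is a hypothetical helper, and the signature value is a placeholder::

    from urllib.parse import quote, unquote


    def cookie_config(email, signature):
        """Build a config dictionary holding percent-encoded login cookies."""
        return {
            'cookies': {
                # safe='' encodes every reserved character, including '@'.
                'logged-in-user': quote(email, safe=''),
                'logged-in-sig': signature,
            }
        }


    c = cookie_config('user@example.com', 'foo')
    # Decoding the stored value gives back the original address.
    decoded = unquote(c['cookies']['logged-in-user'])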


Logging Configuration
~~~~~~~~~~~~~~~~~~~~~

You can specify logging levels and the location of your log file like so::

    [logging]
    level = INFO
    file = /tmp/ia.log

Or, using the :class:`ArchiveSession` object::

    >>> from internetarchive import get_session
    >>> c = {'logging': {'level': 'INFO', 'file': '/tmp/ia.log'}}
    >>> s = get_session(config=c)

By default logging is turned off.
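
A misspelled level name is easy to miss in a config dictionary. The sketch below assumes the ``level`` setting takes the standard Python ``logging`` level names (as ``INFO`` in the example suggests) and checks the name before building the dictionary; ``logging_config`` is a hypothetical helper::

    import logging


    def logging_config(level, path):
        """Return a logging config dictionary after checking the level name."""
        # getLevelName() maps a known level name to its integer value and
        # returns a string for names it does not recognise.
        if not isinstance(logging.getLevelName(level), int):
            raise ValueError(f'unknown logging level: {level!r}')
        return {'logging': {'level': level, 'file': path}}


    c = logging_config('INFO', '/tmp/ia.log')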

Other Configuration
~~~~~~~~~~~~~~~~~~~

By default all requests are HTTPS.
You can change this setting in your config file in the ``general`` section::

    [general]
    secure = False

Or, using the :class:`ArchiveSession` object::

    >>> from internetarchive import get_session
    >>> s = get_session(config={'general': {'secure': False}})

In the example above, all requests will be made via HTTP.
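
To make the effect of the setting concrete, the following sketch shows how a ``secure`` flag could select the URL scheme. It is an illustration only and not the library's implementation; ``base_url`` and its ``host`` parameter are hypothetical::

    def base_url(config, host='archive.org'):
        """Return the base URL implied by the ``general.secure`` setting."""
        # A missing setting falls back to HTTPS, matching the documented default.
        secure = config.get('general', {}).get('secure', True)
        scheme = 'https' if secure else 'http'
        return f'{scheme}://{host}'


    default_url = base_url({})
    insecure_url = base_url({'general': {'secure': False}})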


ArchiveSession Objects
----------------------
The ArchiveSession object is subclassed from :class:`requests.Session`.
It collects together your credentials and config.

.. autofunction:: get_session


Item Objects
------------

:class:`Item` objects represent `Internet Archive items <//archive.org/services/docs/api/items.html>`_.
From the :class:`Item` object you can create new items, upload files to existing items, read and write metadata, and download or delete files.

.. autofunction:: get_item

Uploading
~~~~~~~~~

Uploading to an item can be done using :meth:`Item.upload`::

    >>> from internetarchive import get_item
    >>> item = get_item('my_item')
    >>> r = item.upload('/home/user/foo.txt')

Or :func:`internetarchive.upload`::

    >>> from internetarchive import upload
    >>> r = upload('my_item', '/home/user/foo.txt')

The item will automatically be created if it does not exist.

Refer to `archive.org Identifiers <//archive.org/services/docs/api/metadata-schema/index.html#archive-org-identifiers>`_ for more information on creating valid archive.org identifiers.

Setting Remote Filenames
^^^^^^^^^^^^^^^^^^^^^^^^

Remote filenames can be defined using a dictionary::

    >>> from io import BytesIO
    >>> fh = BytesIO()
    >>> fh.write(b'foo bar')
    >>> item.upload({'my-remote-filename.txt': fh})
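
When several in-memory files need remote names, the dictionary can be built in one step. A minimal sketch, where ``build_upload_mapping`` is a hypothetical helper and the resulting dictionary is meant to be passed to ``item.upload`` as in the example above::

    from io import BytesIO


    def build_upload_mapping(contents):
        """Map each remote filename to a file-like object holding its bytes."""
        # BytesIO(data) starts positioned at offset 0, ready to be read.
        return {name: BytesIO(data) for name, data in contents.items()}


    files = build_upload_mapping({
        'my-remote-filename.txt': b'foo bar',
        'notes.txt': b'second file',
    })
    # files can then be handed to item.upload(files)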


.. autofunction:: upload

Metadata
~~~~~~~~

.. autofunction:: modify_metadata

The default target to write to is ``metadata``.
If you would like to write to another target, such as ``files``, you can specify so using the ``target`` parameter.
For example, if you had an item whose identifier was ``my_identifier`` and you wanted to add a metadata field to a file within the item called ``foo.txt``::

    >>> from internetarchive import get_files, modify_metadata
    >>> r = modify_metadata('my_identifier', metadata={'title': 'My File'}, target='files/foo.txt')
    >>> f = list(get_files('my_identifier', 'foo.txt'))[0]
    >>> f.title
    'My File'

You can also create new targets if they don’t exist::

    >>> from internetarchive import get_item, modify_metadata
    >>> r = modify_metadata('my_identifier', metadata={'foo': 'bar'}, target='extra_metadata')
    >>> item = get_item('my_identifier')
    >>> item.item_metadata['extra_metadata']
    {'foo': 'bar'}
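
The targets used in this section can be pictured as edits to different parts of an item's metadata record. The following sketch mimics that behaviour on a plain dictionary. It is a local model only, not how the library or the Metadata API applies changes; ``apply_to_target`` is a hypothetical helper, and the record layout (a ``files`` list of dictionaries keyed by ``name``) is an assumption made for illustration::

    import copy


    def apply_to_target(item_metadata, target, metadata):
        """Return a copy of item_metadata with metadata merged into target."""
        result = copy.deepcopy(item_metadata)
        if target.startswith('files/'):
            # A 'files/<name>' target addresses one entry of the files list.
            filename = target[len('files/'):]
            for entry in result.setdefault('files', []):
                if entry.get('name') == filename:
                    entry.update(metadata)
                    break
            else:
                raise KeyError(filename)
        else:
            # Any other target is a top-level key, created if it is missing.
            result.setdefault(target, {}).update(metadata)
        return result


    record = {
        'metadata': {'identifier': 'my_identifier'},
        'files': [{'name': 'foo.txt'}],
    }
    step1 = apply_to_target(record, 'files/foo.txt', {'title': 'My File'})
    step2 = apply_to_target(step1, 'extra_metadata', {'foo': 'bar'})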


Downloading
~~~~~~~~~~~

.. autofunction:: download


Deleting
~~~~~~~~

.. autofunction:: delete


File Objects
~~~~~~~~~~~~

.. autofunction:: get_files


Searching Items
---------------


.. autofunction:: search_items


Internet Archive Tasks
----------------------
.. autofunction:: get_tasks