.. currentmodule:: parfive

.. _parfive:

=======
Parfive
=======

Parfive is a small library for downloading files.
Its objective is to provide a simple API for queuing files for download and to give the user excellent feedback about in-progress downloads.
It also aims to provide a clear interface for inspecting any failed downloads.

The parfive package was motivated by the needs of `SunPy's <https://sunpy.org>`__ ``net`` submodule, but should be generally applicable to anyone who wants a user-friendly way of downloading multiple files in parallel.
Parfive uses asyncio to support downloading multiple files in parallel, and to support downloading a single file in multiple parallel chunks.
Parfive supports downloading files over either HTTP or FTP using `aiohttp <http://aiohttp.readthedocs.io/>`__ and `aioftp <https://aioftp.readthedocs.io/>`__ (``aioftp`` is an optional dependency, which does not need to be installed to download files over HTTP).

Parfive provides both a function and coroutine interface, so that it can be used from both synchronous and asynchronous code.
It also has opt-in support for using `aiofiles <https://github.com/Tinche/aiofiles>`__ to write downloaded data to disk using a separate thread pool, which may be useful if you are using parfive from within an asyncio application.
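
As a minimal sketch of the coroutine interface, the snippet below queues files and awaits them from inside an existing event loop.
It assumes that ``Downloader.run_download`` is the awaitable counterpart of ``Downloader.download``; the helper name ``fetch_all`` is hypothetical and only used for illustration::

  import asyncio

  from parfive import Downloader


  async def fetch_all(urls, path="./"):
      """Queue every URL, then await the downloads on the running event loop."""
      dl = Downloader()
      for url in urls:
          dl.enqueue_file(url, path=path)
      # Assumption: run_download() is the coroutine equivalent of download()
      return await dl.run_download()

From synchronous code the same helper could be driven with ``asyncio.run(fetch_all(urls))``, although in that situation calling ``dl.download()`` directly is simpler.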


Installation
------------

parfive can be installed via pip::

  pip install parfive

or with FTP support::

  pip install parfive[ftp]

or with conda from conda-forge::

  conda install -c conda-forge parfive

or from `GitHub <https://github.com/Cadair/parfive>`__.

Usage
-----

Parfive works by creating a downloader object, queuing downloads with it and then running the download.

A simple example is::

  from parfive import Downloader
  dl = Downloader()
  dl.enqueue_file("http://data.sunpy.org/sample-data/predicted-sunspot-radio-flux.txt", path="./")
  files = dl.download()
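
The object returned by ``download()`` can also be used to inspect failures.
The following sketch assumes that the results object exposes an ``errors`` attribute whose entries carry ``url`` and ``exception`` fields; the helper name ``download_and_report`` is hypothetical::

  from parfive import Downloader


  def download_and_report(urls, path="./"):
      """Download all URLs and print a line for each one that failed."""
      dl = Downloader()
      for url in urls:
          dl.enqueue_file(url, path=path)
      results = dl.download()
      # Assumption: each entry in results.errors records the URL and the exception
      for error in results.errors:
          print(f"Failed: {error.url} ({error.exception!r})")
      return results

The returned object still behaves like a list of the files that were downloaded successfully, so callers can use it as before and consult ``errors`` only when needed.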

It's also possible to download a list of URLs to a single destination using the `Downloader.simple_download <parfive.Downloader.simple_download>` method::

  from parfive import Downloader
  files = Downloader.simple_download(['http://212.183.159.230/5MB.zip', 'http://212.183.159.230/10MB.zip'], path="./")

Parfive also bundles a CLI. The following example will download the two files concurrently::

  $ parfive 'http://212.183.159.230/5MB.zip' 'http://212.183.159.230/10MB.zip'
  $ parfive --help
  usage: parfive [-h] [--max-conn MAX_CONN] [--overwrite] [--no-file-progress] [--directory DIRECTORY] [--print-filenames] URLS [URLS ...]

  Parfive, the python asyncio based downloader

  positional arguments:
    URLS                  URLs of files to be downloaded.

  optional arguments:
    -h, --help            show this help message and exit
    --max-conn MAX_CONN   Number of maximum connections.
    --overwrite           Overwrite if the file exists.
    --no-file-progress    Show progress bar for each file.
    --directory DIRECTORY
                          Directory to which downloaded files are saved.
    --print-filenames     Print successfully downloaded files's names to stdout.


Options and Customisation
-------------------------

Parfive aims to support as many use cases as possible, and therefore has a number of options.

There are two main points where you can customise the behaviour of the downloads: in the initialiser of `parfive.Downloader`, or when adding a URL to the download queue with `~parfive.Downloader.enqueue_file`.
The arguments to the ``Downloader()`` constructor affect all files transferred, and the arguments to ``enqueue_file()`` apply to only that file.

By default parfive will transfer 5 files in parallel and, if supported by the remote server, chunk those files and download 5 chunks simultaneously.
This behaviour is controlled by the ``max_conn=`` and ``max_splits=`` keyword arguments.
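
For instance, a downloader that is gentler on the remote server might be configured roughly as follows.
The ``example.org`` URLs and the ``./data`` directory are placeholders, and nothing is transferred until ``download()`` is called::

  from parfive import Downloader

  # At most 2 files at once, each fetched over a single connection (no chunking)
  dl = Downloader(max_conn=2, max_splits=1)

  for name in ("a.dat", "b.dat", "c.dat"):
      # Placeholder URLs: replace with real file locations
      dl.enqueue_file(f"https://example.org/{name}", path="./data")

  # Number of files waiting in the queue
  print(dl.queued_downloads)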

Further configuration of the ``Downloader`` instance is done by passing in a `parfive.SessionConfig` object as the ``config=`` keyword argument to ``Downloader()``.
See the documentation of that class for more details.
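
A small sketch of passing a configuration object is shown here.
It assumes that ``SessionConfig`` accepts a ``file_progress`` field controlling the per-file progress bars; check the class documentation for the fields available in your version::

  from parfive import Downloader, SessionConfig

  # Assumption: file_progress=False hides the per-file progress bars
  config = SessionConfig(file_progress=False)

  # The config applies to every file handled by this downloader
  dl = Downloader(max_conn=3, config=config)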

Keyword arguments to `~parfive.Downloader.enqueue_file` are passed through to either `aiohttp.ClientSession.get` for HTTP downloads or `aioftp.Client` for FTP downloads.
This gives you many per-file options, such as headers, authentication and SSL settings.
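
As an illustration, per-file request headers could be supplied like this.
The URL and the token are placeholders, and ``headers`` is used here because it is a keyword accepted by ``aiohttp.ClientSession.get``::

  from parfive import Downloader

  dl = Downloader()
  dl.enqueue_file(
      "https://example.org/protected/data.csv",  # placeholder URL
      path="./",
      # Extra keyword arguments are forwarded to the HTTP request for this file only
      headers={"Authorization": "Bearer <token>"},
  )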


Parfive API
-----------

.. automodapi:: parfive
   :no-heading:
   :no-main-docstr:

Environment Variables
---------------------

Parfive reads the following environment variables.
Note that as of version 2.0 all environment variables are read at the point where the ``Downloader()`` class is instantiated.

* ``PARFIVE_SINGLE_DOWNLOAD`` - If set to ``"True"``, this variable sets ``max_conn`` and ``max_splits`` to one, meaning that no parallelisation of the downloads will occur.
* ``PARFIVE_DISABLE_RANGE`` - If set to ``"True"``, this variable sets ``max_splits`` to one, meaning that each file downloaded will only have one concurrent connection, although multiple files may be downloaded simultaneously.
* ``PARFIVE_OVERWRITE_ENABLE_AIOFILES`` - If set to ``"True"`` and aiofiles is installed, aiofiles will be used to write files to disk.
* ``PARFIVE_DEBUG`` - If set to ``"True"``, parfive configures the built-in Python logger to log to stderr and sets the parfive, aiohttp and aioftp loggers to debug level.
* ``PARFIVE_HIDE_PROGRESS`` - If set to ``"True"``, no progress bars will be shown.
* ``PARFIVE_TOTAL_TIMEOUT`` - Overrides the default aiohttp ``total`` timeout value (unless set in Python).
* ``PARFIVE_SOCK_READ_TIMEOUT`` - Overrides the default aiohttp ``sock_read`` timeout value (unless set in Python).
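
Because the variables are read when the downloader is created, they can also be set from Python just before instantiation.
This is a sketch of that pattern for ``PARFIVE_SINGLE_DOWNLOAD``, under the assumption that the resulting limits are visible as the ``max_conn`` and ``max_splits`` attributes::

  import os

  from parfive import Downloader

  # Set the variable before the Downloader is instantiated
  os.environ["PARFIVE_SINGLE_DOWNLOAD"] = "True"
  try:
      dl = Downloader(max_conn=5, max_splits=5)
  finally:
      # Remove it again so later downloaders are unaffected
      del os.environ["PARFIVE_SINGLE_DOWNLOAD"]

  # Both values are expected to be forced to one, whatever was passed in
  print(dl.max_conn, dl.max_splits)

In most cases the variable would instead be exported in the shell that launches the program, which avoids modifying the process environment at runtime.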

Contributors
------------

 * Cadair
 * vn-ki
 * dstansby
 * nabobalis
 * GitHK
 * SolarDrew
 * 1nF0rmed
 * Raahul-Singh
 * rlaker

Changelog
---------

See `GitHub Releases <https://github.com/Cadair/parfive/releases>`__ for the release history and changelog.