Install Dask
============
You can install Dask with ``conda``, with ``pip``, or by installing from source.
Conda
-----
Dask is installed by default in `Anaconda <https://www.anaconda.com/download/>`_.
You can install or update Dask using the `conda <https://www.anaconda.com/download/>`_ command::

    conda install dask

This installs Dask and **all** common dependencies, including Pandas and NumPy.
Dask packages are maintained both on the default channel and on `conda-forge <https://conda-forge.github.io/>`_.
Optionally, you can obtain a minimal Dask installation using the following command::

    conda install dask-core

This installs the minimal set of dependencies required to run Dask, similar to (but not exactly the same as) ``pip install dask`` below.
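
After either ``conda`` command, it can be useful to confirm what actually landed in the environment. The following is a minimal sketch, not part of Dask itself: ``installed_version`` is a hypothetical helper built on the standard-library ``importlib.metadata`` module, and it assumes the distribution names ``dask``, ``numpy``, and ``pandas``.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names below are assumptions chosen for illustration.
    for name in ("dask", "numpy", "pandas"):
        print(name, installed_version(name) or "not installed")
```

With a full installation all three names should report a version; with a minimal installation, NumPy and Pandas may be reported as not installed.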
Pip
---
You can install everything required for the most common uses of Dask (arrays,
dataframes, ...). This installs both Dask and dependencies such as NumPy and
Pandas that are necessary for different workloads. This is often the right
choice for Dask users::

    pip install "dask[complete]"    # Install everything

You can also install only the Dask library. Modules like ``dask.array``,
``dask.dataframe``, or ``dask.distributed`` won't work until you also install NumPy,
Pandas, or Tornado, respectively. This is common for downstream library
maintainers::

    pip install dask                # Install only core parts of dask

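
With a core-only installation, a quick check can show which of the modules mentioned above are still missing a requirement. This is a rough sketch under stated assumptions: ``missing_requirements`` and ``REQUIRED_BY_MODULE`` are hypothetical names, and the mapping covers only the three packages named in the text, not each module's full dependency list.

```python
from importlib.util import find_spec

# Mapping taken from the text above: each Dask module and the third-party
# package it needs before it will work. Not a complete dependency list.
REQUIRED_BY_MODULE = {
    "dask.array": "numpy",
    "dask.dataframe": "pandas",
    "dask.distributed": "tornado",
}


def missing_requirements(is_available=None):
    """Return {dask module: package} for packages that cannot be found."""
    if is_available is None:
        # Default check: look the package up without importing it.
        is_available = lambda name: find_spec(name) is not None
    return {
        module: package
        for module, package in REQUIRED_BY_MODULE.items()
        if not is_available(package)
    }


if __name__ == "__main__":
    for module, package in missing_requirements().items():
        print(f"{module} needs {package}, which is not installed")
```

The ``is_available`` parameter exists only so the check can be exercised without changing the environment; calling ``missing_requirements()`` with no arguments inspects the current interpreter.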
We also maintain other dependency sets for different subsets of functionality::

    pip install "dask[array]"       # Install requirements for dask array
    pip install "dask[bag]"         # Install requirements for dask bag
    pip install "dask[dataframe]"   # Install requirements for dask dataframe
    pip install "dask[distributed]" # Install requirements for distributed dask

We have these options so that users of the lightweight core Dask scheduler
aren't required to download the more exotic dependencies of the collections
(NumPy, Pandas, Tornado, etc.).
Install from Source
-------------------
To install Dask from source, clone the repository from `GitHub
<https://github.com/dask/dask>`_::

    git clone https://github.com/dask/dask.git
    cd dask
    python setup.py install

or use ``pip`` locally if you want to install all dependencies as well::

    pip install -e ".[complete]"

You can view the list of all dependencies within the ``extras_require`` field
of ``setup.py``.
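
For an already installed copy, the extras can also be read from the package metadata rather than from ``setup.py``. The snippet below is a hedged sketch: ``declared_extras`` is a hypothetical helper, and it assumes the installed distribution records its extras in the standard ``Provides-Extra`` metadata field.

```python
from importlib import metadata


def declared_extras(dist_name: str) -> list:
    """Return the sorted extras a distribution declares, or [] if not installed."""
    try:
        meta = metadata.metadata(dist_name)
    except metadata.PackageNotFoundError:
        return []
    # get_all returns None when the field is absent, so fall back to [].
    return sorted(meta.get_all("Provides-Extra") or [])


if __name__ == "__main__":
    print(declared_extras("dask"))
```

An empty list means either that the distribution is not installed or that it declares no extras.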
Anaconda
--------
Dask is included by default in the `Anaconda distribution <https://www.anaconda.com/download>`_.
Test
----
Test Dask with ``py.test``::

    cd dask
    py.test dask

Be aware that a plain ``pip install dask`` does not install all
requirements. See the ``pip`` section above, which discusses the available
dependency sets. You may choose to install ``dask[complete]``, which includes
all dependencies for all collections. Alternatively, you may choose to test
only certain submodules, depending on the libraries within your environment.
For example, to test only Dask core and Dask array we would run tests as
follows::

    py.test dask/tests dask/array/tests

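
The choice of test directories can be derived from what is installed. The following is a minimal sketch, not an official Dask tool: ``select_test_paths`` is a hypothetical helper, and the ``dask/dataframe/tests`` path is an assumption that follows the same layout as ``dask/array/tests``.

```python
from importlib.util import find_spec

# Package -> test directory that depends on it.
# "dask/dataframe/tests" is assumed by analogy with "dask/array/tests".
TEST_DIRS_BY_PACKAGE = {
    "numpy": "dask/array/tests",
    "pandas": "dask/dataframe/tests",
}


def select_test_paths(is_available=None):
    """Return the test paths to run, always starting with the core tests."""
    if is_available is None:
        # Default check: look the package up without importing it.
        is_available = lambda name: find_spec(name) is not None
    paths = ["dask/tests"]
    for package, test_dir in TEST_DIRS_BY_PACKAGE.items():
        if is_available(package):
            paths.append(test_dir)
    return paths


if __name__ == "__main__":
    # Print a command line that can be pasted into a shell.
    print("py.test", " ".join(select_test_paths()))
```

In an environment with NumPy but without Pandas, this prints the same command as the example above.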