Stats
=====

Dask Array implements a subset of the `scipy.stats`_ package.

Statistical Functions
---------------------

You can calculate various statistical measures of an array, including skewness, kurtosis, and arbitrary moments.

.. code-block:: python

   >>> import dask
   >>> import dask.array as da
   >>> from dask.array import stats
   >>> rng = da.random.default_rng()
   >>> x = rng.beta(1, 1, size=(1000,), chunks=10)
   >>> k, s, m = [stats.kurtosis(x), stats.skew(x), stats.moment(x, 5)]
   >>> dask.compute(k, s, m)  # unseeded, so exact values vary between runs
   (1.7612340817172787, -0.064073498030693302, -0.00054523780628304799)
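
As a rough cross-check, the sketch below compares chunked results with SciPy on the same in-memory data. The seed, array size, chunk size, and variable names are illustrative assumptions, and the example assumes ``numpy`` and ``scipy`` are installed alongside Dask.

.. code-block:: python

   import numpy as np
   import scipy.stats
   import dask
   import dask.array as da
   from dask.array import stats

   # Seeded NumPy data so the comparison is reproducible (seed is arbitrary).
   data = np.random.default_rng(0).beta(1, 1, size=1000)
   x = da.from_array(data, chunks=100)

   # Build both lazy results, then evaluate them in a single compute call.
   skew, m5 = dask.compute(stats.skew(x), stats.moment(x, 5))

   # The chunked results should agree with SciPy up to floating-point error.
   print(np.isclose(skew, scipy.stats.skew(data)))
   print(np.isclose(m5, scipy.stats.moment(data, 5)))

Passing several lazy results to one ``dask.compute`` call lets Dask share intermediate work between them instead of reading the array once per statistic.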


Statistical Tests
-----------------

You can perform basic statistical tests on Dask arrays.
Each of these tests returns a ``dask.delayed`` object wrapping one of the SciPy
``namedtuple`` results.


.. code-block:: python

   >>> rng = da.random.default_rng()
   >>> a = rng.uniform(size=(50,), chunks=(25,))
   >>> b = a + rng.uniform(low=-0.15, high=0.15, size=(50,), chunks=(25,))
   >>> result = stats.ttest_rel(a, b)
   >>> result.compute()
   Ttest_relResult(statistic=-1.5102104380013242, pvalue=0.13741197274874514)
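
To illustrate the delayed return value, the sketch below builds a paired t-test lazily, computes it, and unpacks the resulting ``namedtuple``. The seeded NumPy inputs, the variable names, and the comparison against ``scipy.stats.ttest_rel`` are assumptions added for reproducibility, not part of the example above.

.. code-block:: python

   import numpy as np
   import scipy.stats
   import dask.array as da
   from dask.array import stats

   # Seeded NumPy data keeps the example reproducible (seed and sizes are arbitrary).
   rng = np.random.default_rng(0)
   a_np = rng.uniform(size=50)
   b_np = a_np + rng.uniform(low=-0.15, high=0.15, size=50)
   a = da.from_array(a_np, chunks=25)
   b = da.from_array(b_np, chunks=25)

   # Building the test is lazy: no data has been processed yet.
   lazy_result = stats.ttest_rel(a, b)

   # compute() runs the graph and returns the namedtuple, which unpacks like a tuple.
   statistic, pvalue = lazy_result.compute()

   # Reference values from SciPy on the same in-memory arrays.
   expected = scipy.stats.ttest_rel(a_np, b_np)
   print(float(statistic), float(expected.statistic))
   print(float(pvalue), float(expected.pvalue))

The two printed pairs should match up to floating-point error, since both libraries evaluate the same test on identical data.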

.. _scipy.stats: https://docs.scipy.org/doc/scipy-0.19.0/reference/stats.html