Generalized Ufuncs
==================

`NumPy <https://www.numpy.org>`_ provides the concept of `generalized ufuncs <https://docs.scipy.org/doc/numpy/reference/c-api/generalized-ufuncs.html>`_. Generalized ufuncs are functions
that divide the dimensions of their input arrays into two classes: loop dimensions
and core dimensions. The function operates on the core dimensions and is broadcast over the loop dimensions. To express this split, each NumPy generalized ufunc specifies a `signature <https://docs.scipy.org/doc/numpy/reference/c-api/generalized-ufuncs.html#details-of-signature>`_.
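
As a minimal sketch of the split between loop and core dimensions, NumPy's ``np.vectorize`` accepts a signature string of the same form. The function name ``inner`` and the array shapes are illustrative choices, not part of Dask or of the text above.

.. code-block:: python

    import numpy as np

    def inner(a, b):
        # Sees only the core dimension: two 1-D vectors in, one scalar out.
        return np.sum(a * b)

    # "(n),(n)->()" declares one core dimension ``n`` per input and a scalar output.
    vec_inner = np.vectorize(inner, signature="(n),(n)->()")

    a = np.ones((4, 3, 5))  # loop dimensions (4, 3), core dimension 5
    b = np.ones((3, 5))     # loop dimension (3,) broadcasts against (4, 3)

    result = vec_inner(a, b)
    print(result.shape)  # only the loop dimensions remain

The core dimension ``n`` is consumed by the function, while the loop dimensions are broadcast just as for an ordinary ufunc.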

`Dask <https://dask.org/>`_ interoperates with NumPy's generalized ufuncs
by implementing the `ufunc protocol <https://docs.scipy.org/doc/numpy/reference/arrays.classes.html#numpy.class.__array_ufunc__>`_, and provides a wrapper that makes a Python function behave like a generalized ufunc.
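
The effect of the protocol can be sketched with an ordinary (non-generalized) ufunc, chosen here only for simplicity: calling ``np.sin`` on a Dask array dispatches to Dask and yields another lazy Dask array. The array shape and chunking are arbitrary.

.. code-block:: python

    import dask.array as da
    import numpy as np

    x = da.ones((4, 6), chunks=(2, 3))

    # Calling the NumPy function directly dispatches to Dask via __array_ufunc__.
    y = np.sin(x)

    print(type(y).__name__)  # a lazy Dask array, not a numpy.ndarray
    values = y.compute()     # evaluation happens only here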


Usage
-----

NumPy Generalized UFuncs
~~~~~~~~~~~~~~~~~~~~~~~~
.. note::

    `NumPy <https://www.numpy.org>`_ generalized ufuncs are currently (v1.14.3 and below) stored
    inside the private module ``np.linalg._umath_linalg``, and their location might change in the future.


.. code-block:: python

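    import dask.array as da
    import numpy as np

    # A stack of three 10x10 matrices; the core dimensions (the last two axes)
    # each sit in a single chunk, while the loop dimension is chunked.
    x = da.random.default_rng().normal(size=(3, 10, 10), chunks=(2, 10, 10))

    # Dask intercepts the call and returns lazy Dask arrays for both outputs
    w, v = np.linalg._umath_linalg.eig(x, output_dtypes=(float, float))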


Create Generalized UFuncs
~~~~~~~~~~~~~~~~~~~~~~~~~

Creating your own GUFuncs can be difficult without writing C code against NumPy's C API.
However, the `Numba <https://numba.pydata.org>`_ project provides a
convenient implementation through its ``numba.guvectorize`` decorator.  See `Numba's
documentation
<https://numba.pydata.org/numba-doc/dev/user/vectorize.html#the-guvectorize-decorator>`_
for more information.

Wrap your own Python function
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``gufunc`` can be used to make a Python function behave like a generalized ufunc:


.. code-block:: python

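    import dask.array as da
    import numpy as np

    x = da.random.default_rng().normal(size=(10, 5), chunks=(2, 5))

    def foo(x):
        # Reduce over the core dimension (the last axis)
        return np.mean(x, axis=-1)

    # "(i)->()" maps one core dimension to a scalar per loop element
    gufoo = da.gufunc(foo, signature="(i)->()", output_dtypes=float, vectorize=True)

    y = gufoo(x)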


Alternatively, the ``as_gufunc`` decorator can be used for convenience:


.. code-block:: python

    import dask.array as da
    import numpy as np

    x = da.random.default_rng().normal(size=(10, 5), chunks=(2, 5))

    @da.as_gufunc(signature="(i)->()", output_dtypes=float, vectorize=True)
    def gufoo(x):
        return np.mean(x, axis=-1)

    y = gufoo(x)

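The third entry point listed in the API section, ``apply_gufunc``, applies a function in a single call without creating a wrapper object first. The following is a minimal sketch; the helper name ``stats`` and the array shape are illustrative assumptions.

.. code-block:: python

    import dask.array as da
    import numpy as np

    def stats(x):
        # Reduce over the core dimension (last axis) and return two results.
        return np.mean(x, axis=-1), np.std(x, axis=-1)

    # The core dimension (last axis, length 30) sits in a single chunk.
    a = da.random.default_rng().normal(size=(10, 20, 30), chunks=(5, 10, 30))

    # "(i)->(),()" declares one core dimension in and two scalar outputs.
    mean, std = da.apply_gufunc(stats, "(i)->(),()", a, output_dtypes=(float, float))

    print(mean.shape, std.shape)  # only the loop dimensions remain

Because ``stats`` already handles leading loop dimensions through ``axis=-1``, this sketch does not pass ``vectorize=True``.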

Disclaimer
----------
This experimental generalized ufunc integration is not complete:

* ``gufunc`` does not create a true generalized ufunc that can be used with input arrays other than Dask arrays.
  At the moment, ``gufunc`` casts all input arguments to ``dask.array.Array``.

* Inferring ``output_dtypes`` automatically is not implemented yet.

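The first limitation can be observed directly. In this sketch, which reuses the ``gufoo`` definition from the usage section with an arbitrary NumPy input, the result is a lazy Dask array even though a plain NumPy array was passed in.

.. code-block:: python

    import dask.array as da
    import numpy as np

    @da.as_gufunc(signature="(i)->()", output_dtypes=float, vectorize=True)
    def gufoo(x):
        return np.mean(x, axis=-1)

    # A plain NumPy array goes in ...
    x_np = np.arange(20, dtype=float).reshape(4, 5)

    # ... but the result is a lazy Dask array rather than a numpy.ndarray.
    y = gufoo(x_np)
    print(type(y).__name__)

    # Call .compute() to obtain the NumPy result.
    y_np = y.compute()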

API
---

.. currentmodule:: dask.array.gufunc

.. autosummary::
   apply_gufunc
   as_gufunc
   gufunc