.. _usage-transforms:

Using Transforms
================

The boost-histogram library provides a powerful transform system on Regular axes that allows
you to provide a functional form for the conversion between a regular spacing and the actual
bin edges. The following transforms are built in:


* ``bh.axis.transform.sqrt``: A square root transform
* ``bh.axis.transform.log``: A logarithmic transform
* ``bh.axis.transform.Pow(power)``: Raise to a specified power (``power=0.5`` is identical to ``sqrt``)

There is also a flexible ``bh.axis.transform.Function``, which allows you to specify arbitrary conversion functions (detailed below).
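
To make the idea concrete, here is a minimal pure-Python sketch of how a forward/inverse pair turns regular spacing into bin edges. The helper name ``transformed_edges`` is hypothetical and is not part of boost-histogram; the sketch assumes that edges are spaced evenly in the transformed coordinate and is not meant to show how the library computes them internally.

.. code-block:: python

   import math


   def transformed_edges(bins, start, stop, forward, inverse):
       """Hypothetical helper: evenly spaced in forward(x), mapped back with inverse."""
       lo, hi = forward(start), forward(stop)
       return [inverse(lo + (hi - lo) * i / bins) for i in range(bins + 1)]


   # Logarithmic spacing: three bins between 1 and 1000
   log_edges = transformed_edges(3, 1.0, 1000.0, math.log, math.exp)

   # Square-root spacing: four bins between 0 and 16
   sqrt_edges = transformed_edges(4, 0.0, 16.0, math.sqrt, lambda x: x * x)

With these inputs, ``sqrt_edges`` is ``[0.0, 1.0, 4.0, 9.0, 16.0]`` and ``log_edges`` is approximately ``[1, 10, 100, 1000]``, up to floating-point rounding.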


Simple custom transforms
------------------------


The ``Function`` transform takes two ctypes ``double(double)`` function pointers: a forward transform and an inverse transform. An object that provides a ctypes function pointer through a ``.ctypes`` attribute is also supported. As an example, let's look at how one would recreate the ``log`` transform using several different methods:

Pure Python
^^^^^^^^^^^

You can directly cast a Python callable to a ctypes pointer and use that. However, Python will then be called *every* time you interact with the
transformed axis, which is 15-90 times slower than a compiled transform such as ``bh.axis.transform.log``. In most cases, a Variable axis will be faster.

.. code-block:: python

   import ctypes
   import math

   import boost_histogram as bh
   import numpy as np

   ftype = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double)

   # Pure Python (15x slower)
   bh.axis.Regular(
       10, 1, 4, transform=bh.axis.transform.Function(ftype(math.log), ftype(math.exp))
   )

   # Pure Python: NumPy (90x slower)
   bh.axis.Regular(
       10, 1, 4, transform=bh.axis.transform.Function(ftype(np.log), ftype(np.exp))
   )

You can create a Variable axis from the edges of this axis; often that will be faster.

You can also pass ``convert=ftype`` and provide the Python functions directly; this gives nicer
reprs, but the transform is still not picklable because ``ftype`` is a generated type that cannot be pickled; see below
for a way to make this picklable. You can also specify ``name="..."`` to customize the repr explicitly.
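
As a quick check of the pieces used above, the following sketch wraps ``math.log`` and ``math.exp`` with ``ftype``, confirms that the pair round-trips a value, and shows that the wrapped pointer cannot be pickled while the original callable can. It uses only the standard library; the helper ``is_picklable`` is hypothetical and exists only for this illustration.

.. code-block:: python

   import ctypes
   import math
   import pickle

   ftype = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double)

   # Wrap the Python callables as double(double) function pointers
   forward = ftype(math.log)
   inverse = ftype(math.exp)

   # Calling through the pointers should undo each other
   roundtrip = inverse(forward(3.0))


   def is_picklable(obj):
       """Hypothetical helper: report whether pickle.dumps succeeds."""
       try:
           pickle.dumps(obj)
       except Exception:
           return False
       return True


   pointer_picklable = is_picklable(forward)  # the ctypes pointer
   callable_picklable = is_picklable(math.log)  # the plain callable

Here ``roundtrip`` is close to ``3.0``, ``pointer_picklable`` is ``False``, and ``callable_picklable`` is ``True``.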

Using Numba
^^^^^^^^^^^

If you have the numba library installed, and your transform is reasonably simple, you can use the ``@numba.cfunc`` decorator to create
a callable that will run directly through the C interface. This is just as fast as the compiled version provided!

.. code-block:: python

   import math

   import boost_histogram as bh
   import numba


   @numba.cfunc(numba.float64(numba.float64))
   def exp(x):
       return math.exp(x)


   @numba.cfunc(numba.float64(numba.float64))
   def log(x):
       return math.log(x)


   bh.axis.Regular(10, 1, 4, transform=bh.axis.transform.Function(log, exp))

Manual compilation
^^^^^^^^^^^^^^^^^^

You can also get a ctypes pointer from the usual place: a library. Let's say you have the following ``mylib.c`` file:

.. code-block:: c

   #include <math.h>

   double my_log(double value) {
       return log(value);
   }

   double my_exp(double value) {
       return exp(value);
   }


And you compile it with:

.. code-block:: bash

   gcc mylib.c -shared -o mylib.so

You can now use it like this:

.. code-block:: python

   import ctypes

   import boost_histogram as bh

   ftype = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double)

   # The explicit "./" makes the loader look in the current directory
   mylib = ctypes.CDLL("./mylib.so")

   my_log = ctypes.cast(mylib.my_log, ftype)
   my_exp = ctypes.cast(mylib.my_exp, ftype)

   bh.axis.Regular(10, 1, 4, transform=bh.axis.transform.Function(my_log, my_exp))


Note that you do actually have to cast it to the correct function type; just setting
``argtypes`` and ``restype`` does not work.
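
If you want to try the cast without compiling anything, the sketch below uses the system C math library as a stand-in for ``mylib.so``. This assumes a Unix-like system where ``ctypes.util.find_library("m")`` can locate the library (or where its symbols are already loaded into the process); it is an illustration only and has not been checked on other platforms.

.. code-block:: python

   import ctypes
   import ctypes.util
   import math

   ftype = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double)

   # Assumption: the C math library is available on this system
   libm = ctypes.CDLL(ctypes.util.find_library("m"))

   # The attribute looked up on the library is not an instance of ftype
   raw_is_ftype = isinstance(libm.log, ftype)

   # Casting gives the same function viewed as double(double)
   c_log = ctypes.cast(libm.log, ftype)
   c_exp = ctypes.cast(libm.exp, ftype)
   cast_is_ftype = isinstance(c_log, ftype)

   roundtrip = c_exp(c_log(2.0))

Here ``raw_is_ftype`` is ``False`` and ``cast_is_ftype`` is ``True``, so the uncast attribute is a different ctypes type than ``ftype``; ``roundtrip`` is close to ``2.0``.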

Picklable custom transforms
---------------------------

The above examples do not support pickling, since ctypes pointers (or pointers in general)
are not picklable. However, the ``Function`` transform supports a ``convert=`` keyword
argument that takes the two provided objects and converts them to ctypes pointers.
So if you can supply a pair of picklable objects and a conversion function, you can
make a fully picklable transform. A few common cases are given below.
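
The pattern can be sketched with the standard library alone: put only picklable objects into the pickle, and rebuild the ctypes pointers after loading. This illustrates the idea and is not the library's implementation; the helper name ``to_pointer`` is hypothetical.

.. code-block:: python

   import ctypes
   import math
   import pickle


   def to_pointer(func):
       """Hypothetical converter: build a double(double) pointer on demand."""
       ftype = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double)
       return ftype(func)


   # math.log and math.exp are pickled by reference to the math module
   payload = pickle.dumps((math.log, math.exp))

   # After unpickling, the pointers are rebuilt from the restored callables
   restored = pickle.loads(payload)
   forward, inverse = (to_pointer(f) for f in restored)

   roundtrip = inverse(forward(2.5))

The pickle never contains a pointer, only the two callables, and ``roundtrip`` is close to ``2.5`` once the pointers have been rebuilt.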

Pure Python
^^^^^^^^^^^

This is the easiest example; as long as your Python function is picklable, all you need to do is move the
ctypes call into the convert function. You need a little wrapper function to make it picklable:

.. code-block:: python

   import ctypes
   import math

   import boost_histogram as bh


   # We need a little wrapper function only because ftype is not directly picklable
   def convert_python(func):
       ftype = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double)
       return ftype(func)


   bh.axis.Regular(
       10,
       1,
       4,
       transform=bh.axis.transform.Function(math.log, math.exp, convert=convert_python),
   )

That's it.

Using Numba
^^^^^^^^^^^

The same procedure works for numba decorators. Numba only compiles Python functions, not builtins like ``math.log``,
so if you want to pass those, you'll need to wrap them in a lambda function or add a bit of logic to the convert
function. Here is the lambda approach:

.. code-block:: python

    import math

    import boost_histogram as bh
    import numba


    def convert_numba(func):
        return numba.cfunc(numba.double(numba.double))(func)


    # Built-ins and ufuncs need to be wrapped (numba can't read a signature)
    # User functions would not need the lambda
    bh.axis.Regular(
        10,
        1,
        4,
        transform=bh.axis.transform.Function(
            lambda x: math.log(x), lambda x: math.exp(x), convert=convert_numba
        ),
    )

Note that ``numba.cfunc`` does not work directly on builtins; it requires a user-defined function. The example here is
deliberately simple and recreates a transform that boost-histogram already provides. In practice you will probably be
composing your own functions out of more than one builtin operation, so you generally will not need the lambda.

Manual compilation
^^^^^^^^^^^^^^^^^^

You can use strings to look up functions in the shared library:

.. code-block:: python

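   import ctypes

   import boost_histogram as bh

   ftype = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double)


   def lookup(name):
       mylib = ctypes.CDLL("./mylib.so")
       function = getattr(mylib, name)
       return ctypes.cast(function, ftype)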


   bh.axis.Regular(
       10, 1, 4, transform=bh.axis.transform.Function("my_log", "my_exp", convert=lookup)
   )