1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44
|
===========
Performance
===========
The main aim of lazyarray is to improve performance (increased speed and
reduced memory use) in two scenarios:
* arrays are used conditionally (i.e. there are cases in which the array is
never used);
* only parts of an array are used (for example in distributed computation,
in which each MPI node operates on a subset of the elements of the array).
However, at the same time use of :class:`larray` objects should not be too much
slower than plain NumPy arrays in normal use.
Here we see that using a lazyarray adds minimal overhead compared to using a
plain array:
::
>>> from timeit import repeat
>>> repeat('np.fromfunction(lambda i,j: i*i + 2*i*j + 3, (5000, 5000))',
... setup='import numpy as np', number=1, repeat=5)
[1.9397640228271484, 1.92628812789917, 1.8796701431274414, 1.6766629219055176, 1.6844701766967773]
>>> repeat('larray(lambda i,j: i*i + 2*i*j + 3, (5000, 5000)).evaluate()',
... setup='from lazyarray import larray', number=1, repeat=5)
[1.686661958694458, 1.6836578845977783, 1.6853220462799072, 1.6538069248199463, 1.645576000213623]
while if we only need to evaluate part of the array (perhaps because the other
parts are being evaluated on other nodes), there is a major gain from using a
lazy array.
::
>>> repeat('np.fromfunction(lambda i,j: i*i + 2*i*j + 3, (5000, 5000))[:, 0:4999:10]',
... setup='import numpy as np', number=1, repeat=5)
[1.691796064376831, 1.668884038925171, 1.647057056427002, 1.6792259216308594, 1.652547836303711]
>>> repeat('larray(lambda i,j: i*i + 2*i*j + 3, (5000, 5000))[:, 0:4999:10]',
... setup='from lazyarray import larray', number=1, repeat=5)
[0.23157119750976562, 0.16121792793273926, 0.1594078540802002, 0.16096210479736328, 0.16096997261047363]
.. note:: These timings were done on a MacBook Pro 2.8 GHz Intel Core 2 Duo
with 4 GB RAM, Python 2.7 and NumPy 1.6.1.
|