========
Tutorial
========

The :mod:`lazyarray` module contains a single class, :class:`larray`.

.. doctest::

    >>> from lazyarray import larray


Creating a lazy array
=====================

Lazy arrays may be created from single numbers, from sequences (lists, NumPy
arrays), from iterators, from generators, or from a certain class of functions.
Here are some examples:

.. doctest::

    >>> from_number = larray(20.0)
    >>> from_list = larray([0, 1, 1, 2, 3, 5, 8])
    >>> import numpy as np
    >>> from_array = larray(np.arange(6).reshape((2, 3)))
    >>> from_iter = larray(iter(range(8)))
    >>> from_gen = larray((x**2 + 2*x + 3 for x in range(5)))
    
To create a lazy array from a function or other callable, the function must
accept one or more integers as arguments (depending on the dimensionality of
the array) and return a single number.

.. doctest::

    >>> def f(i, j):
    ...     return i*np.sin(np.pi*j/100)
    >>> from_func = larray(f)

Specifying array shape
----------------------

Where the :class:`larray` is created from something that does not already have
a known shape (i.e. from something that is not a list or array), it is possible
to specify the shape of the array at the time of construction:

.. doctest::

    >>> from_func2 = larray(lambda i: 2*i, shape=(6,))
    >>> print(from_func2.shape)
    (6,)

For sequences, the shape is introspected:

.. doctest::

    >>> from_list.shape
    (7,)
    >>> from_array.shape
    (2, 3)

Otherwise, the :attr:`shape` attribute is set to ``None``, and must be set later
before the array can be evaluated.

.. doctest::

    >>> print(from_number.shape)
    None
    >>> print(from_iter.shape)
    None
    >>> print(from_gen.shape)
    None
    >>> print(from_func.shape)
    None


Evaluating a lazy array
=======================

The simplest way to evaluate a lazy array is with the :meth:`evaluate` method,
which returns a NumPy array:

.. doctest::

    >>> from_list.evaluate()
    array([0, 1, 1, 2, 3, 5, 8])
    >>> from_array.evaluate()
    array([[0, 1, 2],
           [3, 4, 5]])
    >>> from_number.evaluate()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/Users/andrew/dev/lazyarray/lazyarray.py", line 35, in wrapped_meth
        raise ValueError("Shape of larray not specified")
    ValueError: Shape of larray not specified
    >>> from_number.shape = (2, 2)
    >>> from_number.evaluate()
    array([[ 20.,  20.],
           [ 20.,  20.]])

Note that an :class:`larray` can only be evaluated once its shape has been
defined. Note also that a lazy array created from a single number evaluates to
a homogeneous array containing that number. To obtain just the value, use the
``simplify`` argument:

.. doctest::

    >>> from_number.evaluate(simplify=True)
    20.0

Evaluating a lazy array created from an iterator or generator fills the array
in row-first order. The number of values generated by the iterator must fit
within the array shape:

.. doctest::

    >>> from_iter.shape = (2, 4)
    >>> from_iter.evaluate()
    array([[ 0.,  1.,  2.,  3.],
           [ 4.,  5.,  6.,  7.]])    
    >>> from_gen.shape = (5,)
    >>> from_gen.evaluate()
    array([  3.,   6.,  11.,  18.,  27.])

If it does not, an exception is raised:

.. doctest::

    >>> from_iter.shape = (7,)
    >>> from_iter.evaluate()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
        from_iter.evaluate()
      File "/Users/andrew/dev/lazyarray/lazyarray.py", line 36, in wrapped_meth
        return meth(self, *args, **kwargs)
      File "/Users/andrew/dev/lazyarray/lazyarray.py", line 235, in evaluate
        x = x.reshape(self.shape)
    ValueError: total size of new array must be unchanged
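
The row-first fill and the size check can be reproduced with plain NumPy. The
sketch below only illustrates the behaviour described above; it is not
lazyarray's implementation, and ``fill_from_iterator`` is a hypothetical helper
introduced for this example.

```python
import numpy as np


def fill_from_iterator(iterator, shape):
    """Hypothetical helper: consume an iterator into an array of `shape`."""
    # Collect every value into a flat float array, then reshape it in
    # row-first (C) order. reshape raises ValueError on a size mismatch.
    flat = np.fromiter(iterator, dtype=float)
    return flat.reshape(shape)


filled = fill_from_iterator(iter(range(8)), (2, 4))
print(filled)

try:
    # Eight values cannot fill an array of seven elements.
    fill_from_iterator(iter(range(8)), (7,))
    mismatch_detected = False
except ValueError as err:
    mismatch_detected = True
    print("ValueError:", err)
```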
    
When evaluating a lazy array created from a callable, the function is called
with the indices of each element of the array:
    
.. doctest::

    >>> from_func.shape = (3, 4)
    >>> from_func.evaluate()
    array([[ 0.        ,  0.        ,  0.        ,  0.        ],
           [ 0.        ,  0.03141076,  0.06279052,  0.09410831],
           [ 0.        ,  0.06282152,  0.12558104,  0.18821663]])
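
For comparison, the same values can be produced with NumPy's own
``np.fromfunction``, which also passes index arrays to the function. This is
an independent cross-check of the result above, under the assumption that
``f`` is defined as earlier in this tutorial; it says nothing about how
lazyarray calls the function internally.

```python
import numpy as np


def f(i, j):
    return i * np.sin(np.pi * j / 100)


# np.fromfunction passes arrays of row and column indices to f, so that
# element [i, j] of the result equals f(i, j).
expected = np.fromfunction(f, (3, 4))
print(expected)
```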


It is also possible to evaluate only parts of an array. This is explained below.


Performing operations on a lazy array
=====================================

Just as with a normal NumPy array, it is possible to perform elementwise
arithmetic operations:

.. doctest::

    >>> a = from_list + 2
    >>> b = 2*a
    >>> print(type(b))
    <class 'lazyarray.larray'>

These operations are not carried out immediately, however. Instead, they are
queued up to be carried out later, which can save considerable time and memory
if the evaluation step turns out not to be needed, or if only part of the
array needs to be evaluated.

.. doctest::

    >>> b.evaluate()
    array([ 4,  6,  6,  8, 10, 14, 20])
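
To make the idea of queued operations concrete, here is a toy sketch of
deferred evaluation. ``DeferredArray`` is a hypothetical class written for
this illustration only. It is far simpler than :class:`larray` and should not
be read as a description of lazyarray's internals.

```python
import numpy as np


class DeferredArray:
    """Hypothetical toy class: records operations and applies them on demand."""

    def __init__(self, base, operations=()):
        self.base = np.asarray(base)
        self.operations = tuple(operations)  # queued functions, oldest first

    def _queue(self, func):
        # Return a new object so that the original queue is left untouched.
        return DeferredArray(self.base, self.operations + (func,))

    def __add__(self, other):
        return self._queue(lambda x: x + other)

    def __rmul__(self, other):
        return self._queue(lambda x: other * x)

    def evaluate(self):
        # Only now are the queued operations actually carried out.
        result = self.base
        for func in self.operations:
            result = func(result)
        return result


a = DeferredArray([0, 1, 1, 2, 3, 5, 8]) + 2
b = 2 * a
print(len(b.operations))  # two operations are queued, nothing computed yet
print(b.evaluate())
```

Nothing is computed until ``evaluate()`` is called, so an array that is never
evaluated costs only the bookkeeping of its queue.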
    
Some more examples:

.. doctest::

    >>> a = 1.0/(from_list + 1)
    >>> a.evaluate()
    array([ 1.        ,  0.5       ,  0.5       ,  0.33333333,  0.25      ,
            0.16666667,  0.11111111])
    >>> (from_list < 2).evaluate()
    array([ True,  True,  True, False, False, False, False], dtype=bool)
    >>> (from_list**2).evaluate()
    array([ 0,  1,  1,  4,  9, 25, 64])
    >>> x = from_list
    >>> (x**2 - 2*x + 5).evaluate()
    array([ 5,  4,  4,  5,  8, 20, 53])
    
NumPy ufuncs cannot be used directly with lazy arrays, as NumPy does not know
what to do with :class:`larray` objects. The lazyarray module therefore provides
lazy array-compatible versions of a subset of the NumPy ufuncs, e.g.:

.. doctest::

    >>> from lazyarray import sqrt
    >>> sqrt(from_list).evaluate()
    array([ 0.        ,  1.        ,  1.        ,  1.41421356,  1.73205081,
            2.23606798,  2.82842712])

Any other function that operates on a NumPy array can be applied to a lazy
array using the :meth:`apply()` method:

.. doctest::

    >>> def g(x):
    ...    return x**2 - 2*x + 5
    >>> from_list.apply(g)
    >>> from_list.evaluate()
    array([ 5,  4,  4,  5,  8, 20, 53])


Partial evaluation
==================

When accessing a single element of an array, only that element is evaluated,
where possible, not the whole array:

.. doctest::

    >>> x = larray(lambda i,j: 2*i + 3*j, shape=(4, 5))
    >>> x[3, 2]
    12
    >>> y = larray(lambda i: i*(2-i), shape=(6,))
    >>> y[4]
    -8

The same is true for accessing individual rows or columns:

.. doctest::

    >>> x[1]
    array([ 2,  5,  8, 11, 14])
    >>> x[:, 4]
    array([12, 14, 16, 18])
    >>> x[:, (0, 4)]
    array([[ 0, 12],
           [ 2, 14],
           [ 4, 16],
           [ 6, 18]])
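
One way to picture partial evaluation of a function-based array is that the
function is called only with the indices that were requested. The sketch below
uses plain NumPy and the same function as ``x`` above. It is an assumption
about the general idea, not a description of how lazyarray is implemented.

```python
import numpy as np


def func(i, j):
    return 2 * i + 3 * j


# Element [3, 2]: a single call with scalar indices.
element = func(3, 2)

# Row 1: one call covering only the five column indices of that row.
row = func(1, np.arange(5))

# Column 4: one call covering only the four row indices of that column.
column = func(np.arange(4), 4)

print(element, row, column)
```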


Creating lazy arrays from SciPy sparse matrices
===============================================

Lazy arrays may also be created from SciPy sparse matrices. SciPy provides
seven sparse matrix classes:

- ``csc_matrix(arg1[, shape, dtype, copy])``: Compressed Sparse Column matrix
- ``csr_matrix(arg1[, shape, dtype, copy])``: Compressed Sparse Row matrix
- ``bsr_matrix(arg1[, shape, dtype, copy, blocksize])``: Block Sparse Row matrix
- ``lil_matrix(arg1[, shape, dtype, copy])``: Row-based linked list sparse matrix
- ``dok_matrix(arg1[, shape, dtype, copy])``: Dictionary Of Keys based sparse matrix
- ``coo_matrix(arg1[, shape, dtype, copy])``: Sparse matrix in COOrdinate format
- ``dia_matrix(arg1[, shape, dtype, copy])``: Sparse matrix with DIAgonal storage

The following examples show how to use them.


Creating sparse matrices
------------------------

Sparse matrices for numerical data are provided by the SciPy package.
To use them, first import the necessary libraries:

.. doctest::

    >>> import numpy as np
    >>> from lazyarray import larray
    >>> from scipy.sparse import bsr_matrix, coo_matrix, csc_matrix, csr_matrix, dia_matrix, dok_matrix, lil_matrix

A sparse matrix can be built from three arrays of equal length that give the
row index, the column index and the value of each stored element. For example:

.. doctest::

    >>> row = np.array([0, 2, 2, 0, 1, 2])
    >>> col = np.array([0, 0, 1, 2, 2, 2])
    >>> data = np.array([1, 2, 3, 4, 5, 6])
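
To see what these three arrays describe, the equivalent dense array can be
built by hand with plain NumPy. This is only an illustration of the coordinate
layout. It matches what SciPy produces only when no ``(row, col)`` pair is
repeated, which is the case here.

```python
import numpy as np

row = np.array([0, 2, 2, 0, 1, 2])
col = np.array([0, 0, 1, 2, 2, 2])
data = np.array([1, 2, 3, 4, 5, 6])

# Entry k is stored at position (row[k], col[k]) with value data[k];
# every position that is not listed stays at zero.
dense = np.zeros((3, 3), dtype=int)
dense[row, col] = data
print(dense)
# [[1 0 4]
#  [0 0 5]
#  [2 3 6]]
```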

The seven sparse matrix classes are not all constructed in the same way.

The ``bsr_matrix``, ``coo_matrix``, ``csc_matrix`` and ``csr_matrix`` are defined as follows:

.. doctest::

    >>> sparr = bsr_matrix((data, (row, col)), shape=(3, 3))
    >>> sparr = coo_matrix((data, (row, col)), shape=(3, 3))
    >>> sparr = csc_matrix((data, (row, col)), shape=(3, 3))
    >>> sparr = csr_matrix((data, (row, col)), shape=(3, 3))

For the ``dia_matrix``:

.. doctest::

    >>> data_dia = np.array([[1, 2, 3, 4]]).repeat(3, axis=0)
    >>> offsets = np.array([0, -1, 2])
    >>> sparr = dia_matrix((data_dia, offsets), shape=(4, 4))
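
The diagonal layout is less obvious than the coordinate layout, so the
following plain-NumPy sketch rebuilds the dense equivalent of the matrix above
by hand. It assumes the usual DIA convention that ``data_dia[d, j]`` is the
entry in column ``j`` of the diagonal with offset ``offsets[d]``. The loop is
for illustration only and is not how SciPy stores or converts the matrix.

```python
import numpy as np

data_dia = np.array([[1, 2, 3, 4]]).repeat(3, axis=0)
offsets = np.array([0, -1, 2])

dense = np.zeros((4, 4), dtype=int)
for d, k in enumerate(offsets):
    for j in range(4):
        i = j - k  # row of the entry in column j on the diagonal with offset k
        if 0 <= i < 4:  # entries that fall outside the matrix are ignored
            dense[i, j] = data_dia[d, j]
print(dense)
# [[1 0 3 0]
#  [1 2 0 4]
#  [0 2 3 0]
#  [0 0 3 4]]
```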

For the ``dok_matrix``:

.. doctest::

    >>> sparr = dok_matrix(((row, col)), shape=(3, 3))

For the ``lil_matrix``:

.. doctest::

    >>> sparr = lil_matrix(data, shape=(3, 3)) 

In the rest of this tutorial, ``sparr`` refers to the ``csc_matrix`` defined above.

A sparse matrix can be converted to a NumPy array as follows:

.. doctest::

    >>> sparr = csc_matrix((data, (row, col)), shape=(3, 3))
    >>> sparr.toarray()
    array([[1, 0, 4],
           [0, 0, 5],
           [2, 3, 6]])


Specifying the shape and the type of a sparse matrix
----------------------------------------------------

The shape and data type of the sparse matrix are available from the
:attr:`shape` and :attr:`dtype` attributes of the lazy array:

.. doctest::

    >>> larr = larray(sparr)
    >>> print(larr.shape)
    (3, 3)
    >>> larr.dtype
    dtype('int64')


Evaluating a sparse matrix
--------------------------

A lazy array created from a sparse matrix is evaluated with the
:meth:`evaluate` method, which returns a NumPy array:

.. doctest::

    >>> larr.evaluate()
    array([[1, 0, 4],
           [0, 0, 5],
           [2, 3, 6]])

When creating a sparse matrix, some elements may be left empty.
For this case, the :meth:`evaluate` method accepts an ``empty_val`` argument.
Passing the NumPy special value ``nan`` (Not a Number) fills the empty elements
with ``nan``:

.. doctest::

    >>> larr.evaluate(empty_val=np.nan)
    array([[ 1., nan,  4.],
           [nan, nan,  5.],
           [ 2.,  3.,  6.]])
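
The effect of filling empty elements can be imitated on a dense array with
plain NumPy. This sketch treats every zero as empty, which is an assumption
made for illustration. It does not claim to reproduce how lazyarray decides
which elements are empty.

```python
import numpy as np

dense = np.array([[1, 0, 4],
                  [0, 0, 5],
                  [2, 3, 6]])

# Replace zero entries with nan. The result is promoted to a floating-point
# array, because nan cannot be stored in an integer array.
filled = np.where(dense != 0, dense, np.nan)
print(filled.dtype)
print(filled)
```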


Accessing individual rows or columns of a sparse matrix
-------------------------------------------------------

Specific parts of the matrix, such as individual rows or columns, are accessed by indexing:

.. doctest::

    >>> larr[2, :]

This returns the third row of the sparse matrix.
The details depend on the sparse matrix format used.

For ``csc_matrix`` and ``csr_matrix``:

.. doctest::

    >>> larr[2, :]
    array([2, 3, 6])

The ``bsr_matrix``, ``coo_matrix`` and ``dia_matrix`` formats do not support indexing.
The solution is to convert them to another format that does,
such as ``csr_matrix``, before indexing:

.. doctest::

    >>> print(sparr.tocsr()[2,:])
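
As a self-contained illustration of this conversion, the sketch below builds a
``coo_matrix`` from the arrays used earlier, converts it to CSR, and extracts
the third row as a flat NumPy array. It uses SciPy directly rather than
:class:`larray`, and the variable names are chosen for this example only.

```python
import numpy as np
from scipy.sparse import coo_matrix

row = np.array([0, 2, 2, 0, 1, 2])
col = np.array([0, 0, 1, 2, 2, 2])
data = np.array([1, 2, 3, 4, 5, 6])

coo = coo_matrix((data, (row, col)), shape=(3, 3))

# Convert to CSR, which supports row slicing, then turn the resulting
# 1 x 3 sparse row into a flat dense array.
third_row = coo.tocsr()[2, :].toarray().ravel()
print(third_row)
# [2 3 6]
```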

For the ``dok_matrix`` as defined above:

.. doctest::

    >>> print (larr[1, :])

And for the ``lil_matrix``:

.. doctest::

    >>> print (larr[0, :])

To access a column, proceed in the same way but index the second axis.
The following example accesses the third column of the sparse matrix:

.. doctest::

    >>> larr[:, 2]

Finally, to obtain information about the underlying sparse matrix:

.. doctest::

    >>> larr.base_value
    <3x3 sparse matrix of type '<class 'numpy.int64'>'
    	with 6 stored elements in Compressed Sparse Column format>