Internals
#########

This document describes some internals of the Pythran compiler.

Pythran pass management is used throughout the document::

    >>> from pythran import passmanager, analyses, optimizations, backend
    >>> pm = passmanager.PassManager('dummy')

To retrieve the source code of a function definition, the ``inspect`` module
is used::

    >>> from inspect import getsource

And to turn source code into an AST (Abstract Syntax Tree), Pythran relies on
the ``gast`` module, which mirrors the interface of Python's ``ast`` module::

    >>> import gast as ast
    >>> getast = lambda f: ast.parse(getsource(f))
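
As an illustration that does not depend on Pythran, the same parsing step can
be sketched with the standard library ``ast`` module. The source is given as a
string so that the snippet is self-contained, and ``summarize`` is a
hypothetical helper, not part of Pythran. It shows why the examples below use
``foo_tree.body[0]`` to reach the function definition:

```python
import ast

SOURCE = '''
def foo(n):
    if n:
        a = 1
    else:
        a = 2
    return n * a
'''


def summarize(source):
    """Return the function name and the node type of each statement in its body."""
    tree = ast.parse(source)   # an ast.Module
    func = tree.body[0]        # the module body holds the FunctionDef
    return func.name, [type(stmt).__name__ for stmt in func.body]


name, kinds = summarize(SOURCE)
print(name, kinds)
```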

Scoping
-------

There are only two scopes in Python: ``globals()`` and ``locals()``. When
generating C++ code, Pythran tries its best not to declare variables at the
function level, but in the deepest possible scope. This provides two benefits:

1. It makes writing OpenMP clauses easier, as local variables are automatically
   marked as private;
2. It avoids building variables with the default constructor and then assigning
   them a value.

Let's illustrate this with two simple examples. In the following function,
variable ``a`` has to be declared outside of the ``if`` statement::

    >>> def foo(n):
    ...     if n:
    ...         a = 1
    ...     else:
    ...         a = 2
    ...     return n*a

When computing variable scope, one gets a dictionary binding nodes to variable names::

    >>> foo_tree = getast(foo)
    >>> scopes = pm.gather(analyses.Scope, foo_tree)

``n`` is a formal parameter, so it has function scope::

    >>> sorted(scopes[foo_tree.body[0]])
    ['a', 'n']


``a`` is used at the function scope (in the ``return`` statement), so even if
it is assigned inside an ``if``, it has function scope too.


Now let's see what happens if we add a loop to the function::

    >>> def foo(n):
    ...     s = 0
    ...     for i in __builtin__.range(n):
    ...         if i:
    ...             a = 1
    ...         else:
    ...             a = 2
    ...         s *= a
    ...     return s
    >>> foo_tree = getast(foo)
    >>> scopes = pm.gather(analyses.Scope, foo_tree)

Variable ``a`` is only used in the loop body, so one can declare it inside the
loop::

    >>> scopes[tuple(foo_tree.body[0].body[1].body)]
    {'a'}

In a similar manner, the iteration variable ``i`` gets a new value at each
iteration step, and is declared at the loop level.
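
The rule illustrated above (declare each variable in the innermost block that
encloses all its occurrences) can be approximated with the standard library
only. This is a simplified sketch, not the algorithm used by
``analyses.Scope``: ``deepest_scopes`` and ``common_prefix`` are hypothetical
helpers, blocks are identified by made-up labels such as ``for@4``, and the
loop variable is simply attached to the same label as the loop body:

```python
import ast

SOURCE = '''
def foo(n):
    s = 0
    for i in range(n):
        if i:
            a = 1
        else:
            a = 2
        s *= a
    return s
'''


def common_prefix(paths):
    """Longest prefix shared by all the given tuples."""
    prefix = paths[0]
    for path in paths[1:]:
        length = 0
        for x, y in zip(prefix, path):
            if x != y:
                break
            length += 1
        prefix = prefix[:length]
    return prefix


def deepest_scopes(source):
    """Map each assigned variable to the innermost block holding all its uses."""
    func = ast.parse(source).body[0]
    occurrences = {}   # name -> list of block paths where the name appears
    assigned = set()   # parameters and names that are stored to

    def visit(node, path):
        if isinstance(node, ast.Name):
            occurrences.setdefault(node.id, []).append(path)
            if isinstance(node.ctx, ast.Store):
                assigned.add(node.id)
        elif isinstance(node, ast.arg):
            occurrences.setdefault(node.arg, []).append(path)
            assigned.add(node.arg)
        if isinstance(node, ast.If):
            visit(node.test, path)
            for label, block in (("if", node.body), ("else", node.orelse)):
                for stmt in block:
                    visit(stmt, path + (f"{label}@{node.lineno}",))
        elif isinstance(node, ast.For):
            visit(node.iter, path)   # the iterable is evaluated outside the loop
            inner = path + (f"for@{node.lineno}",)
            visit(node.target, inner)
            for stmt in node.body:
                visit(stmt, inner)
        else:
            for child in ast.iter_child_nodes(node):
                visit(child, path)

    visit(func, ("function",))
    return {name: common_prefix(occurrences[name])[-1] for name in assigned}


scopes_sketch = deepest_scopes(SOURCE)
print(scopes_sketch)
```

Here ``a`` and ``i`` end up at the loop level while ``s`` and ``n`` stay at
the function level. Adding a ``return`` that reads ``a`` would pull it back to
the function level, as in the first example.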

OpenMP directives interact a lot with scoping. In C or C++, variables declared
inside a parallel region are automatically marked as private. Pythran emulates
this whenever possible::

    >>> def foo(n):
    ...     s = 0
    ...     "omp parallel for reduction(*:s)"
    ...     for i in __builtin__.range(n):
    ...         if i:
    ...             a = 1
    ...         else:
    ...             a = 2
    ...         s *= a
    ...     return s

Without a scoping clause, both ``i`` and ``a`` are private::

    >>> foo_tree = getast(foo)
    >>> scopes = pm.gather(analyses.Scope, foo_tree)
    >>> scopes[foo_tree.body[0].body[2]]
    {'i'}
    >>> scopes[tuple(foo_tree.body[0].body[2].body)]
    {'a'}

But if one adds a ``lastprivate`` clause, as in::

    >>> def foo(n):
    ...     s = 0
    ...     a = 0
    ...     "omp parallel for reduction(*:s) lastprivate(a)"
    ...     for i in __builtin__.range(n):
    ...         if i:
    ...             a = 1
    ...         else:
    ...             a = 2
    ...         s *= a
    ...     return s, a
    >>> foo_tree = getast(foo)

The scope information changes. Pythran first needs to understand OpenMP
directives, using a dedicated pass::

    >>> from pythran import openmp
    >>> _ = pm.apply(openmp.GatherOMPData, foo_tree)

Then let's have a look at the resulting scopes::

    >>> scopes = pm.gather(analyses.Scope, foo_tree)
    >>> list(scopes[foo_tree.body[0].body[2]])  # 3rd element: omp got parsed
    ['i']
    >>> list(scopes[foo_tree.body[0]])
    ['n']
    >>> list(scopes[foo_tree.body[0].body[0]])
    ['s']
    >>> list(scopes[foo_tree.body[0].body[1]])
    ['a']

``a`` now has function body scope, which keeps the OpenMP directive legal.

When the scope can be attached to an assignment, Pythran uses this piece of information::

    >>> def foo(n):
    ...     s = 0
    ...     "omp parallel for reduction(*:s)"
    ...     for i in __builtin__.range(n):
    ...         a = 2
    ...         s *= a
    ...     return s
    >>> foo_tree = getast(foo)
    >>> _ = pm.apply(openmp.GatherOMPData, foo_tree)
    >>> scopes = pm.gather(analyses.Scope, foo_tree)
    >>> scopes[foo_tree.body[0].body[1].body[0]] == set(['a'])
    True

Additionally, some OpenMP directives, when applied to a single statement, are
treated by Pythran as if they created a block, emulated by a dummy
conditional::

    >>> def foo(n):
    ...     "omp parallel"
    ...     "omp single"
    ...     s = 1
    ...     return s
    >>> foo_tree = getast(foo)
    >>> _ = pm.apply(openmp.GatherOMPData, foo_tree)
    >>> print(pm.dump(backend.Python, foo_tree))
    def foo(n):
        'omp parallel'
        'omp single'
        if 1:
            s = 1
        return s

However, the additional ``if`` block makes it clear that ``s`` should have
function scope, and the scope is not attached to the first assignment::

    >>> scopes = pm.gather(analyses.Scope, foo_tree)
    >>> scopes[foo_tree.body[0]] == set(['s'])
    True
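
The dummy conditional shown above can be mimicked with the standard library
``ast`` module. This is only a sketch under simplifying assumptions, not the
``openmp.GatherOMPData`` pass: ``is_directive`` and ``wrap_single`` are
hypothetical helpers, only ``omp single`` is handled, and the directive strings
are left in place instead of being attached to the statement:

```python
import ast

SOURCE = '''
def foo(n):
    "omp parallel"
    "omp single"
    s = 1
    return s
'''


def is_directive(stmt, prefix="omp"):
    """True if stmt is a bare string statement starting with the given prefix."""
    return (isinstance(stmt, ast.Expr)
            and isinstance(stmt.value, ast.Constant)
            and isinstance(stmt.value.value, str)
            and stmt.value.value.startswith(prefix))


def wrap_single(source):
    """Wrap the statement following an 'omp single' string in 'if 1:'."""
    tree = ast.parse(source)
    func = tree.body[0]
    new_body = []
    previous = None
    for stmt in func.body:
        follows_single = previous is not None and is_directive(previous, "omp single")
        if follows_single and not is_directive(stmt):
            # The conditional is always true: it only materialises a block.
            stmt = ast.If(test=ast.Constant(value=1), body=[stmt], orelse=[])
        new_body.append(stmt)
        previous = stmt
    func.body = new_body
    ast.fix_missing_locations(tree)
    return tree


wrapped = wrap_single(SOURCE)
print(ast.unparse(wrapped))   # ast.unparse requires Python 3.9 or later
```

Once ``s = 1`` lives inside a block of its own, the only place where both the
assignment and the ``return`` can see ``s`` is the function body, which is why
its scope moves there.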


Laziness
--------

The ``expression templates`` used by the internal representation of numpy
arrays enable lazy evaluation. Operations are only computed upon assignment,
which avoids intermediate array allocation and improves data locality.
The laziness analysis makes it possible to use expression templates even when
there are multiple assignments, in some cases.

Let's go through some examples.
In ``foo``, no intermediate array is created for the ``+`` and ``*``
operations and, for each element, both operations are applied at once instead
of one by one::

    >>> def foo(array):
    ...     return array * 5 + 3

It also applies to unary operations on numpy arrays.
In this example, the laziness analysis doesn't change anything, as it is a
typical case for expression templates, but people may write::

    >>> def foo(array):
    ...     a = array * 5
    ...     return a + 3

The result is the same, but there is a temporary array. This case is detected
as lazy: instead of saving the result of ``array * 5`` in ``a`` as an evaluated
``ndarray``, we save an expression template of type
``numpy_expr<operator*, ndarray, int>``.
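
The idea behind expression templates can be sketched in plain Python. The
classes below (``Lazy``, ``Array``, ``BinOp``) are hypothetical and only mimic
the principle; they are unrelated to the C++ types Pythran generates.
Arithmetic builds an expression node, and elements are only computed when
``evaluate`` is called, in a single pass and without an intermediate array:

```python
class Lazy:
    """Base class: arithmetic builds an expression node instead of computing."""

    def __add__(self, other):
        return BinOp(lambda x, y: x + y, self, other)

    def __mul__(self, other):
        return BinOp(lambda x, y: x * y, self, other)

    def evaluate(self):
        # One loop over the elements, whatever the depth of the expression.
        return [self.at(i) for i in range(len(self))]


class Array(Lazy):
    """A concrete array holding its values."""

    def __init__(self, values):
        self.values = list(values)

    def __len__(self):
        return len(self.values)

    def at(self, i):
        return self.values[i]


class BinOp(Lazy):
    """A delayed binary operation between a lazy operand and a lazy or scalar one."""

    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right

    def __len__(self):
        return len(self.left)

    def at(self, i):
        right = self.right.at(i) if isinstance(self.right, Lazy) else self.right
        return self.op(self.left.at(i), right)


array = Array([1, 2, 3])
a = array * 5                 # nothing is computed yet: a is an expression node
result = (a + 3).evaluate()   # computes x * 5 + 3 element by element
print(result)                 # [8, 13, 18]
```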

Now, have a look at the result of the laziness analysis::

    >>> foo_tree = getast(foo)
    >>> lazyness = pm.gather(analyses.LazynessAnalysis, foo_tree)

``array`` is a parameter, so even though its uses are counted, it can't be lazy. ``a`` is used only once, in the ``return`` statement::

    >>> lazyness['a']
    1

The analysis returns the number of uses of each variable.

A special case is intermediate use::

    >>> def foo(array):
    ...     a = array * 2
    ...     b = a + 2
    ...     a = array * 5
    ...     return a, b

In this case, ``b`` is only used once, but ``b`` depends on ``a`` and ``a``
changes before ``b`` is used.
As a result, ``b`` can't be lazy, so its value is ``inf``::

    >>> foo_tree = getast(foo)
    >>> lazyness = pm.gather(analyses.LazynessAnalysis, foo_tree)
    >>> sorted(lazyness.items())
    [('a', 1), ('array', 2), ('b', inf)]

Notice that a reassignment resets the counter, so even though ``a`` is used
twice, its counter is ``1``. ``inf`` also shows up for subscripted variables,
as the value has to be computed before it can be subscripted. Updated values
can't be lazy either, nor can variables used in loops. The laziness analysis
also takes aliased values into account::

    >>> def foo(array):
    ...     a = array * 2
    ...     b = a
    ...     a_ = b * 5
    ...     return a_
    >>> foo_tree = getast(foo)
    >>> lazyness = pm.gather(analyses.LazynessAnalysis, foo_tree)
    >>> sorted(lazyness.items())
    [('a', 1), ('a_', 1), ('array', 1), ('b', 1)]
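
The counting rules described in this section can be approximated for
straight-line code with the standard library. This is a toy sketch, not the
implementation of ``analyses.LazynessAnalysis``: ``count_uses`` is a
hypothetical helper that only handles single-target assignments and ``return``
statements with a value, and it ignores subscripts, updates, loops and
aliasing:

```python
import ast
from math import inf

SOURCE = '''
def foo(array):
    a = array * 2
    b = a + 2
    a = array * 5
    return a, b
'''


def count_uses(source):
    """Count reads of each variable; a reassignment resets the counter."""
    func = ast.parse(source).body[0]
    uses = {arg.arg: 0 for arg in func.args.args}
    depends = {}   # variable -> names read to build its current value
    stale = set()  # variables whose inputs were reassigned afterwards
    for stmt in func.body:
        # Both ast.Assign and ast.Return expose the expression as .value
        read = {n.id for n in ast.walk(stmt.value) if isinstance(n, ast.Name)}
        for name in read:
            # Reading a stale variable forces an eager evaluation: inf
            uses[name] = inf if name in stale else uses.get(name, 0) + 1
        if isinstance(stmt, ast.Assign):
            target = stmt.targets[0].id
            for name, deps in depends.items():
                if target in deps and name != target:
                    stale.add(name)
            depends[target] = read
            stale.discard(target)
            uses[target] = 0   # fresh value, fresh counter
    return uses


counts = count_uses(SOURCE)
print(sorted(counts.items()))   # [('a', 1), ('array', 2), ('b', inf)]
```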


Doc Strings
-----------

Pythran preserves docstrings::

    $> printf '#pythran export foo()\n\"top-level-docstring\"\n\ndef foo():\n  \"function-level-docstring\"\n  return 2' > docstrings.py
    $> pythran docstrings.py
    $> python -c 'import docstrings; print(docstrings.__doc__); print(docstrings.foo.__doc__)'
    top-level-docstring
    function-level-docstring
    
        Supported prototypes:
    
        - foo()
    $> rm -f docstrings.*
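
For reference, the two docstrings of the example module can be located with
the standard library ``ast`` module. This sketch only shows where the strings
sit in the syntax tree; it does not reproduce how Pythran forwards them to the
generated native module:

```python
import ast

SOURCE = '''#pythran export foo()
"top-level-docstring"

def foo():
  "function-level-docstring"
  return 2
'''

tree = ast.parse(SOURCE)
module_doc = ast.get_docstring(tree)
# tree.body[0] is the module docstring statement, tree.body[1] the function.
function_doc = ast.get_docstring(tree.body[1])
print(module_doc)
print(function_doc)
```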


PyPy3 support
-------------

Pythran has been reported to work well with PyPy3.6 v7.2.0. However, this setup
is not yet tested on Travis, so compilation failures may happen. Please report
them!