1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176
|
Data Dependence Analysis in GridTools
=====================================
(copied from the wiki)
A multistage stencil is a sequence of stages.
Each stage is (basically) made of a stencil operator and a list of placeholders.
Each placeholders is associated to the arguments in the param_list of the stencil operator positionally,
that is, as it would happen if the placeholders were passed to the stencil operator according to the param_list.
The param_list lists accessors, and each accessor have an extent. Each accessor has also an intent,
which represents the use the stencil operator is going to make of the data, either read-only or read-write.
So, given a stage we can associate to each placeholder an extent and an intent, simply by scanning the accessors in the param_list of the stage.
Consider the last stage in a multistage computation. This will have one (or more) output accessors in the param_list of the stencil operator,
so one (or more) placeholders in the list. This output will be computed by accessing the inputs within the extents that are specified in the corresponding accessors.
Now, let's move to the stage just before that.
Some of its inputs may be used by the last stage as inputs.
This means that those outputs must be computed in a set of points that will be consumed by the next stage.
---------------------
A very simple example
---------------------
Let us consider an example (in 1D for simplicity). Suppose we have the following concatenation of stages,
where we write at the left the outputs and on the right, with an associated extent, are the inputs:
::
a <- f0(b<-1,1>, c<0,1>)
d <- f1(b<-2,0>, c<-1,2>)
e <- f2(a<-1,2>, d<-2,2>, c<-1,1>)
We call the arguments placeholders, to be closer to the |GT| algorithm. We have 5 placeholders and we start with an initial map such as:
::
a -> <0,0>
b -> <0,0>
c -> <0,0>
d -> <0,0>
e -> <0,0>
Now we visit our sequence of stages from the last to the first.
The last computes ``e`` and then needs ``a``, ``d``, and ``c`` in different extents. We update the map as follow:
::
a -> <-1,2>
b -> <0,0>
c -> <-1,1>
d -> <-2,2>
e -> <0,0>
Next we examine ``f1``. It writes ``d``, and since ``d`` is needed in an interval ``<-2,2>`` by ``f2``,
we need to update it's inputs that need now to be read at an extent that is ``<-2,2> + <x,y>``, where ``<x,y>`` is the extent of an input and the ``+`` operation corresponds to the following operation: ``<x,y> + <u,v> = <x+u, y+v>``. If the extent needed for the inputs are smaller than the one contained in the map already, we do not update the map, since the needed values are already there. We do it by using the ``mee`` function that returns the minimum enclosing extent of two extents. So now we update the map as follow:
::
a -> <-1,2>
b -> mee(<0,0>, <-2,2>+<-2,0>) = <-4,2>
c -> mee(<-1,1>, <-2,2>+<-1,2>) = <-3,4>
d -> <-2,2>
e -> <0,0>
Now to the last stage to examine: ``f0``. It writes ``a`` and needs ``b<-1,2>`` and ``c<0,1>``. According the the map, ``a`` is needed in ``<-1,2>`` so
::
a -> <-1,2>
b -> mee(<-4,2>, <-1,2>+<-1,1>) = mee(<-4,2>, <-2,3>) = <-4,3>
c -> mee(<-3,4>, <-1,2>+<0,1>) = mee(<-3,4>, <-1,3>) = <-3,4>
d -> <-2,2>
e -> <0,0>
The fields ``a`` and ``d`` are written and the read. They are eligible to be temporary fields. ``b`` and ``c`` are inputs,
and they are needed in ``<-4,3>`` and ``<-3,4>`` respectively. So the number of "halos" around them should be appropriate to avoid access violations.
The field ``e`` is the output and it's written in a single point.
If we compute ``f2`` in a point, we need to compute ``f1`` in ``<-2,2>``, since this will produce the values needed by ``f2``.
We then need to compute ``f0`` in ``<-1,2>`` to produce the values needed for ``a``.
----------------------
A more complex example
----------------------
The next example is very similar to the previous one, but now the second stage uses ``a`` instead of ``c``.
::
a <- f0(b<-1,1>, c<0,1>)
d <- f1(b<-2,0>, a<-1,2>)
e <- f2(a<-1,2>, d<-2,2>, c<-1,1>)
The map, then, after examining ``f2`` is as before
::
a -> <-1,2>
b -> <0,0>
c -> <-1,1>
d -> <-2,2>
e -> <0,0>
When we examine ``f1``, however, the extents become:
::
a -> mee(<-1,2>, <-2,2>+<-1,2>) = <-3,4>
b -> mee(<0,0>, <-2,2>+<-2,0>) = <-4,2>
c -> <-1,1>
d -> <-2,2>
e -> <0,0>
When we move to ``f0``:
::
a -> <-3,4>
b -> mee(<-4,2>, <-3,4>+<-1,1>) = <-4,5>
c -> mee(<-1,1>, <-3,4>+<0,1>) = <-3,5>
d -> <-2,2>
e -> <0,0>
So, now, to compute ``f2`` in a point we need to compute ``f1`` in ``<-2,2>`` and ``f0`` in ``<-3,4>``. Note that, when ``f2`` access ``a`` in ``<-1, 2>``,
those values have been computed in abundance to allow the computation of ``f1``.
-------------------------
Catching bad dependencies
-------------------------
Let's consider another variation on the same example. Now ``f1`` writes into ``c``, and ``d`` becomes an input.
::
a <- f0(b<-1,1>, c<0,1>)
c <- f1(b<-2,0>, a<-1,2>)
e <- f2(a<-1,2>, d<-2,2>, c<-1,1>)
As before the map after examining ``f2`` is
::
a -> <-1,2>
b -> <0,0>
c -> <-1,1>
d -> <-2,2>
e -> <0,0>
When we analyze ``f1`` we have
::
a -> mee(<-1,2>, <-1,1>+<-1,2>) = <-2,3>
b -> mee(<0,0>, <-1,1>+<-2,0>) = <-3,1>
c -> <-1,1>
d -> <-2,2>
e -> <0,0>
And then ``f0``
::
a -> <-2,3>
b -> mee(<-3,1>, <-2,3>+<-1,1>) = <-3,4>
c -> mee(<-1,1>, <-2,3>+<0,1>) = <-2,4>
d -> <-2,2>
e -> <0,0>
But now we have a problem, if we apply the previous strateg. ``f1`` is computed to compute ``c`` in ``<-2,4>``,
the extent associated by the map. A next element of the output would then read the modified values of ``c`` and
this is probably wrong. We need to catch this as a problem.
We cannot write into a field that is accessed in an extent different than ``<0,0>`` in a previous stage.
We need a check for this, but it is not yet implemented.
|