File: CommonArg

package info (click to toggle)
mlton 20130715-3
  • links: PTS
  • area: main
  • in suites: stretch
  • size: 60,900 kB
  • ctags: 69,386
  • sloc: xml: 34,418; ansic: 17,399; lisp: 2,879; makefile: 1,605; sh: 1,254; pascal: 256; python: 143; asm: 97
file content (381 lines) | stat: -rw-r--r-- 17,882 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
<!DOCTYPE html>
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta name="generator" content="AsciiDoc 8.6.8">
<title>CommonArg</title>
<link rel="stylesheet" href="./asciidoc.css" type="text/css">
<link rel="stylesheet" href="./pygments.css" type="text/css">


<script type="text/javascript" src="./asciidoc.js"></script>
<script type="text/javascript">
/*<![CDATA[*/
asciidoc.install();
/*]]>*/
</script>
<link rel="stylesheet" href="./mlton.css" type="text/css"/>
</head>
<body class="article">
<div id="banner">
<div id="banner-home">
<a href="./Home">MLton 20130715</a>
</div>
</div>
<div id="header">
<h1>CommonArg</h1>
</div>
<div id="content">
<div id="preamble">
<div class="sectionbody">
<div class="paragraph"><p><a href="CommonArg">CommonArg</a> is an optimization pass for the <a href="SSA">SSA</a>
<a href="IntermediateLanguage">IntermediateLanguage</a>, invoked from <a href="SSASimplify">SSASimplify</a>.</p></div>
</div>
</div>
<div class="sect1">
<h2 id="_description">Description</h2>
<div class="sectionbody">
<div class="paragraph"><p>It optimizes instances of <span class="monospaced">Goto</span> transfers that pass the same
arguments to the same label; e.g.</p></div>
<div class="listingblock">
<div class="content monospaced">
<pre>L_1 ()
  ...
  z1 = ?
  ...
  L_3 (x, y, z1)
L_2 ()
  ...
  z2 = ?
  ...
  L_3 (x, y, z2)
L_3 (a, b, c)
  ...</pre>
</div></div>
<div class="paragraph"><p>This code can be simplified to:</p></div>
<div class="listingblock">
<div class="content monospaced">
<pre>L_1 ()
  ...
  z1 = ?
  ...
  L_3 (z1)
L_2 ()
  ...
  z2 = ?
  ...
  L_3 (z2)
L_3 (c)
  a = x
  b = y</pre>
</div></div>
<div class="paragraph"><p>which saves a number of resources: time of setting up the arguments
for the jump to <span class="monospaced">L_3</span>, space (either stack or pseudo-registers) for
the arguments of <span class="monospaced">L_3</span>, etc.  It may also expose some other
optimizations, if more information is known about <span class="monospaced">x</span> or <span class="monospaced">y</span>.</p></div>
</div>
</div>
<div class="sect1">
<h2 id="_implementation">Implementation</h2>
<div class="sectionbody">
<div class="ulist"><ul>
<li>
<p>
<a href="https://github.com/MLton/mlton/blob/master/mlton/ssa/common-arg.fun"><span class="monospaced">common-arg.fun</span></a>
</p>
</li>
</ul></div>
</div>
</div>
<div class="sect1">
<h2 id="_details_and_notes">Details and Notes</h2>
<div class="sectionbody">
<div class="paragraph"><p>Three analyses were originally proposed to drive the optimization
transformation.  Only the <em>Dominator Analysis</em> is currently
implemented.  (Implementations of the other analyses are available in
the <a href="Sources">repository history</a>.)</p></div>
<div class="sect2">
<h3 id="_syntactic_analysis">Syntactic Analysis</h3>
<div class="paragraph"><p>The simplest analysis I could think of maintains</p></div>
<div class="listingblock">
<div class="content monospaced">
<pre>varInfo: Var.t -&gt; Var.t option list ref</pre>
</div></div>
<div class="paragraph"><p>initialized to <span class="monospaced">[]</span>.</p></div>
<div class="ulist"><ul>
<li>
<p>
For each variable <span class="monospaced">v</span> bound in a <span class="monospaced">Statement.t</span> or in the
<span class="monospaced">Function.t</span> args, then <span class="monospaced">List.push(varInfo v, NONE)</span>.
</p>
</li>
<li>
<p>
For each <span class="monospaced">L (x1, ..., xn)</span> transfer where <span class="monospaced">(a1, ..., an)</span> are the
formals of <span class="monospaced">L</span>, then <span class="monospaced">List.push(varInfo ai, SOME xi)</span>.
</p>
</li>
<li>
<p>
For each block argument a used in an unknown context (e.g.,
arguments of blocks used as continuations, handlers, arith success,
runtime return, or case switch labels), then
<span class="monospaced">List.push(varInfo a, NONE)</span>.
</p>
</li>
</ul></div>
<div class="paragraph"><p>Now, any block argument <span class="monospaced">a</span> such that <span class="monospaced">varInfo a = xs</span>, where all of
the elements of <span class="monospaced">xs</span> are equal to <span class="monospaced">SOME x</span>, can be optimized by
setting <span class="monospaced">a = x</span> at the beginning of the block and dropping the
argument from <span class="monospaced">Goto</span> transfers.</p></div>
<div class="paragraph"><p>That takes care of the example above.  We can clearly do slightly
better, by changing the transformation criteria to the following: any
block argument a such that <span class="monospaced">varInfo a = xs</span>, where all of the elements
of <span class="monospaced">xs</span> are equal to <span class="monospaced">SOME x</span> <em>or</em> are equal to <span class="monospaced">SOME a</span>, can be
optimized by setting <span class="monospaced">a = x</span> at the beginning of the block and
dropping the argument from <span class="monospaced">Goto</span> transfers.  This optimizes a case
like:</p></div>
<div class="listingblock">
<div class="content monospaced">
<pre>L_1 ()
  ... z1 = ? ...
  L_3 (x, y, z1)
L_2 ()
  ... z2 = ? ...
  L_3(x, y, z2)
L_3 (a, b, c)
  ... w = ? ...
  case w of
    true =&gt; L_4 | false =&gt; L_5
L_4 ()
   ...
   L_3 (a, b, w)
L_5 ()
   ...</pre>
</div></div>
<div class="paragraph"><p>where a common argument is passed to a loop (and is invariant through
the loop).  Of course, the <a href="LoopInvariant">LoopInvariant</a> optimization pass would
normally introduce a local loop and essentially reduce this to the
first example, but I have seen this in practice, which suggests that
some optimizations after <a href="LoopInvariant">LoopInvariant</a> do enough simplifications
to introduce (new) loop invariant arguments.</p></div>
</div>
<div class="sect2">
<h3 id="_fixpoint_analysis">Fixpoint Analysis</h3>
<div class="paragraph"><p>However, the above analysis and transformation doesn&#8217;t cover the cases
where eliminating one common argument exposes the opportunity to
eliminate other common arguments.  For example:</p></div>
<div class="listingblock">
<div class="content monospaced">
<pre>L_1 ()
  ...
  L_3 (x)
L_2 ()
  ...
  L_3 (x)
L_3 (a)
  ...
  L_5 (a)
L_4 ()
  ...
  L_5 (x)
L_5 (b)
  ...</pre>
</div></div>
<div class="paragraph"><p>One pass of analysis and transformation would eliminate the argument
to <span class="monospaced">L_3</span> and rewrite the <span class="monospaced">L_5(a)</span> transfer to <span class="monospaced">L_5 (x)</span>, thereby
exposing the opportunity to eliminate the common argument to <span class="monospaced">L_5</span>.</p></div>
<div class="paragraph"><p>The interdependency the arguments to <span class="monospaced">L_3</span> and <span class="monospaced">L_5</span> suggest
performing some sort of fixed-point analysis.  This analysis is
relatively simple; maintain</p></div>
<div class="listingblock">
<div class="content monospaced">
<pre>varInfo: Var.t -&gt; VarLattice.t</pre>
</div></div>
<div class="paragraph"><p>where</p></div>
<div class="listingblock">
<div class="content monospaced">
<pre>VarLattice.t ~=~ Bot | Point of Var.t | Top</pre>
</div></div>
<div class="paragraph"><p>(but is implemented by the <a href="FlatLattice">FlatLattice</a> functor with a <span class="monospaced">lessThan</span>
list and <span class="monospaced">value ref</span> under the hood), initialized to <span class="monospaced">Bot</span>.</p></div>
<div class="ulist"><ul>
<li>
<p>
For each variable <span class="monospaced">v</span> bound in a <span class="monospaced">Statement.t</span> or in the
<span class="monospaced">Function.t</span> args, then <span class="monospaced">VarLattice.&lt;= (Point v, varInfo v)</span>
</p>
</li>
<li>
<p>
For each <span class="monospaced">L (x1, ..., xn)</span> transfer where <span class="monospaced">(a1, ..., an)</span> are the
formals of <span class="monospaced">L</span>}, then <span class="monospaced">VarLattice.&lt;= (varInfo xi, varInfo ai)</span>.
</p>
</li>
<li>
<p>
For each block argument a used in an unknown context, then
<span class="monospaced">VarLattice.&lt;= (Point a, varInfo a)</span>.
</p>
</li>
</ul></div>
<div class="paragraph"><p>Now, any block argument a such that <span class="monospaced">varInfo a = Point x</span> can be
optimized by setting <span class="monospaced">a = x</span> at the beginning of the block and
dropping the argument from <span class="monospaced">Goto</span> transfers.</p></div>
<div class="paragraph"><p>Now, with the last example, we introduce the ordering constraints:</p></div>
<div class="listingblock">
<div class="content monospaced">
<pre>varInfo x &lt;= varInfo a
varInfo a &lt;= varInfo b
varInfo x &lt;= varInfo b</pre>
</div></div>
<div class="paragraph"><p>Assuming that <span class="monospaced">varInfo x = Point x</span>, then we get <span class="monospaced">varInfo a = Point x</span>
and <span class="monospaced">varInfo b = Point x</span>, and we optimize the example as desired.</p></div>
<div class="paragraph"><p>But, that is a rather weak assumption.  It&#8217;s quite possible for
<span class="monospaced">varInfo x = Top</span>.  For example, consider:</p></div>
<div class="listingblock">
<div class="content monospaced">
<pre>G_1 ()
  ... n = 1 ...
  L_0 (n)
G_2 ()
  ... m = 2 ...
  L_0 (m)
L_0 (x)
  ...
L_1 ()
  ...
  L_3 (x)
L_2 ()
  ...
  L_3 (x)
L_3 (a)
  ...
  L_5(a)
L_4 ()
  ...
  L_5(x)
L_5 (b)
   ...</pre>
</div></div>
<div class="paragraph"><p>Now <span class="monospaced">varInfo x = varInfo a = varInfo b = Top</span>.  What went wrong here?
When <span class="monospaced">varInfo x</span> went to <span class="monospaced">Top</span>, it got propagated all the way through
to <span class="monospaced">a</span> and <span class="monospaced">b</span>, and prevented the elimination of any common arguments.
What we&#8217;d like to do instead is when <span class="monospaced">varInfo x</span> goes to <span class="monospaced">Top</span>,
propagate on <span class="monospaced">Point x</span>&#8201;&#8212;&#8201;we have no hope of eliminating <span class="monospaced">x</span>, but if
we hold <span class="monospaced">x</span> constant, then we have a chance of eliminating arguments
for which <span class="monospaced">x</span> is passed as an actual.</p></div>
</div>
<div class="sect2">
<h3 id="_dominator_analysis">Dominator Analysis</h3>
<div class="paragraph"><p>Does anyone see where this is going yet?  Pausing for a little
thought, <a href="MatthewFluet">MatthewFluet</a> realized that he had once before tried
proposing this kind of "fix" to a fixed-point analysis&#8201;&#8212;&#8201;when we were
first investigating the <a href="Contify">Contify</a> optimization in light of John
Reppy&#8217;s CWS paper.  Of course, that "fix" failed because it defined a
non-monotonic function and one couldn&#8217;t take the fixed point.  But,
<a href="StephenWeeks">StephenWeeks</a> suggested a dominator based approach, and we were
able to show that, indeed, the dominator analysis subsumed both the
previous call based analysis and the cont based analysis.  And, a
moment&#8217;s reflection reveals further parallels: when
<span class="monospaced">varInfo: Var.t -&gt; Var.t option list ref</span>, we have something analogous
to the call analysis, and when <span class="monospaced">varInfo: Var.t -&gt; VarLattice.t</span>, we
have something analogous to the cont analysis.  Maybe there is
something analogous to the dominator approach (and therefore superior
to the previous analyses).</p></div>
<div class="paragraph"><p>And this turns out to be the case.  Construct the graph <span class="monospaced">G</span> as follows:</p></div>
<div class="listingblock">
<div class="content monospaced">
<pre>nodes(G) = {Root} U Var.t
edges(G) = {Root -&gt; v | v bound in a Statement.t or
                                in the Function.t args} U
           {xi -&gt; ai | L(x1, ..., xn) transfer where (a1, ..., an)
                                      are the formals of L} U
           {Root -&gt; a | a is a block argument used in an unknown context}</pre>
</div></div>
<div class="paragraph"><p>Let <span class="monospaced">idom(x)</span> be the immediate dominator of <span class="monospaced">x</span> in <span class="monospaced">G</span> with root
<span class="monospaced">Root</span>.  Now, any block argument a such that <span class="monospaced">idom(a) = x &lt;&gt; Root</span> can
be optimized by setting <span class="monospaced">a = x</span> at the beginning of the block and
dropping the argument from <span class="monospaced">Goto</span> transfers.</p></div>
<div class="paragraph"><p>Furthermore, experimental evidence suggests (and we are confident that
a formal presentation could prove) that the dominator analysis
subsumes the "syntactic" and "fixpoint" based analyses in this context
as well and that the dominator analysis gets "everything" in one go.</p></div>
</div>
<div class="sect2">
<h3 id="_final_thoughts">Final Thoughts</h3>
<div class="paragraph"><p>I must admit, I was rather surprised at this progression and final
result.  At the outset, I never would have thought of a connection
between <a href="Contify">Contify</a> and <a href="CommonArg">CommonArg</a> optimizations.  They would seem
to be two completely different optimizations.  Although, this may not
really be the case.  As one of the reviewers of the ICFP paper said:</p></div>
<div class="quoteblock">
<div class="content">
<div class="paragraph"><p>I understand that such a form of CPS might be convenient in some
cases, but when we&#8217;re talking about analyzing code to detect that some
continuation is constant, I think it makes a lot more sense to make
all the continuation arguments completely explicit.</p></div>
<div class="paragraph"><p>I believe that making all the continuation arguments explicit will
show that the optimization can be generalized to eliminating constant
arguments, whether continuations or not.</p></div>
</div>
<div class="attribution">
</div></div>
<div class="paragraph"><p>What I think the common argument optimization shows is that the
dominator analysis does slightly better than the reviewer puts it: we
find more than just constant continuations, we find common
continuations.  And I think this is further justified by the fact that
I have observed common argument eliminate some <span class="monospaced">env_X</span> arguments which
would appear to correspond to determining that while the closure being
executed isn&#8217;t constant it is at least the same as the closure being
passed elsewhere.</p></div>
<div class="paragraph"><p>At first, I was curious whether or not we had missed a bigger picture
with the dominator analysis.  When we wrote the contification paper, I
assumed that the dominator analysis was a specialized solution to a
specialized problem; we never suggested that it was a technique suited
to a larger class of analyses.  After initially finding a connection
between <a href="Contify">Contify</a> and <a href="CommonArg">CommonArg</a> (and thinking that the only
connection was the technique), I wondered if the dominator technique
really was applicable to a larger class of analyses.  That is still a
question, but after writing up the above, I&#8217;m suspecting that the
"real story" is that the dominator analysis is a solution to the
common argument optimization, and that the <a href="Contify">Contify</a> optimization is
specializing <a href="CommonArg">CommonArg</a> to the case of continuation arguments (with
a different transformation at the end).  (Note, a whole-program,
inter-procedural common argument analysis doesn&#8217;t really make sense
(in our <a href="SSA">SSA</a> <a href="IntermediateLanguage">IntermediateLanguage</a>), because the only way of
passing values between functions is as arguments.  (Unless of course
in the case that the common argument is also a constant argument, in
which case <a href="ConstantPropagation">ConstantPropagation</a> could lift it to a global.)  The
inter-procedural <a href="Contify">Contify</a> optimization works out because there we
move the function to the argument.)</p></div>
<div class="paragraph"><p>Anyways, it&#8217;s still unclear to me whether or not the dominator based
approach solves other kinds of problems.</p></div>
</div>
<div class="sect2">
<h3 id="_phase_ordering">Phase Ordering</h3>
<div class="paragraph"><p>On the downside, the optimization doesn&#8217;t have a huge impact on
runtime, although it does predictably saved some code size.  I stuck
it in the optimization sequence after <a href="Flatten">Flatten</a> and (the third round
of) <a href="LocalFlatten">LocalFlatten</a>, since it seems to me that we could have cases
where some components of a tuple used as an argument are common, but
the whole tuple isn&#8217;t.  I think it makes sense to add it after
<a href="IntroduceLoops">IntroduceLoops</a> and <a href="LoopInvariant">LoopInvariant</a> (even though <a href="CommonArg">CommonArg</a>
get some things that <a href="LoopInvariant">LoopInvariant</a> gets, it doesn&#8217;t get all of
them).  I also think that it makes sense to add it before
<a href="CommonSubexp">CommonSubexp</a>, since identifying variables could expose more common
subexpressions.  I would think a similar thought applies to
<a href="RedundantTests">RedundantTests</a>.</p></div>
</div>
</div>
</div>
</div>
<div id="footnotes"><hr></div>
<div id="footer">
<div id="footer-text">
</div>
<div id="footer-badges">
</div>
</div>
</body>
</html>