1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381
|
<!DOCTYPE html>
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta name="generator" content="AsciiDoc 8.6.8">
<title>CommonArg</title>
<link rel="stylesheet" href="./asciidoc.css" type="text/css">
<link rel="stylesheet" href="./pygments.css" type="text/css">
<script type="text/javascript" src="./asciidoc.js"></script>
<script type="text/javascript">
/*<![CDATA[*/
asciidoc.install();
/*]]>*/
</script>
<link rel="stylesheet" href="./mlton.css" type="text/css"/>
</head>
<body class="article">
<div id="banner">
<div id="banner-home">
<a href="./Home">MLton 20130715</a>
</div>
</div>
<div id="header">
<h1>CommonArg</h1>
</div>
<div id="content">
<div id="preamble">
<div class="sectionbody">
<div class="paragraph"><p><a href="CommonArg">CommonArg</a> is an optimization pass for the <a href="SSA">SSA</a>
<a href="IntermediateLanguage">IntermediateLanguage</a>, invoked from <a href="SSASimplify">SSASimplify</a>.</p></div>
</div>
</div>
<div class="sect1">
<h2 id="_description">Description</h2>
<div class="sectionbody">
<div class="paragraph"><p>It optimizes instances of <span class="monospaced">Goto</span> transfers that pass the same
arguments to the same label; e.g.</p></div>
<div class="listingblock">
<div class="content monospaced">
<pre>L_1 ()
...
z1 = ?
...
L_3 (x, y, z1)
L_2 ()
...
z2 = ?
...
L_3 (x, y, z2)
L_3 (a, b, c)
...</pre>
</div></div>
<div class="paragraph"><p>This code can be simplified to:</p></div>
<div class="listingblock">
<div class="content monospaced">
<pre>L_1 ()
...
z1 = ?
...
L_3 (z1)
L_2 ()
...
z2 = ?
...
L_3 (z2)
L_3 (c)
a = x
b = y</pre>
</div></div>
<div class="paragraph"><p>which saves a number of resources: time of setting up the arguments
for the jump to <span class="monospaced">L_3</span>, space (either stack or pseudo-registers) for
the arguments of <span class="monospaced">L_3</span>, etc. It may also expose some other
optimizations, if more information is known about <span class="monospaced">x</span> or <span class="monospaced">y</span>.</p></div>
</div>
</div>
<div class="sect1">
<h2 id="_implementation">Implementation</h2>
<div class="sectionbody">
<div class="ulist"><ul>
<li>
<p>
<a href="https://github.com/MLton/mlton/blob/master/mlton/ssa/common-arg.fun"><span class="monospaced">common-arg.fun</span></a>
</p>
</li>
</ul></div>
</div>
</div>
<div class="sect1">
<h2 id="_details_and_notes">Details and Notes</h2>
<div class="sectionbody">
<div class="paragraph"><p>Three analyses were originally proposed to drive the optimization
transformation. Only the <em>Dominator Analysis</em> is currently
implemented. (Implementations of the other analyses are available in
the <a href="Sources">repository history</a>.)</p></div>
<div class="sect2">
<h3 id="_syntactic_analysis">Syntactic Analysis</h3>
<div class="paragraph"><p>The simplest analysis I could think of maintains</p></div>
<div class="listingblock">
<div class="content monospaced">
<pre>varInfo: Var.t -> Var.t option list ref</pre>
</div></div>
<div class="paragraph"><p>initialized to <span class="monospaced">[]</span>.</p></div>
<div class="ulist"><ul>
<li>
<p>
For each variable <span class="monospaced">v</span> bound in a <span class="monospaced">Statement.t</span> or in the
<span class="monospaced">Function.t</span> args, then <span class="monospaced">List.push(varInfo v, NONE)</span>.
</p>
</li>
<li>
<p>
For each <span class="monospaced">L (x1, ..., xn)</span> transfer where <span class="monospaced">(a1, ..., an)</span> are the
formals of <span class="monospaced">L</span>, then <span class="monospaced">List.push(varInfo ai, SOME xi)</span>.
</p>
</li>
<li>
<p>
For each block argument a used in an unknown context (e.g.,
arguments of blocks used as continuations, handlers, arith success,
runtime return, or case switch labels), then
<span class="monospaced">List.push(varInfo a, NONE)</span>.
</p>
</li>
</ul></div>
<div class="paragraph"><p>Now, any block argument <span class="monospaced">a</span> such that <span class="monospaced">varInfo a = xs</span>, where all of
the elements of <span class="monospaced">xs</span> are equal to <span class="monospaced">SOME x</span>, can be optimized by
setting <span class="monospaced">a = x</span> at the beginning of the block and dropping the
argument from <span class="monospaced">Goto</span> transfers.</p></div>
<div class="paragraph"><p>That takes care of the example above. We can clearly do slightly
better, by changing the transformation criteria to the following: any
block argument a such that <span class="monospaced">varInfo a = xs</span>, where all of the elements
of <span class="monospaced">xs</span> are equal to <span class="monospaced">SOME x</span> <em>or</em> are equal to <span class="monospaced">SOME a</span>, can be
optimized by setting <span class="monospaced">a = x</span> at the beginning of the block and
dropping the argument from <span class="monospaced">Goto</span> transfers. This optimizes a case
like:</p></div>
<div class="listingblock">
<div class="content monospaced">
<pre>L_1 ()
... z1 = ? ...
L_3 (x, y, z1)
L_2 ()
... z2 = ? ...
L_3(x, y, z2)
L_3 (a, b, c)
... w = ? ...
case w of
true => L_4 | false => L_5
L_4 ()
...
L_3 (a, b, w)
L_5 ()
...</pre>
</div></div>
<div class="paragraph"><p>where a common argument is passed to a loop (and is invariant through
the loop). Of course, the <a href="LoopInvariant">LoopInvariant</a> optimization pass would
normally introduce a local loop and essentially reduce this to the
first example, but I have seen this in practice, which suggests that
some optimizations after <a href="LoopInvariant">LoopInvariant</a> do enough simplifications
to introduce (new) loop invariant arguments.</p></div>
</div>
<div class="sect2">
<h3 id="_fixpoint_analysis">Fixpoint Analysis</h3>
<div class="paragraph"><p>However, the above analysis and transformation doesn’t cover the cases
where eliminating one common argument exposes the opportunity to
eliminate other common arguments. For example:</p></div>
<div class="listingblock">
<div class="content monospaced">
<pre>L_1 ()
...
L_3 (x)
L_2 ()
...
L_3 (x)
L_3 (a)
...
L_5 (a)
L_4 ()
...
L_5 (x)
L_5 (b)
...</pre>
</div></div>
<div class="paragraph"><p>One pass of analysis and transformation would eliminate the argument
to <span class="monospaced">L_3</span> and rewrite the <span class="monospaced">L_5(a)</span> transfer to <span class="monospaced">L_5 (x)</span>, thereby
exposing the opportunity to eliminate the common argument to <span class="monospaced">L_5</span>.</p></div>
<div class="paragraph"><p>The interdependency the arguments to <span class="monospaced">L_3</span> and <span class="monospaced">L_5</span> suggest
performing some sort of fixed-point analysis. This analysis is
relatively simple; maintain</p></div>
<div class="listingblock">
<div class="content monospaced">
<pre>varInfo: Var.t -> VarLattice.t</pre>
</div></div>
<div class="paragraph"><p>where</p></div>
<div class="listingblock">
<div class="content monospaced">
<pre>VarLattice.t ~=~ Bot | Point of Var.t | Top</pre>
</div></div>
<div class="paragraph"><p>(but is implemented by the <a href="FlatLattice">FlatLattice</a> functor with a <span class="monospaced">lessThan</span>
list and <span class="monospaced">value ref</span> under the hood), initialized to <span class="monospaced">Bot</span>.</p></div>
<div class="ulist"><ul>
<li>
<p>
For each variable <span class="monospaced">v</span> bound in a <span class="monospaced">Statement.t</span> or in the
<span class="monospaced">Function.t</span> args, then <span class="monospaced">VarLattice.<= (Point v, varInfo v)</span>
</p>
</li>
<li>
<p>
For each <span class="monospaced">L (x1, ..., xn)</span> transfer where <span class="monospaced">(a1, ..., an)</span> are the
formals of <span class="monospaced">L</span>}, then <span class="monospaced">VarLattice.<= (varInfo xi, varInfo ai)</span>.
</p>
</li>
<li>
<p>
For each block argument a used in an unknown context, then
<span class="monospaced">VarLattice.<= (Point a, varInfo a)</span>.
</p>
</li>
</ul></div>
<div class="paragraph"><p>Now, any block argument a such that <span class="monospaced">varInfo a = Point x</span> can be
optimized by setting <span class="monospaced">a = x</span> at the beginning of the block and
dropping the argument from <span class="monospaced">Goto</span> transfers.</p></div>
<div class="paragraph"><p>Now, with the last example, we introduce the ordering constraints:</p></div>
<div class="listingblock">
<div class="content monospaced">
<pre>varInfo x <= varInfo a
varInfo a <= varInfo b
varInfo x <= varInfo b</pre>
</div></div>
<div class="paragraph"><p>Assuming that <span class="monospaced">varInfo x = Point x</span>, then we get <span class="monospaced">varInfo a = Point x</span>
and <span class="monospaced">varInfo b = Point x</span>, and we optimize the example as desired.</p></div>
<div class="paragraph"><p>But, that is a rather weak assumption. It’s quite possible for
<span class="monospaced">varInfo x = Top</span>. For example, consider:</p></div>
<div class="listingblock">
<div class="content monospaced">
<pre>G_1 ()
... n = 1 ...
L_0 (n)
G_2 ()
... m = 2 ...
L_0 (m)
L_0 (x)
...
L_1 ()
...
L_3 (x)
L_2 ()
...
L_3 (x)
L_3 (a)
...
L_5(a)
L_4 ()
...
L_5(x)
L_5 (b)
...</pre>
</div></div>
<div class="paragraph"><p>Now <span class="monospaced">varInfo x = varInfo a = varInfo b = Top</span>. What went wrong here?
When <span class="monospaced">varInfo x</span> went to <span class="monospaced">Top</span>, it got propagated all the way through
to <span class="monospaced">a</span> and <span class="monospaced">b</span>, and prevented the elimination of any common arguments.
What we’d like to do instead is when <span class="monospaced">varInfo x</span> goes to <span class="monospaced">Top</span>,
propagate on <span class="monospaced">Point x</span> — we have no hope of eliminating <span class="monospaced">x</span>, but if
we hold <span class="monospaced">x</span> constant, then we have a chance of eliminating arguments
for which <span class="monospaced">x</span> is passed as an actual.</p></div>
</div>
<div class="sect2">
<h3 id="_dominator_analysis">Dominator Analysis</h3>
<div class="paragraph"><p>Does anyone see where this is going yet? Pausing for a little
thought, <a href="MatthewFluet">MatthewFluet</a> realized that he had once before tried
proposing this kind of "fix" to a fixed-point analysis — when we were
first investigating the <a href="Contify">Contify</a> optimization in light of John
Reppy’s CWS paper. Of course, that "fix" failed because it defined a
non-monotonic function and one couldn’t take the fixed point. But,
<a href="StephenWeeks">StephenWeeks</a> suggested a dominator based approach, and we were
able to show that, indeed, the dominator analysis subsumed both the
previous call based analysis and the cont based analysis. And, a
moment’s reflection reveals further parallels: when
<span class="monospaced">varInfo: Var.t -> Var.t option list ref</span>, we have something analogous
to the call analysis, and when <span class="monospaced">varInfo: Var.t -> VarLattice.t</span>, we
have something analogous to the cont analysis. Maybe there is
something analogous to the dominator approach (and therefore superior
to the previous analyses).</p></div>
<div class="paragraph"><p>And this turns out to be the case. Construct the graph <span class="monospaced">G</span> as follows:</p></div>
<div class="listingblock">
<div class="content monospaced">
<pre>nodes(G) = {Root} U Var.t
edges(G) = {Root -> v | v bound in a Statement.t or
in the Function.t args} U
{xi -> ai | L(x1, ..., xn) transfer where (a1, ..., an)
are the formals of L} U
{Root -> a | a is a block argument used in an unknown context}</pre>
</div></div>
<div class="paragraph"><p>Let <span class="monospaced">idom(x)</span> be the immediate dominator of <span class="monospaced">x</span> in <span class="monospaced">G</span> with root
<span class="monospaced">Root</span>. Now, any block argument a such that <span class="monospaced">idom(a) = x <> Root</span> can
be optimized by setting <span class="monospaced">a = x</span> at the beginning of the block and
dropping the argument from <span class="monospaced">Goto</span> transfers.</p></div>
<div class="paragraph"><p>Furthermore, experimental evidence suggests (and we are confident that
a formal presentation could prove) that the dominator analysis
subsumes the "syntactic" and "fixpoint" based analyses in this context
as well and that the dominator analysis gets "everything" in one go.</p></div>
</div>
<div class="sect2">
<h3 id="_final_thoughts">Final Thoughts</h3>
<div class="paragraph"><p>I must admit, I was rather surprised at this progression and final
result. At the outset, I never would have thought of a connection
between <a href="Contify">Contify</a> and <a href="CommonArg">CommonArg</a> optimizations. They would seem
to be two completely different optimizations. Although, this may not
really be the case. As one of the reviewers of the ICFP paper said:</p></div>
<div class="quoteblock">
<div class="content">
<div class="paragraph"><p>I understand that such a form of CPS might be convenient in some
cases, but when we’re talking about analyzing code to detect that some
continuation is constant, I think it makes a lot more sense to make
all the continuation arguments completely explicit.</p></div>
<div class="paragraph"><p>I believe that making all the continuation arguments explicit will
show that the optimization can be generalized to eliminating constant
arguments, whether continuations or not.</p></div>
</div>
<div class="attribution">
</div></div>
<div class="paragraph"><p>What I think the common argument optimization shows is that the
dominator analysis does slightly better than the reviewer puts it: we
find more than just constant continuations, we find common
continuations. And I think this is further justified by the fact that
I have observed common argument eliminate some <span class="monospaced">env_X</span> arguments which
would appear to correspond to determining that while the closure being
executed isn’t constant it is at least the same as the closure being
passed elsewhere.</p></div>
<div class="paragraph"><p>At first, I was curious whether or not we had missed a bigger picture
with the dominator analysis. When we wrote the contification paper, I
assumed that the dominator analysis was a specialized solution to a
specialized problem; we never suggested that it was a technique suited
to a larger class of analyses. After initially finding a connection
between <a href="Contify">Contify</a> and <a href="CommonArg">CommonArg</a> (and thinking that the only
connection was the technique), I wondered if the dominator technique
really was applicable to a larger class of analyses. That is still a
question, but after writing up the above, I’m suspecting that the
"real story" is that the dominator analysis is a solution to the
common argument optimization, and that the <a href="Contify">Contify</a> optimization is
specializing <a href="CommonArg">CommonArg</a> to the case of continuation arguments (with
a different transformation at the end). (Note, a whole-program,
inter-procedural common argument analysis doesn’t really make sense
(in our <a href="SSA">SSA</a> <a href="IntermediateLanguage">IntermediateLanguage</a>), because the only way of
passing values between functions is as arguments. (Unless of course
in the case that the common argument is also a constant argument, in
which case <a href="ConstantPropagation">ConstantPropagation</a> could lift it to a global.) The
inter-procedural <a href="Contify">Contify</a> optimization works out because there we
move the function to the argument.)</p></div>
<div class="paragraph"><p>Anyways, it’s still unclear to me whether or not the dominator based
approach solves other kinds of problems.</p></div>
</div>
<div class="sect2">
<h3 id="_phase_ordering">Phase Ordering</h3>
<div class="paragraph"><p>On the downside, the optimization doesn’t have a huge impact on
runtime, although it does predictably saved some code size. I stuck
it in the optimization sequence after <a href="Flatten">Flatten</a> and (the third round
of) <a href="LocalFlatten">LocalFlatten</a>, since it seems to me that we could have cases
where some components of a tuple used as an argument are common, but
the whole tuple isn’t. I think it makes sense to add it after
<a href="IntroduceLoops">IntroduceLoops</a> and <a href="LoopInvariant">LoopInvariant</a> (even though <a href="CommonArg">CommonArg</a>
get some things that <a href="LoopInvariant">LoopInvariant</a> gets, it doesn’t get all of
them). I also think that it makes sense to add it before
<a href="CommonSubexp">CommonSubexp</a>, since identifying variables could expose more common
subexpressions. I would think a similar thought applies to
<a href="RedundantTests">RedundantTests</a>.</p></div>
</div>
</div>
</div>
</div>
<div id="footnotes"><hr></div>
<div id="footer">
<div id="footer-text">
</div>
<div id="footer-badges">
</div>
</div>
</body>
</html>
|