1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93
|
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>47.3.Genetic Query Optimization (GEQO) in PostgreSQL</title>
<link rel="stylesheet" href="stylesheet.css" type="text/css">
<link rev="made" href="pgsql-docs@postgresql.org">
<meta name="generator" content="DocBook XSL Stylesheets V1.70.0">
<link rel="start" href="index.html" title="PostgreSQL 8.1.4 Documentation">
<link rel="up" href="geqo.html" title="Chapter47.Genetic Query Optimizer">
<link rel="prev" href="geqo-intro2.html" title="47.2.Genetic Algorithms">
<link rel="next" href="geqo-biblio.html" title="47.4.Further Reading">
<link rel="copyright" href="ln-legalnotice.html" title="Legal Notice">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="sect1" lang="en">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="geqo-pg-intro"></a>47.3.Genetic Query Optimization (<acronym class="acronym">GEQO</acronym>) in PostgreSQL</h2></div></div></div>
<p> The <acronym class="acronym">GEQO</acronym> module approaches the query
optimization problem as though it were the well-known traveling salesman
problem (<acronym class="acronym">TSP</acronym>).
Possible query plans are encoded as integer strings. Each string
represents the join order from one relation of the query to the next.
For example, the join tree
</p>
<pre class="literallayout"> /\
/\ 2
/\ 3
4 1</pre>
<p>
is encoded by the integer string '4-1-3-2',
which means, first join relation '4' and '1', then '3', and
then '2', where 1, 2, 3, 4 are relation IDs within the
<span class="productname">PostgreSQL</span> optimizer.
</p>
<p> Parts of the <acronym class="acronym">GEQO</acronym> module are adapted from D. Whitley's Genitor
algorithm.
</p>
<p> Specific characteristics of the <acronym class="acronym">GEQO</acronym>
implementation in <span class="productname">PostgreSQL</span>
are:
</p>
<div class="itemizedlist"><ul type="bullet" compact>
<li style="list-style-type: disc"><p> Usage of a <em class="firstterm">steady state</em> <acronym class="acronym">GA</acronym> (replacement of the least fit
individuals in a population, not whole-generational replacement)
allows fast convergence towards improved query plans. This is
essential for query handling with reasonable time;
</p></li>
<li style="list-style-type: disc"><p> Usage of <em class="firstterm">edge recombination crossover</em>
which is especially suited to keep edge losses low for the
solution of the <acronym class="acronym">TSP</acronym> by means of a
<acronym class="acronym">GA</acronym>;
</p></li>
<li style="list-style-type: disc"><p> Mutation as genetic operator is deprecated so that no repair
mechanisms are needed to generate legal <acronym class="acronym">TSP</acronym> tours.
</p></li>
</ul></div>
<p>
</p>
<p> The <acronym class="acronym">GEQO</acronym> module allows
the <span class="productname">PostgreSQL</span> query optimizer to
support large join queries effectively through
non-exhaustive search.
</p>
<div class="sect2" lang="en">
<div class="titlepage"><div><div><h3 class="title">
<a name="geqo-future"></a>47.3.1.Future Implementation Tasks for
<span class="productname">PostgreSQL</span> <acronym class="acronym">GEQO</acronym></h3></div></div></div>
<p> Work is still needed to improve the genetic algorithm parameter
settings.
In file <code class="filename">src/backend/optimizer/geqo/geqo_main.c</code>,
routines
<code class="function">gimme_pool_size</code> and <code class="function">gimme_number_generations</code>,
we have to find a compromise for the parameter settings
to satisfy two competing demands:
</p>
<div class="itemizedlist"><ul type="disc" compact>
<li><p> Optimality of the query plan
</p></li>
<li><p> Computing time
</p></li>
</ul></div>
<p>
</p>
<p> At a more basic level, it is not clear that solving query optimization
with a GA algorithm designed for TSP is appropriate. In the TSP case,
the cost associated with any substring (partial tour) is independent
of the rest of the tour, but this is certainly not true for query
optimization. Thus it is questionable whether edge recombination
crossover is the most effective mutation procedure.
</p>
</div>
</div></body>
</html>
|