<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>48.6. Index Cost Estimation Functions</title>
<link rel="stylesheet" href="stylesheet.css" type="text/css">
<link rev="made" href="pgsql-docs@postgresql.org">
<meta name="generator" content="DocBook XSL Stylesheets V1.70.0">
<link rel="start" href="index.html" title="PostgreSQL 8.1.4 Documentation">
<link rel="up" href="indexam.html" title="Chapter 48. Index Access Method Interface Definition">
<link rel="prev" href="index-unique-checks.html" title="48.5. Index Uniqueness Checks">
<link rel="next" href="gist.html" title="Chapter 49. GiST Indexes">
<link rel="copyright" href="ln-legalnotice.html" title="Legal Notice">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="sect1" lang="en">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="index-cost-estimation"></a>48.6. Index Cost Estimation Functions</h2></div></div></div>
<p> The amcostestimate function is given a list of WHERE clauses that have
been determined to be usable with the index. It must return estimates
of the cost of accessing the index and the selectivity of the WHERE
clauses (that is, the fraction of parent-table rows that will be
retrieved during the index scan). For simple cases, nearly all the
work of the cost estimator can be done by calling standard routines
in the optimizer; the point of having an amcostestimate function is
to allow index access methods to provide index-type-specific knowledge,
in case it is possible to improve on the standard estimates.
</p>
<p> Each amcostestimate function must have the signature:
</p>
<pre class="programlisting">void
amcostestimate (PlannerInfo *root,
                IndexOptInfo *index,
                List *indexQuals,
                Cost *indexStartupCost,
                Cost *indexTotalCost,
                Selectivity *indexSelectivity,
                double *indexCorrelation);</pre>
<p>
The first three parameters are inputs:
</p>
<div class="variablelist"><dl>
<dt><span class="term">root</span></dt>
<dd><p> The planner's information about the query being processed.
</p></dd>
<dt><span class="term">index</span></dt>
<dd><p> The index being considered.
</p></dd>
<dt><span class="term">indexQuals</span></dt>
<dd><p> List of index qual clauses (implicitly ANDed);
a NIL list indicates no qualifiers are available.
Note that the list contains expression trees, not ScanKeys.
</p></dd>
</dl></div>
<p> The last four parameters are pass-by-reference outputs:
</p>
<div class="variablelist"><dl>
<dt><span class="term">*indexStartupCost</span></dt>
<dd><p> Set to cost of index start-up processing
</p></dd>
<dt><span class="term">*indexTotalCost</span></dt>
<dd><p> Set to total cost of index processing
</p></dd>
<dt><span class="term">*indexSelectivity</span></dt>
<dd><p> Set to index selectivity
</p></dd>
<dt><span class="term">*indexCorrelation</span></dt>
<dd><p> Set to correlation coefficient between index scan order and
underlying table's order
</p></dd>
</dl></div>
<p> Note that cost estimate functions must be written in C, not in SQL or
any available procedural language, because they must access internal
data structures of the planner/optimizer.
</p>
<p> The index access costs should be computed in the units used by
<code class="filename">src/backend/optimizer/path/costsize.c</code>: a sequential
disk block fetch has cost 1.0, a nonsequential fetch has cost
<code class="varname">random_page_cost</code>, and the cost of processing one index row
should usually be taken as <code class="varname">cpu_index_tuple_cost</code>. In addition,
an appropriate multiple of <code class="varname">cpu_operator_cost</code> should be charged
for any comparison operators invoked during index processing (especially
evaluation of the indexQuals themselves).
</p>
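<p> As a worked illustration of these units, the following minimal sketch adds up
the cost of a scan from page and row counts. The helper name
<code class="function">example_index_cost()</code> and the three parameter values are
assumptions made for this example only; the real values are run-time settings and
might differ on any given installation.
</p>

```c
/*
 * Assumed parameter values, for illustration only.  In a real server
 * these are run-time settings, not compile-time constants.
 */
static const double random_page_cost = 4.0;
static const double cpu_index_tuple_cost = 0.001;
static const double cpu_operator_cost = 0.0025;

/*
 * Hypothetical helper (not part of PostgreSQL): total cost of reading
 * the given numbers of pages and processing the given number of index
 * rows, charging operators_per_row comparison operators for each row.
 */
double
example_index_cost(double seq_pages, double random_pages,
                   double index_rows, double operators_per_row)
{
    /* A sequential page fetch costs 1.0; a nonsequential one costs more. */
    double io_cost = seq_pages * 1.0 + random_pages * random_page_cost;

    /* Each index row costs its processing charge plus its operator charge. */
    double cpu_cost = index_rows *
        (cpu_index_tuple_cost + operators_per_row * cpu_operator_cost);

    return io_cost + cpu_cost;
}
```

<p> With these assumed values, 10 sequential page fetches, 2 nonsequential fetches
and 1000 index rows with one operator each come to 10 + 8 + 3.5 = 21.5 cost units.
</p>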
<p> The access costs should include all disk and CPU costs associated with
scanning the index itself, but <span class="emphasis"><em>not</em></span> the costs of retrieving or
processing the parent-table rows that are identified by the index.
</p>
<p> The “<span class="quote">start-up cost</span>” is the part of the total scan cost that must be expended
before we can begin to fetch the first row. For most indexes this can
be taken as zero, but an index type with a high start-up cost might want
to set it nonzero.
</p>
<p> The indexSelectivity should be set to the estimated fraction of the parent
table rows that will be retrieved during the index scan. In the case
of a lossy index, this will typically be higher than the fraction of
rows that actually pass the given qual conditions.
</p>
<p> The indexCorrelation should be set to the correlation (ranging between
-1.0 and 1.0) between the index order and the table order. This is used
to adjust the estimate for the cost of fetching rows from the parent
table.
</p>
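<p> One way such an adjustment can work is to interpolate between a best-case and
a worst-case table I/O estimate, weighted by the square of the correlation. The
sketch below is a simplified illustration under that assumption, not the planner's
actual code; the helper name <code class="function">example_table_io_cost()</code> and
its arguments are hypothetical.
</p>

```c
/*
 * Hypothetical helper (not part of PostgreSQL): blend a best-case I/O
 * cost (table rows fetched in physical order) with a worst-case cost
 * (rows fetched in random order).  The square of the correlation is the
 * weight, so -1.0 and 1.0 both select the best case and 0.0 selects the
 * worst case.
 */
double
example_table_io_cost(double min_io_cost, double max_io_cost,
                      double index_correlation)
{
    double csquared = index_correlation * index_correlation;

    return max_io_cost + csquared * (min_io_cost - max_io_cost);
}
```

<p> For example, with a best case of 10 and a worst case of 100, a correlation
of 0.5 gives 100 + 0.25 * (10 - 100) = 77.5.
</p>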
<div class="procedure">
<a name="id843817"></a><p class="title"><b>Cost Estimation</b></p>
<p> A typical cost estimator will proceed as follows:
</p>
<ol type="1">
<li>
<p> Estimate and return the fraction of parent-table rows that will be visited
based on the given qual conditions. In the absence of any index-type-specific
knowledge, use the standard optimizer function <code class="function">clauselist_selectivity()</code>:
</p>
<pre class="programlisting">*indexSelectivity = clauselist_selectivity(root, indexQuals,
                                           index-&gt;rel-&gt;relid, JOIN_INNER);</pre>
</li>
<li><p> Estimate the number of index rows that will be visited during the
scan. For many index types this is the same as indexSelectivity times
the number of rows in the index, but it might be more. (Note that the
index's size in pages and rows is available from the IndexOptInfo struct.)
</p></li>
<li><p> Estimate the number of index pages that will be retrieved during the scan.
This might be just indexSelectivity times the index's size in pages.
</p></li>
<li>
<p> Compute the index access cost. A generic estimator might do this:
</p>
<pre class="programlisting">    /*
     * Our generic assumption is that the index pages will be read
     * sequentially, so they have cost 1.0 each, not random_page_cost.
     * Also, we charge for evaluation of the indexquals at each index row.
     * All the costs are assumed to be paid incrementally during the scan.
     */
    cost_qual_eval(&amp;index_qual_cost, indexQuals);
    *indexStartupCost = index_qual_cost.startup;
    *indexTotalCost = numIndexPages +
        (cpu_index_tuple_cost + index_qual_cost.per_tuple) * numIndexTuples;</pre>
</li>
<li><p> Estimate the index correlation. For a simple ordered index on a single
field, this can be retrieved from pg_statistic. If the correlation
is not known, the conservative estimate is zero (no correlation).
</p></li>
</ol>
</div>
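<p> The five steps above can be put together as in the following self-contained
sketch. It does not use the real planner structures: the types
<code class="structname">ExampleIndexInfo</code> and <code class="structname">ExampleQualCost</code>,
the function <code class="function">example_costestimate()</code> and the constant
<code class="varname">example_cpu_index_tuple_cost</code> are hypothetical stand-ins, and
the selectivity and per-row qual cost are passed in rather than computed by the
optimizer routines. A real estimator would use the signature shown earlier.
</p>

```c
typedef double Cost;
typedef double Selectivity;

/* Hypothetical stand-in for the size information in IndexOptInfo. */
typedef struct ExampleIndexInfo
{
    double pages;       /* index size in pages */
    double tuples;      /* number of rows in the index */
} ExampleIndexInfo;

/* Hypothetical stand-in for the result filled in by cost_qual_eval(). */
typedef struct ExampleQualCost
{
    Cost startup;       /* one-time cost of the quals */
    Cost per_tuple;     /* cost of the quals for each index row */
} ExampleQualCost;

/* Assumed value, for illustration only; really a run-time setting. */
static const double example_cpu_index_tuple_cost = 0.001;

void
example_costestimate(const ExampleIndexInfo *index,
                     Selectivity qual_selectivity,
                     const ExampleQualCost *index_qual_cost,
                     Cost *indexStartupCost,
                     Cost *indexTotalCost,
                     Selectivity *indexSelectivity,
                     double *indexCorrelation)
{
    double numIndexTuples;
    double numIndexPages;

    /* Step 1: fraction of parent-table rows visited (supplied by caller). */
    *indexSelectivity = qual_selectivity;

    /* Step 2: index rows visited, assuming no extra rows are scanned. */
    numIndexTuples = *indexSelectivity * index->tuples;

    /* Step 3: index pages retrieved, taken as the same fraction. */
    numIndexPages = *indexSelectivity * index->pages;

    /* Step 4: sequential page reads plus per-row CPU and qual costs. */
    *indexStartupCost = index_qual_cost->startup;
    *indexTotalCost = numIndexPages +
        (example_cpu_index_tuple_cost + index_qual_cost->per_tuple) *
        numIndexTuples;

    /* Step 5: correlation is unknown here, so use the conservative zero. */
    *indexCorrelation = 0.0;
}
```

<p> For an index of 100 pages and 10000 rows, a selectivity of 0.1 and a per-row
qual cost of 0.0025, this sketch visits 1000 rows and 10 pages, for a total cost
of 10 + (0.001 + 0.0025) * 1000 = 13.5.
</p>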
<p> Examples of cost estimator functions can be found in
<code class="filename">src/backend/utils/adt/selfuncs.c</code>.
</p>
</div></body>
</html>