File: index-scanning.html

package info (click to toggle)
pgadmin3 1.4.3-2
links: PTS
area: main
in suites: etch, etch-m68k
size: 29,796 kB
ctags: 10,758
sloc: cpp: 55,356; sh: 6,164; ansic: 1,520; makefile: 576; sql: 482; xml: 100; perl: 18
file content (105 lines) | stat: -rw-r--r-- 7,425 bytes
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>48.3.Index Scanning</title>
<link rel="stylesheet" href="stylesheet.css" type="text/css">
<link rev="made" href="pgsql-docs@postgresql.org">
<meta name="generator" content="DocBook XSL Stylesheets V1.70.0">
<link rel="start" href="index.html" title="PostgreSQL 8.1.4 Documentation">
<link rel="up" href="indexam.html" title="Chapter48.Index Access Method Interface Definition">
<link rel="prev" href="index-functions.html" title="48.2.Index Access Method Functions">
<link rel="next" href="index-locking.html" title="48.4.Index Locking Considerations">
<link rel="copyright" href="ln-legalnotice.html" title="Legal Notice">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="sect1" lang="en">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="index-scanning"></a>48.3.Index Scanning</h2></div></div></div>
<p>   In an index scan, the index access method is responsible for regurgitating
   the TIDs of all the tuples it has been told about that match the
   <em class="firstterm">scan keys</em>.  The access method is <span class="emphasis"><em>not</em></span> involved in
   actually fetching those tuples from the index's parent table, nor in
   determining whether they pass the scan's time qualification test or other
   conditions.
  </p>
<p>   A scan key is the internal representation of a <code class="literal">WHERE</code> clause of
   the form <em class="replaceable"><code>index_key</code></em> <em class="replaceable"><code>operator</code></em>
   <em class="replaceable"><code>constant</code></em>, where the index key is one of the columns of the
   index and the operator is one of the members of the operator class
   associated with that index column.  An index scan has zero or more scan
   keys, which are implicitly ANDed [mdash ] the returned tuples are expected
   to satisfy all the indicated conditions.
  </p>
<p>   The operator class may indicate that the index is <em class="firstterm">lossy</em> for a
   particular operator; this implies that the index scan will return all the
   entries that pass the scan key, plus possibly additional entries that do
   not.  The core system's index-scan machinery will then apply that operator
   again to the heap tuple to verify whether or not it really should be
   selected.  For non-lossy operators, the index scan must return exactly the
   set of matching entries, as there is no recheck.
  </p>
<p>   Note that it is entirely up to the access method to ensure that it
   correctly finds all and only the entries passing all the given scan keys.
   Also, the core system will simply hand off all the <code class="literal">WHERE</code>
   clauses that match the index keys and operator classes, without any
   semantic analysis to determine whether they are redundant or
   contradictory.  As an example, given
   <code class="literal">WHERE x &gt; 4 AND x &gt; 14</code> where <code class="literal">x</code> is a b-tree
   indexed column, it is left to the b-tree <code class="function">amrescan</code> function
   to realize that the first scan key is redundant and can be discarded.
   The extent of preprocessing needed during <code class="function">amrescan</code> will
   depend on the extent to which the index access method needs to reduce
   the scan keys to a &#8220;<span class="quote">normalized</span>&#8221; form.
  </p>
<p>   The <code class="function">amgettuple</code> function has a <code class="literal">direction</code> argument,
   which can be either <code class="literal">ForwardScanDirection</code> (the normal case)
   or  <code class="literal">BackwardScanDirection</code>.  If the first call after
   <code class="function">amrescan</code> specifies <code class="literal">BackwardScanDirection</code>, then the
   set of matching index entries is to be scanned back-to-front rather than in
   the normal front-to-back direction, so <code class="function">amgettuple</code> must return
   the last matching tuple in the index, rather than the first one as it
   normally would.  (This will only occur for access
   methods that advertise they support ordered scans by setting
   <code class="structname">pg_am</code>.<code class="structfield">amorderstrategy</code> nonzero.)  After the
   first call, <code class="function">amgettuple</code> must be prepared to advance the scan in
   either direction from the most recently returned entry.
  </p>
<p>   The access method must support &#8220;<span class="quote">marking</span>&#8221; a position in a scan
   and later returning to the marked position.  The same position may be
   restored multiple times.  However, only one position need be remembered
   per scan; a new <code class="function">ammarkpos</code> call overrides the previously
   marked position.
  </p>
<p>   Both the scan position and the mark position (if any) must be maintained
   consistently in the face of concurrent insertions or deletions in the
   index.  It is OK if a freshly-inserted entry is not returned by a scan that
   would have found the entry if it had existed when the scan started, or for
   the scan to return such an entry upon rescanning or backing
   up even though it had not been returned the first time through.  Similarly,
   a concurrent delete may or may not be reflected in the results of a scan.
   What is important is that insertions or deletions not cause the scan to
   miss or multiply return entries that were not themselves being inserted or
   deleted.  (For an index type that does not set
   <code class="structname">pg_am</code>.<code class="structfield">amconcurrent</code>, it is sufficient to
   handle these cases for insertions or deletions performed by the same
   backend that's doing the scan.  But when <code class="structfield">amconcurrent</code> is
   true, insertions or deletions from other backends must be handled as well.)
  </p>
<p>   Instead of using <code class="function">amgettuple</code>, an index scan can be done with 
   <code class="function">amgetmulti</code> to fetch multiple tuples per call.  This can be
   noticeably more efficient than <code class="function">amgettuple</code> because it allows
   avoiding lock/unlock cycles within the access method.  In principle
   <code class="function">amgetmulti</code> should have the same effects as repeated
   <code class="function">amgettuple</code> calls, but we impose several restrictions to
   simplify matters.  In the first place, <code class="function">amgetmulti</code> does not
   take a <code class="literal">direction</code> argument, and therefore it does not support
   backwards scan nor intrascan reversal of direction.  The access method
   need not support marking or restoring scan positions during an
   <code class="function">amgetmulti</code> scan, either.  (These restrictions cost little
   since it would be difficult to use these features in an
   <code class="function">amgetmulti</code> scan anyway: adjusting the caller's buffered
   list of TIDs would be complex.)  Finally, <code class="function">amgetmulti</code> does
   not guarantee any locking of the returned tuples, with implications
   spelled out in <a href="index-locking.html" title="48.4.Index Locking Considerations">Section48.4, &#8220;Index Locking Considerations&#8221;</a>.
  </p>
</div></body>
</html>