File: index-locking.html

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>48.4. Index Locking Considerations</title>
<link rel="stylesheet" href="stylesheet.css" type="text/css">
<link rev="made" href="pgsql-docs@postgresql.org">
<meta name="generator" content="DocBook XSL Stylesheets V1.70.0">
<link rel="start" href="index.html" title="PostgreSQL 8.1.4 Documentation">
<link rel="up" href="indexam.html" title="Chapter 48. Index Access Method Interface Definition">
<link rel="prev" href="index-scanning.html" title="48.3. Index Scanning">
<link rel="next" href="index-unique-checks.html" title="48.5. Index Uniqueness Checks">
<link rel="copyright" href="ln-legalnotice.html" title="Legal Notice">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="sect1" lang="en">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="index-locking"></a>48.4. Index Locking Considerations</h2></div></div></div>
<p>   An index access method can choose whether it supports concurrent updates
   of the index by multiple processes.  If the method's
   <code class="structname">pg_am</code>.<code class="structfield">amconcurrent</code> flag is true, then
   the core <span class="productname">PostgreSQL</span> system obtains
   <code class="literal">AccessShareLock</code> on the index during an index scan, and
   <code class="literal">RowExclusiveLock</code> when updating the index.  Since these lock
   types do not conflict, the access method is responsible for handling any
   fine-grained locking it may need.  An exclusive lock on the index as a whole
   will be taken only during index creation, destruction, or
   <code class="literal">REINDEX</code>.  When <code class="structfield">amconcurrent</code> is false,
   <span class="productname">PostgreSQL</span> still obtains
   <code class="literal">AccessShareLock</code> during index scans, but it obtains
   <code class="literal">AccessExclusiveLock</code> during any update.  This ensures that
   updaters have sole use of the index.  Note that this implicitly assumes
   that index scans are read-only; an access method that might modify the
   index during a scan will still have to do its own locking to handle the
   case of concurrent scans.
  </p>
<p>   Recall that a backend's own locks never conflict; therefore, even a
   non-concurrent index type must be prepared to handle the case where
   a backend is inserting or deleting entries in an index that it is itself
   scanning.  (This is of course necessary to support an <code class="command">UPDATE</code>
   that uses the index to find the rows to be updated.)
  </p>
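<p>   The lock choices described above can be summarized in a small model.
   The following Python sketch is illustrative only:
   <code class="function">choose_index_lock</code> and <code class="function">conflicts</code> are
   hypothetical helpers, not <span class="productname">PostgreSQL</span> functions, and
   the conflict table is restricted to the three lock modes named in this section.
  </p>
<pre class="programlisting">
```python
# Toy model: helper names are hypothetical, not PostgreSQL APIs.
ACCESS_SHARE = "AccessShareLock"
ROW_EXCLUSIVE = "RowExclusiveLock"
ACCESS_EXCLUSIVE = "AccessExclusiveLock"

# Conflicts among just these three modes (the real lock table has more modes).
CONFLICTS = {
    ACCESS_SHARE: {ACCESS_EXCLUSIVE},
    ROW_EXCLUSIVE: {ACCESS_EXCLUSIVE},
    ACCESS_EXCLUSIVE: {ACCESS_SHARE, ROW_EXCLUSIVE, ACCESS_EXCLUSIVE},
}


def choose_index_lock(amconcurrent, operation):
    """Return the index-level lock taken for a 'scan' or an 'update'."""
    if operation == "scan":
        # Scans take AccessShareLock regardless of amconcurrent.
        return ACCESS_SHARE
    if operation == "update":
        return ROW_EXCLUSIVE if amconcurrent else ACCESS_EXCLUSIVE
    raise ValueError("operation must be 'scan' or 'update'")


def conflicts(held, requested, same_backend=False):
    """Report whether 'requested' must wait for 'held'."""
    if same_backend:
        # A backend's own locks never conflict with each other.
        return False
    return requested in CONFLICTS[held]
```
</pre>
<p>   In this model a scan and an update on a concurrent index type do not block
   each other, while on a non-concurrent type they do, unless both belong to
   the same backend.
  </p>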
<p>   Building an index type that supports concurrent updates usually requires
   extensive and subtle analysis of the required behavior.  For the b-tree
   and hash index types, you can read about the design decisions involved in
   <code class="filename">src/backend/access/nbtree/README</code> and
   <code class="filename">src/backend/access/hash/README</code>.
  </p>
<p>   Aside from the index's own internal consistency requirements, concurrent
   updates create issues about consistency between the parent table (the
   <em class="firstterm">heap</em>) and the index.  Because
   <span class="productname">PostgreSQL</span> separates accesses 
   and updates of the heap from those of the index, there are windows in
   which the index may be inconsistent with the heap.  We handle this problem
   with the following rules:

    </p>
<div class="itemizedlist"><ul type="disc">
<li><p>       A new heap entry is made before making its index entries.  (Therefore
       a concurrent index scan is likely to fail to see the heap entry.
       This is okay because the index reader would be uninterested in an
       uncommitted row anyway.  But see <a href="index-unique-checks.html" title="48.5. Index Uniqueness Checks">Section 48.5, &#8220;Index Uniqueness Checks&#8221;</a>.)
      </p></li>
<li><p>       When a heap entry is to be deleted (by <code class="command">VACUUM</code>), all its
       index entries must be removed first.
      </p></li>
<li><p>       For concurrent index types, an index scan must maintain a pin
       on the index page holding the item last returned by
       <code class="function">amgettuple</code>, and <code class="function">ambulkdelete</code> cannot delete
       entries from pages that are pinned by other backends.  The need
       for this rule is explained below.
      </p></li>
</ul></div>
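<p>   The first two ordering rules can be pictured with a single-process toy model.
   This is a sketch under simplifying assumptions: the heap is a dictionary keyed
   by TID, the index is a list of (key, TID) pairs, and all function names are
   hypothetical.  It ignores concurrency and visibility, and only checks that no
   index entry ever points at a missing heap slot.
  </p>
<pre class="programlisting">
```python
def index_is_safe(heap, index):
    """True if every index entry points at an existing heap slot."""
    return all(tid in heap for (_key, tid) in index)


def insert_row(heap, index, tid, key):
    """Rule 1: make the heap entry before its index entry."""
    heap[tid] = key
    assert index_is_safe(heap, index)
    index.append((key, tid))
    assert index_is_safe(heap, index)


def vacuum_row(heap, index, tid):
    """Rule 2: remove all index entries before the heap entry."""
    index[:] = [(k, t) for (k, t) in index if t != tid]
    assert index_is_safe(heap, index)
    del heap[tid]
    assert index_is_safe(heap, index)
```
</pre>
<p>   Reversing either order would leave a moment at which an index entry refers
   to a heap slot that does not exist.
  </p>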
<p>

   If an index is concurrent then it is possible for an index reader to
   see an index entry just before it is removed by <code class="command">VACUUM</code>, and
   then to arrive at the corresponding heap entry after that was removed by
   <code class="command">VACUUM</code>.  (With a non-concurrent index, this is not possible
   because of the conflicting index-level locks that will be taken out.)
   This creates no serious problems if that item
   number is still unused when the reader reaches it, since an empty
   item slot will be ignored by <code class="function">heap_fetch()</code>.  But what if a
   third backend has already re-used the item slot for something else?
   When using an MVCC-compliant snapshot, there is no problem because
   the new occupant of the slot is certain to be too new to pass the
   snapshot test.  However, with a non-MVCC-compliant snapshot (such as
   <code class="literal">SnapshotNow</code>), it would be possible to accept and return
   a row that does not in fact match the scan keys.  We could defend
   against this scenario by requiring the scan keys to be rechecked
   against the heap row in all cases, but that is too expensive.  Instead,
   we use a pin on an index page as a proxy to indicate that the reader
   may still be &#8220;<span class="quote">in flight</span>&#8221; from the index entry to the matching
   heap entry.  Making <code class="function">ambulkdelete</code> block on such a pin ensures
   that <code class="command">VACUUM</code> cannot delete the heap entry before the reader
   is done with it.  This solution costs little in run time, and adds blocking
   overhead only in the rare cases where there actually is a conflict.
  </p>
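<p>   The pin-as-proxy idea can be sketched as follows.  This is a toy model, not
   the buffer manager: <code class="classname">IndexPage</code> and
   <code class="function">bulkdelete_page</code> are hypothetical names, and where the
   real <code class="function">ambulkdelete</code> would block until the pin is released,
   the sketch simply returns <code class="literal">False</code> to signal that it must wait.
  </p>
<pre class="programlisting">
```python
class IndexPage:
    """Toy index page holding (key, tid) entries and per-backend pin counts."""

    def __init__(self, entries):
        self.entries = list(entries)
        self.pins = {}  # backend id mapped to pin count

    def pin(self, backend):
        self.pins[backend] = self.pins.get(backend, 0) + 1

    def unpin(self, backend):
        self.pins[backend] -= 1
        if self.pins[backend] == 0:
            del self.pins[backend]

    def pinned_by_others(self, backend):
        return any(b != backend for b in self.pins)


def bulkdelete_page(page, dead_tids, vacuum_backend):
    """Remove dead entries, unless another backend still pins the page.

    Returns True if the page was cleaned, False if the caller must wait.
    """
    if page.pinned_by_others(vacuum_backend):
        # A reader may still be in flight to the heap entry.
        return False
    page.entries = [(k, t) for (k, t) in page.entries if t not in dead_tids]
    return True
```
</pre>
<p>   A reader that pins the page before following a TID to the heap, and unpins
   only after the heap fetch, keeps the deletion from proceeding in the meantime.
  </p>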
<p>   This solution requires that index scans be &#8220;<span class="quote">synchronous</span>&#8221;: we have
   to fetch each heap tuple immediately after scanning the corresponding index
   entry.  This is expensive for a number of reasons.  An
   &#8220;<span class="quote">asynchronous</span>&#8221; scan in which we collect many TIDs from the index,
   and only visit the heap tuples sometime later, requires much less index
   locking overhead and may allow a more efficient heap access pattern.
   Per the above analysis, we must use the synchronous approach for
   non-MVCC-compliant snapshots, but an asynchronous scan is workable
   for a query using an MVCC snapshot.
  </p>
<p>   In an <code class="function">amgetmulti</code> index scan, the access method need not
   guarantee to keep an index pin on any of the returned tuples.  (It would be
   impractical to pin more than the last one anyway.)  Therefore
   it is only safe to use such scans with MVCC-compliant snapshots.
  </p>
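<p>   The slot re-use hazard behind this restriction can be illustrated with a
   small simulation.  It is a sketch under strong simplifications: a snapshot is
   modeled as the set of transaction IDs it treats as committed, and
   <code class="classname">HeapRow</code>, <code class="function">fetch</code> and
   <code class="function">slot_reuse_scenario</code> are hypothetical names rather than
   <span class="productname">PostgreSQL</span> code.
  </p>
<pre class="programlisting">
```python
class HeapRow:
    def __init__(self, key, xmin, xmax=None):
        self.key = key    # indexed column value
        self.xmin = xmin  # inserting transaction
        self.xmax = xmax  # deleting transaction, if any


def fetch(heap, tid, committed):
    """Return the row at tid if it passes the snapshot test, else None."""
    row = heap.get(tid)
    if row is None:
        return None  # empty item slot is ignored
    inserted = row.xmin in committed
    deleted = row.xmax is not None and row.xmax in committed
    return row if inserted and not deleted else None


def slot_reuse_scenario():
    tid = (0, 1)
    # A dead row whose index entry has not been vacuumed yet.
    heap = {tid: HeapRow("apple", xmin=10, xmax=20)}
    mvcc_snapshot = {10, 20}  # fixed when the query started
    collected = [tid]         # asynchronous scan for key "apple"
    # VACUUM removes the row, then transaction 42 re-uses the slot.
    del heap[tid]
    heap[tid] = HeapRow("zebra", xmin=42)
    snapshot_now = {10, 20, 42}  # non-MVCC: whatever is committed right now
    return heap, collected, mvcc_snapshot, snapshot_now
```
</pre>
<p>   With the MVCC snapshot the new occupant fails the snapshot test and nothing
   is returned.  With the non-MVCC snapshot the scan for
   <code class="literal">"apple"</code> is handed the <code class="literal">"zebra"</code> row,
   which does not match the scan key.
  </p>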
</div></body>
</html>