File: hash_conf.html

package info (click to toggle)
db5.3 5.3.28%2Bdfsg1-0.5
  • links: PTS, VCS
  • area: main
  • in suites: buster
  • size: 158,360 kB
  • sloc: ansic: 448,411; java: 111,824; tcl: 80,544; sh: 44,326; cs: 33,697; cpp: 21,604; perl: 14,557; xml: 10,799; makefile: 4,077; yacc: 1,003; awk: 965; sql: 801; erlang: 342; python: 216; php: 24; asm: 14
file content (148 lines) | stat: -rw-r--r-- 7,131 bytes parent folder | download | duplicates (8)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <title>Hash access method specific configuration</title>
    <link rel="stylesheet" href="gettingStarted.css" type="text/css" />
    <meta name="generator" content="DocBook XSL Stylesheets V1.73.2" />
    <link rel="start" href="index.html" title="Berkeley DB Programmer's Reference Guide" />
    <link rel="up" href="am_conf.html" title="Chapter 2.  Access Method Configuration" />
    <link rel="prev" href="bt_conf.html" title="Btree access method specific configuration" />
    <link rel="next" href="heap_conf.html" title="Heap access method specific configuration" />
  </head>
  <body>
    <div xmlns="" class="navheader">
      <div class="libver">
        <p>Library Version 11.2.5.3</p>
      </div>
      <table width="100%" summary="Navigation header">
        <tr>
          <th colspan="3" align="center">Hash access method specific configuration</th>
        </tr>
        <tr>
          <td width="20%" align="left"><a accesskey="p" href="bt_conf.html">Prev</a> </td>
          <th width="60%" align="center">Chapter 2. 
		Access Method Configuration
        </th>
          <td width="20%" align="right"> <a accesskey="n" href="heap_conf.html">Next</a></td>
        </tr>
      </table>
      <hr />
    </div>
    <div class="sect1" lang="en" xml:lang="en">
      <div class="titlepage">
        <div>
          <div>
            <h2 class="title" style="clear: both"><a id="hash_conf"></a>Hash access method specific configuration</h2>
          </div>
        </div>
      </div>
      <div class="toc">
        <dl>
          <dt>
            <span class="sect2">
              <a href="hash_conf.html#am_conf_h_ffactor">Page fill factor</a>
            </span>
          </dt>
          <dt>
            <span class="sect2">
              <a href="hash_conf.html#am_conf_h_hash">Specifying a database hash</a>
            </span>
          </dt>
          <dt>
            <span class="sect2">
              <a href="hash_conf.html#am_conf_h_nelem">Hash table size</a>
            </span>
          </dt>
        </dl>
      </div>
      <p>
    There are a series of configuration tasks which you can perform when
    using the Hash access method.  They are described in the following sections.
</p>
      <div class="sect2" lang="en" xml:lang="en">
        <div class="titlepage">
          <div>
            <div>
              <h3 class="title"><a id="am_conf_h_ffactor"></a>Page fill factor</h3>
            </div>
          </div>
        </div>
        <p>The density, or page fill factor, is an approximation of the number of
keys allowed to accumulate in any one bucket, determining when the hash
table grows or shrinks.  If you know the average sizes of the keys and
data in your data set, setting the fill factor can enhance performance.
A reasonable rule to use to compute fill factor is:</p>
        <pre class="programlisting">(pagesize - 32) / (average_key_size + average_data_size + 8)</pre>
        <p>The desired density within the hash table can be specified by calling
the <a href="../api_reference/C/dbset_h_ffactor.html" class="olink">DB-&gt;set_h_ffactor()</a> method.  If no density is specified, one will
be selected dynamically as pages are filled.</p>
      </div>
      <div class="sect2" lang="en" xml:lang="en">
        <div class="titlepage">
          <div>
            <div>
              <h3 class="title"><a id="am_conf_h_hash"></a>Specifying a database hash</h3>
            </div>
          </div>
        </div>
        <p>The database hash determines in which bucket a particular key will reside.
The goal of hashing keys is to distribute keys equally across the database
pages, therefore it is important that the hash function work well with
the specified keys so that the resulting bucket usage is relatively
uniform.  A hash function that does not work well can effectively turn
into a sequential list.</p>
        <p>No hash performs equally well on all possible data sets.  It is possible
that applications may find that the default hash function performs poorly
with a particular set of keys.  The distribution resulting from the hash
function can be checked using the <a href="../api_reference/C/db_stat.html" class="olink">db_stat</a> utility.  By comparing the
number of hash buckets and the number of keys, one can decide if the entries
are hashing in a well-distributed manner.</p>
        <p>The hash function for the hash table can be specified by calling the
<a href="../api_reference/C/dbset_h_hash.html" class="olink">DB-&gt;set_h_hash()</a> method.  If no hash function is specified, a default
function will be used.  Any application-specified hash function must
take a reference to a <a href="../api_reference/C/db.html" class="olink">DB</a> object, a pointer to a byte string and
its length, as arguments and return an unsigned, 32-bit hash value.</p>
      </div>
      <div class="sect2" lang="en" xml:lang="en">
        <div class="titlepage">
          <div>
            <div>
              <h3 class="title"><a id="am_conf_h_nelem"></a>Hash table size</h3>
            </div>
          </div>
        </div>
        <p>When setting up the hash database, knowing the expected number of elements
that will be stored in the hash table is useful.  This value can be used
by the Hash access method implementation to more accurately construct the
necessary number of buckets that the database will eventually require.</p>
        <p>The anticipated number of elements in the hash table can be specified by
calling the <a href="../api_reference/C/dbset_h_nelem.html" class="olink">DB-&gt;set_h_nelem()</a> method.  If not specified, or set too low,
hash tables will expand gracefully as keys are entered, although a slight
performance degradation may be noticed.  In order for the estimated number
of elements to be a useful value to Berkeley DB, the <a href="../api_reference/C/dbset_h_ffactor.html" class="olink">DB-&gt;set_h_ffactor()</a> method
must also be called to set the page fill factor.</p>
      </div>
    </div>
    <div class="navfooter">
      <hr />
      <table width="100%" summary="Navigation footer">
        <tr>
          <td width="40%" align="left"><a accesskey="p" href="bt_conf.html">Prev</a> </td>
          <td width="20%" align="center">
            <a accesskey="u" href="am_conf.html">Up</a>
          </td>
          <td width="40%" align="right"> <a accesskey="n" href="heap_conf.html">Next</a></td>
        </tr>
        <tr>
          <td width="40%" align="left" valign="top">Btree access method specific configuration </td>
          <td width="20%" align="center">
            <a accesskey="h" href="index.html">Home</a>
          </td>
          <td width="40%" align="right" valign="top"> Heap access method specific configuration</td>
        </tr>
      </table>
    </div>
  </body>
</html>