File: cdb.txt

package info (click to toggle)
rust-cdb 0.6.0-1
  • links: PTS, VCS
  • area: main
  • in suites: experimental
  • size: 248 kB
  • sloc: makefile: 10; sh: 9
file content (38 lines) | stat: -rw-r--r-- 1,566 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
A structure for constant databases
19960914
Copyright 1996
D. J. Bernstein, djb@pobox.com

A cdb is an associative array: it maps strings (``keys'') to strings
(``data'').

A cdb contains 256 pointers to linearly probed open hash tables. The
hash tables contain pointers to (key,data) pairs. A cdb is stored in
a single file on disk:

    +----------------+---------+-------+-------+-----+---------+
    | p0 p1 ... p255 | records | hash0 | hash1 | ... | hash255 |
    +----------------+---------+-------+-------+-----+---------+

Each of the 256 initial pointers states a position and a length. The
position is the starting byte position of the hash table. The length
is the number of slots in the hash table.

Records are stored sequentially, without special alignment. A record
states a key length, a data length, the key, and the data.

Each hash table slot states a hash value and a byte position. If the
byte position is 0, the slot is empty. Otherwise, the slot points to
a record whose key has that hash value.

Positions, lengths, and hash values are 32-bit quantities, stored in
little-endian form in 4 bytes. Thus a cdb must fit into 4 gigabytes.

A record is located as follows. Compute the hash value of the key in
the record. The hash value modulo 256 is the number of a hash table.
The hash value divided by 256, modulo the length of that table, is a
slot number. Probe that slot, the next higher slot, and so on, until
you find the record or run into an empty slot.

The cdb hash function is ``h = ((h << 5) + h) ^ c'', with a starting
hash of 5381.