DBM Libraries for use with Exim
-------------------------------
Background
----------
Exim uses direct-access (so-called "dbm") files for a number of different
purposes. These are files arranged so that the data they contain is indexed and
can quickly be looked up by quoting an appropriate key. They are used as
follows:
. Exim keeps its "hints" databases in dbm files.
. The configuration can specify that certain things (e.g. aliases) be looked up
in dbm files.
. The configuration can contain expansion strings that involve lookups in dbm
files.
. The filter commands "mail" and "vacation" have a facility for replying only
once to each incoming address. The record of which addresses have already
received replies is kept in a dbm file.
DBM Libraries
-------------
The original library that provided the dbm facility in Unix was called "dbm".
This seems to have been superseded quite some time ago by a new version called
"ndbm" which permits several dbm files to be open at once. Several operating
systems, including those from Sun, contain ndbm as standard.
A number of alternative libraries also exist, the most common of which seem to
be gdbm and Berkeley DB. Release 1.85 of the latter (just called DB hereafter)
has been around for some time, and various releases 2.x are now appearing.
However, there are major differences in implementation and interface between
the DB 1.x and 2.x releases, and they are best considered as two independent
dbm libraries.
Some Linux releases contain gdbm as standard, and some BSD versions of Unix
include DB 1.85. No doubt in due course DB 2.x will find its way into
some OS distributions. All of the non-ndbm libraries contain compatibility
interfaces so that programs written to call the ndbm functions should, in
theory, work with them, but there are some potential pitfalls which have
caught out Exim users in the past.
Exim has been tested with all of these libraries, in various different modes in
some cases, and is believed to work with all of them if it and they are
properly configured.
I have considered the possibility of calling different dbm libraries for
different functions from a single Exim binary. However, because all the
libraries provide ndbm compatibility interfaces (and therefore the same
function names) it would require a lot of complicated, error-prone trickery to
achieve this. Exim therefore uses a single library for all its dbm activities.
Locking
-------
The configuration option DB_LOCK_TIMEOUT controls how long Exim waits to get a
lock on a hints database. From version 1.80 onwards, Exim does not attempt to
take out a lock on an actual database file (this caused problems in the past).
Instead, it takes out an fcntl() lock on a separate file whose name ends in
".lockfile". This ensures that Exim has exclusive access to the database before
even attempting to open it. Exim creates the lock file the first time it needs
it. It should never be removed.
Main Pitfall
------------
In the absence of any special configuration options, Exim uses the ndbm set of
functions to control its dbm databases. This should work with any of the dbm
libraries because those that are not ndbm have compatibility interfaces.
However, there is one awful pitfall:
Exim #includes a header file called ndbm.h which defines the functions and the
interface data block; gdbm and DB 1.x provide their own versions of this header
file, though DB 2.x does not. If it should happen that the wrong version of
ndbm.h is seen by Exim, then it may compile without error, but fail to operate
correctly at runtime.
This situation can easily arise when more than one dbm library is installed on
a single host. For example, if you decide to use DB 1.x on a system where gdbm
is the standard library, unless you are careful in setting up the include
directories for Exim, it may see gdbm's ndbm.h file instead of DB's. The
situation is even worse with DB 2.x, which doesn't provide an ndbm.h file at
all (see below).
One way out of this for either version of DB is to configure Exim to call it in
its native mode instead of via the ndbm compatibility interface, thus avoiding
the use of ndbm.h. This is done by setting the USE_DB configuration option.
This option is not available for gdbm, which can be used by Exim only in its
ndbm compatibility mode.
Performance
-----------
It would be nice if somebody produced some statistics for the performance of
the various libraries for the kinds of operation that Exim does. I have done a
number of very crude tests, so crude that I'm not prepared to publish any
actual figures. However, at the time I did them they suggested that:
. DB 2.x is not a good performer in the way Exim uses it. There is an awful
lot of "big database" technology in DB 2.x, including locks and transactions
and the like, which is all overkill for what Exim wants to do. (There have
been later releases of DB 2.x; things may now have changed.)
. Of the other three, gdbm came last and ndbm second, leaving DB 1.x as probably
the best library to use from a performance point of view. However,
maintenance of DB 1.x is being phased out, and already there are problems
compiling it on some systems. DB 2.x is the currently supported DB library.
I stress that the tests I did were very crude, and on a small, almost empty
database. More realistic tests might change the balance. In any case, dbm
performance is probably only a serious issue on heavily loaded systems that are
mostly dedicated to email. If you have a working DBM library on your system,
there seems little point in not using it. If you have to install something, the
DB maintainers would prefer it not to be DB 1.x.
NDBM
----
The ndbm library holds its data in two files, with extensions .dir and .pag.
This makes atomic updating of, for example, alias files, difficult, because
simple renaming cannot be used without some risk. However, if your system has
ndbm installed, Exim should compile and run without any problems.
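The renaming problem can be illustrated with a small sketch. The function names and file names here are hypothetical and nothing below comes from Exim; it only contrasts replacing a single-file database with replacing an ndbm .dir/.pag pair.

```c
#include <stdio.h>

/* Hypothetical helper: replace a single-file database with a freshly built
   copy. On POSIX systems rename() replaces the target atomically, so a
   reader sees either the complete old file or the complete new one. */
int replace_single_file(const char *newfile, const char *target)
{
    return rename(newfile, target);
}

/* Hypothetical helper: the ndbm case. Two files need two rename() calls,
   and a reader that opens the database between them sees the new .dir file
   alongside the old .pag file. */
int replace_ndbm_pair(const char *newbase, const char *base)
{
    static const char *ext[] = { ".dir", ".pag" };
    char from[1024], to[1024];
    int i;

    for (i = 0; i < 2; i++) {
        if (snprintf(from, sizeof from, "%s%s", newbase, ext[i])
                >= (int)sizeof from ||
            snprintf(to, sizeof to, "%s%s", base, ext[i])
                >= (int)sizeof to)
            return -1;                  /* name would not fit */
        if (rename(from, to) != 0)
            return -1;
        /* After the first pass the pair on disk is inconsistent. */
    }
    return 0;
}
```

Holding a lock on a separate file around the two renames, with readers honouring the same lock, is one way to close the window, at the cost of no longer being a "simple" rename.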
GDBM
----
The gdbm library, when called via the ndbm compatibility interface, makes two
hard links to a single file, with extensions .dir and .pag. The compatibility
interface is the only way gdbm can be called from Exim. The native interface
returns data in malloc'd store, and Exim is not set up to handle this. (Other
libraries return pointers to data that must then be copied if necessary.) As
gdbm's performance is not brilliant (and there were some rumours of its demise)
I don't think it's worth putting effort into converting Exim to use the native
interface.
As mentioned above, gdbm provides its own version of the ndbm.h header, and you
must ensure that this is seen by Exim rather than any other version. This is
not likely to be a problem if gdbm is the only dbm library on your system.
The gdbm library does its own locking of the single file that it uses. From
version 1.80 onwards, Exim locks on an entirely separate file before accessing
a hints database, so gdbm's locking should always succeed.
Berkeley DB 1.8x
----------------
1.85 is the most widespread DB 1.x release; there is also a 1.86 bug-fix
release, but the belief is that the bugs it fixes will not affect Exim.
However, maintenance for 1.x releases is being phased out.
This dbm library can be called by Exim in one of two ways: via the ndbm
compatibility interface, or via its own native interface. There are two
advantages to doing the latter: (1) you don't run the risk of Exim's seeing the
"wrong" version of the ndbm.h header, as described above, and (2) the
performance is better. It is therefore recommended that you set USE_DB=yes in an
appropriate Local/Makefile-xxx file. (If you are compiling for just one OS, it
can go in Local/Makefile itself.)
When called via the compatibility interface, DB 1.x creates a single file with
a .db extension. When called via its native interface, no extension is added to
the file name handed to it.
DB 1.x does not do any locking of its own.
Berkeley DB 2.x
---------------
DB 2.x was released in 1997. It is a major re-implementation and its native
interface is incompatible with DB 1.x, though a compatibility interface was
introduced in DB 2.1.0, and there is also an ndbm.h compatibility interface.
Like 1.x, it can be called from Exim via the ndbm compatibility interface or
via its native interface, and once again setting USE_DB in order to get the
native interface is recommended. If USE_DB is *not* set, then you will have to
provide a suitable version of ndbm.h, because one does not come with the DB 2.x
distribution. A suitable version is:
/*************************************************
* ndbm.h header for DB 2.x *
*************************************************/
/* This header should replace any other version of ndbm.h when Berkeley DB
version 2.x is in use via the ndbm compatibility interface. Otherwise, any
extant version of ndbm.h may cause programs to misbehave. There doesn't seem
to be a version of ndbm.h supplied with DB 2.x, so I made this for myself.
Philip Hazel 12/Jun/97
*/
#define DB_DBM_HSEARCH
#include <db.h>
/* End */
When called via the compatibility interface, DB 2.x creates a single file with
a .db extension. When called via its native interface, no extension is added to
the file name handed to it.
DB 2.x does not do any automatic locking of its own; it does have a set of
functions for various forms of locking, but Exim does not use them.
As mentioned above, DB 2.x is not necessarily the best library to use with Exim
because doing so is in the nature of using a sledgehammer to crack a nut.
However, on systems that already have DB 2.x installed, it might be easier than
having to install another dbm library, and if the load is light, the
performance difference probably won't matter at all.
Testing Exim's dbm handling
---------------------------
Because there have been problems with dbm file locking in the past, I built
some testing code for Exim's dbm functions. This is very much a lash-up, but it
is documented here so that anybody who wants to check that their configuration
is locking properly can do so. Now that Exim does the locking on an entirely
separate file, locking problems are much less likely, but this code still
exists, just in case. Proceed as follows:
. Build Exim in the normal way. This ensures that all the makefiles etc. get
set up.
. From within the build directory, obey "make test_dbfn". This makes a binary
file called test_dbfn. If you are experimenting with different configurations
you *must* do "make makefile" after changing anything, before obeying "make
test_dbfn" again, because the make target for test_dbfn isn't integrated
with the making of the makefile.
. Identify a scratch directory where you have write access. Create a sub-
directory called "db" in the scratch directory.
. Type the command "test_dbfn <scratch-directory>". This will output some
general information such as
Exim's db functions tester: interface type is db (v2)
DBM library: Berkeley DB: Sleepycat Software: DB 2.1.0: (6/13/97)
USE_DB is defined
It then says
Test the functions
>
. At this point you can type commands to open a dbm file and read and write
data in it. First type the command "open <name>", e.g. "open junk". The
response should look like this
opened DB file <scratch-directory>/db/junk: flags=102
Locked
opened 0
>
The tester will have created a dbm file within the db directory of the
scratch directory. It will also have created a file with the extension
".lockfile" in the same directory. Unlike Exim itself, it will not create
the db directory for itself if it does not exist.
. To test the locking, don't type anything more for the moment. You now need to
set up another process running the same test_dbfn command, e.g. from a
different logon to the same host. This time, when you attempt to open the
file it should fail after a minute with a timeout error because it is
already in use.
. If the second process doesn't produce any error message, but gets back to the
> prompt, then the locking is not working properly.
. You can check that the second process gets the lock when the first process
releases it by exiting from the first process with ^D, q, or quit; or by
typing the command "close".
. There are some other commands available that are not related to locking:
write <key> <data>
e.g.
write abcde the quick brown fox
writes a record to the database,
read <key>
delete <key>
read and delete a record, respectively, and
scan
scans the entire database. Note that the database is purely for testing the
dbm functions. It is *not* one of Exim's regular databases, and you should
not try running this testing program on any of Exim's real database
files.
Philip Hazel
December 1997