1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133
|
<! "@(#)throughput.so 10.7 (Sleepycat) 12/1/98">
<!Copyright 1997, 1998 by Sleepycat Software, Inc. All rights reserved.>
<html>
<body bgcolor=white>
<head>
<title>Berkeley DB Reference Guide: Transaction Protected Applications</title>
<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit.">
<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btr
ee,hash,hashing,transaction,transactions,locking,logging,access method,access me
thods,java,C,C++">
</head>
<h3>Berkeley DB Reference Guide: Transaction Protected Applications</h3>
<p>
<h1 align=center>Transaction throughput</h1>
<p>
Generally, the speed of a database system is measured by the transaction
throughput, expressed as the number of transactions per second.
<p>
The two gating factors for Berkeley DB performance in a transactional system
are usually the underlying database files and the log file, both because
they require disk I/O, which is incredibly expensive relative to other
resources, e.g., CPU.
<p>
In the worst case scenario:
<ul type=disc>
<li>Database access is truly random and the database is too large
to fit into the cache, resulting in a single I/O per requested key/data
pair.
<li>Both the database and the log are on a single disk.
</ul>
<p>
This means that for each transaction, Berkeley DB is potentially performing
several filesystem operations:
<ul type=disc>
<li>Disk seek to database file.
<li>Database file read.
<li>Disk seek to log file.
<li>Log file write.
<li>Flush log file information to disk.
<li>Disk seek to update log file metadata (e.g., inode).
<li>Log metadata write.
<li>Flush log file metadata to disk.
</ul>
<p>
There are a number of ways to increase transactional throughput, all
of which attempt to decrease the number of filesystem operations per
transaction:
<ul type=disc>
<li>Tune the size of the database cache. If the Berkeley DB key/data
pairs used during the transaction are found in the database cache, the
seek and read from the database are no longer necessary, resulting in two
fewer filesystem operations per transaction. To determine if your cache
size is too small, see <a href="../../ref/am/cachesize.html">Selecting a cache
size</a>.
<li>Put the database and the log files on different disks. This
allows reads and writes to the log files and the database files to be
performed concurrently.
<li>Set the filesystem configuration so that file access and
modification times are not updated. Note, although the file access
and modification times are not used by Berkeley DB, this may affect other
programs, so be careful!
<li>Upgrade your hardware. When considering the hardware on which
to run your application, however, it is important to consider the entire
system. The controller and bus can have as much to do with the disk
performance as the disk itself. It is also important to remember that
throughput is rarely the limiting factor, and that disk seek times are
normally the true performance issue for Berkeley DB.
<li>Turn on the <a href="../../api_c/DbEnv/appinit.html#DB_TXN_NOSYNC">DB_TXN_NOSYNC</a> flag. This changes the
Berkeley DB behavior so that the log files are not flushed when transactions
are committed. While this change will greatly increase your transaction
throughput, it means that transactions will exhibit the ACI (atomicity,
consistency and isolation) properties, but not D (durability), i.e.,
database integrity will be maintained but it is possible that some number
of the most recently committed transactions may be undone during recovery
instead of being redone.
</ul>
<p>
If you are bottlenecked on logging, the following test will help you
confirm that the number of transactions per
second that your application does is reasonable for the hardware on which
you're running. Your test program should repeatedly perform the following
operations:
<ul type=disc>
<li>Seek to the beginning of a file.
<li>Write to the file.
<li>Flush the file write to disk.
</ul>
<p>
The number of times that you can perform these three operations per second
is a rough measure of the number of transactions per second of which
the hardware is capable. This test simulates the operations applied
to the log file. (As a simplifying assumption in
this experiment, we assume that the database files are either on a
separate disk, or that they fit, with some few exceptions, into the
database cache.) We do not have to directly simulate updating the log file
directory information, as it will normally be updated and flushed to disk
as a result of flushing the log file write to disk.
<p>
Running this test program on reasonably standard commodity hardware
(Pentium II CPU, SCSI disk), returned the following results:
<p><ul><pre>
% testfile -b256 -o1000
running: 1000 ops
Elapsed time: 16.641934 seconds
1000 ops: 60.09 ops per second
</pre></ul><p>
<p>
Note that the number of bytes being written to the log as part of each
transaction can dramatically affect the transaction throughput. The
above test run used 256, which is a reasonable size log write. (Your
log writes may be different. To determine your average log write size,
use the <a href="../../utility/db_stat.html">db_stat</a> utility to display out your log statistics.)
<p>
As a quick sanity check, for this particular disk, the average seek time
is 9.4 msec, and the average latency is 4.17 msec. That results in a
minimum requirement for a data transfer to the disk of 13.57 msec, or a
maximum of 74 transfers per second. This is close enough to the above 60
operations per second (which wasn't done on a quiescent disk!) that the
number is believable.
<p>
The above example test program (for POSIX 1003.1 standard systems) is
available <a href="writetest.txt">here</a>.
<p>
Future releases of Berkeley DB are expected to include group commit and the
ability to stripe log files across multiple disks, both of which will
offer additional options for increasing transaction performance.
<p>
<a href="../../ref/transapp/filesys.html"><img src="../../images/prev.gif"></a>
<a href="../../ref/toc.html"><img src="../../images/toc.gif"></a>
<a href="../../ref/program/appsignals.html"><img src="../../images/next.gif"></a>
</tt>
</body>
</html>
|