<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>Chapter 11. Database Configuration</title>
<link rel="stylesheet" href="gettingStarted.css" type="text/css" />
<meta name="generator" content="DocBook XSL Stylesheets V1.73.2" />
<link rel="start" href="index.html" title="Getting Started with Berkeley DB" />
<link rel="up" href="baseapi.html" title="Part II. Programming with the Base API" />
<link rel="prev" href="javaindexusage.html" title="Secondary Database Example" />
<link rel="next" href="cachesize.html" title="Selecting the Cache Size" />
</head>
<body>
<div xmlns="" class="navheader">
<div class="libver">
<p>Library Version 11.2.5.3</p>
</div>
<table width="100%" summary="Navigation header">
<tr>
<th colspan="3" align="center">Chapter 11. Database Configuration</th>
</tr>
<tr>
<td width="20%" align="left"><a accesskey="p" href="javaindexusage.html">Prev</a> </td>
<th width="60%" align="center">Part II. Programming with the Base API</th>
<td width="20%" align="right"> <a accesskey="n" href="cachesize.html">Next</a></td>
</tr>
</table>
<hr />
</div>
<div class="chapter" lang="en" xml:lang="en">
<div class="titlepage">
<div>
<div>
<h2 class="title"><a id="dbconfig"></a>Chapter 11. Database Configuration</h2>
</div>
</div>
</div>
<div class="toc">
<p>
<b>Table of Contents</b>
</p>
<dl>
<dt>
<span class="sect1">
<a href="dbconfig.html#pagesize">Setting the Page Size</a>
</span>
</dt>
<dd>
<dl>
<dt>
<span class="sect2">
<a href="dbconfig.html#overflowpages">Overflow Pages</a>
</span>
</dt>
<dt>
<span class="sect2">
<a href="dbconfig.html#Locking">Locking</a>
</span>
</dt>
<dt>
<span class="sect2">
<a href="dbconfig.html#IOEfficiency">IO Efficiency</a>
</span>
</dt>
<dt>
<span class="sect2">
<a href="dbconfig.html#pagesizeAdvice">Page Sizing Advice</a>
</span>
</dt>
</dl>
</dd>
<dt>
<span class="sect1">
<a href="cachesize.html">Selecting the Cache Size</a>
</span>
</dt>
<dt>
<span class="sect1">
<a href="btree.html">BTree Configuration</a>
</span>
</dt>
<dd>
<dl>
<dt>
<span class="sect2">
<a href="btree.html#duplicateRecords">Allowing Duplicate Records</a>
</span>
</dt>
<dt>
<span class="sect2">
<a href="btree.html#comparators">Setting Comparison Functions</a>
</span>
</dt>
</dl>
</dd>
</dl>
</div>
<p>
This chapter describes some of the database and cache configuration issues
that you need to consider when building your DB database.
In most cases, there is very little that you need to do in terms of
managing your databases. However, there are configuration issues that you
need to be concerned with, and these are largely dependent on the access
method that you are choosing for your database.
</p>
<p>
The examples and descriptions throughout this document have mostly focused
on the BTree access method. This is because the majority of DB
applications use BTree. For this reason, where configuration issues are
dependent on the type of access method in use, this chapter will focus on
BTree only. For configuration descriptions surrounding the other access
methods, see the <em class="citetitle">Berkeley DB Programmer's Reference Guide</em>.
</p>
<div class="sect1" lang="en" xml:lang="en">
<div class="titlepage">
<div>
<div>
<h2 class="title" style="clear: both"><a id="pagesize"></a>Setting the Page Size</h2>
</div>
</div>
</div>
<div class="toc">
<dl>
<dt>
<span class="sect2">
<a href="dbconfig.html#overflowpages">Overflow Pages</a>
</span>
</dt>
<dt>
<span class="sect2">
<a href="dbconfig.html#Locking">Locking</a>
</span>
</dt>
<dt>
<span class="sect2">
<a href="dbconfig.html#IOEfficiency">IO Efficiency</a>
</span>
</dt>
<dt>
<span class="sect2">
<a href="dbconfig.html#pagesizeAdvice">Page Sizing Advice</a>
</span>
</dt>
</dl>
</div>
<p>
Internally, DB stores database entries on pages. Page sizes are
important because they can affect your application's performance.
</p>
<p>
DB pages can be between 512 bytes and 64K bytes in size. The size
that you select must be a power of 2. You set your database's
page size using
<span><code class="methodname">DatabaseConfig.setPageSize()</code>.</span>
</p>
<p>
Note that a database's page size can only be selected at database
creation time.
</p>
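<p>
The size constraints above can be checked before a database is created. The following is a minimal sketch only: the class name <code class="classname">PageSizeCheck</code> and its method are hypothetical helpers, not part of the DB API, and the accepted value would still have to be passed to <code class="methodname">DatabaseConfig.setPageSize()</code> by your own code.
</p>

```java
// Hypothetical helper (not part of the DB API): checks a candidate page size
// against the documented limits of 512 bytes to 64K bytes, power of 2 only.
final class PageSizeCheck {
    static final int MIN_PAGE_SIZE = 512;
    static final int MAX_PAGE_SIZE = 64 * 1024;

    public static boolean isValidPageSize(int pageSize) {
        if (pageSize > MAX_PAGE_SIZE) {
            return false;
        }
        if (MIN_PAGE_SIZE > pageSize) {
            return false;
        }
        // A positive power of 2 has exactly one bit set.
        return Integer.bitCount(pageSize) == 1;
    }

    public static void main(String[] args) {
        for (int size : new int[] {256, 512, 1000, 4096, 65536, 131072}) {
            // A size that passes the check is one you could hand to
            // DatabaseConfig.setPageSize() when creating the database.
            System.out.println(size + " valid: " + isValidPageSize(size));
        }
    }
}
```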
<p>
When selecting a page size, you should consider the following issues:
</p>
<div class="itemizedlist">
<ul type="disc">
<li>
<p>
Overflow pages.
</p>
</li>
<li>
<p>
Locking.
</p>
</li>
<li>
<p>
Disk I/O.
</p>
</li>
</ul>
</div>
<p>
These topics are discussed next.
</p>
<div class="sect2" lang="en" xml:lang="en">
<div class="titlepage">
<div>
<div>
<h3 class="title"><a id="overflowpages"></a>Overflow Pages</h3>
</div>
</div>
</div>
<p>
Overflow pages are used to hold a key or data item
that cannot fit on a single page. You do not have to do anything to
cause overflow pages to be created, other than to store data that is
too large for your database's page size. Also, the only way you can
prevent overflow pages from being created is to be sure to select a
page size that is large enough to hold your database entries.
</p>
<p>
Because overflow pages exist outside of the normal database
structure, their use is expensive from a performance
perspective. If you select a page size that is too small, then your
database will be forced to use an excessive number of overflow
pages. This will significantly harm your application's performance.
</p>
<p>
For this reason, you want to select a page size that is at
least large enough to hold multiple entries given the expected
average size of your database entries. In BTree's case, for best
results select a page size that can hold at least 4 such entries.
</p>
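<p>
The "at least 4 entries per page" guideline can be turned into rough arithmetic. The sketch below is illustrative only: <code class="classname">MinPageSize</code> is a hypothetical helper, and it assumes that page header and per-entry overhead can be ignored, so the real capacity of a page is somewhat lower than the figure used here.
</p>

```java
// Hypothetical helper: estimates the smallest legal page size whose raw
// capacity holds a given number of average-sized entries.
// Assumption: page header and per-entry overhead are ignored.
final class MinPageSize {
    static final int MIN_PAGE_SIZE = 512;
    static final int MAX_PAGE_SIZE = 64 * 1024;

    public static int smallestPageFor(int avgEntryBytes, int entriesPerPage) {
        long needed = (long) avgEntryBytes * entriesPerPage;
        int pageSize = MIN_PAGE_SIZE;
        while (needed > pageSize) {
            if (pageSize >= MAX_PAGE_SIZE) {
                // Entries this large cannot fit even at the maximum page
                // size, so overflow pages should be expected.
                return MAX_PAGE_SIZE;
            }
            pageSize *= 2; // stay on powers of 2
        }
        return pageSize;
    }

    public static void main(String[] args) {
        for (int entryBytes : new int[] {100, 1000, 1025, 50000}) {
            System.out.println("average entry of " + entryBytes
                + " bytes: page size " + smallestPageFor(entryBytes, 4));
        }
    }
}
```

<p>
Treat the result as a lower bound. Use <code class="literal">db_stat</code> on a populated database to confirm that few overflow pages are actually in use.
</p>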
<p>
You can see how many overflow pages your database is using by
<span>
obtaining a <code class="classname">DatabaseStats</code> object using
the <code class="methodname">Database.getStats()</code> method,
</span>
or by examining your database using the
<code class="literal">db_stat</code> command line utility.
</p>
</div>
<div class="sect2" lang="en" xml:lang="en">
<div class="titlepage">
<div>
<div>
<h3 class="title"><a id="Locking"></a>Locking</h3>
</div>
</div>
</div>
<p>
Locking and multi-threaded access to DB databases are built into
the product. However, in order to enable the locking subsystem and
to provide efficient sharing of the cache between
databases, you must use an <span class="emphasis"><em>environment</em></span>.
Environments and multi-threaded access are not fully described
in this manual (see the <em class="citetitle">Berkeley DB Programmer's Reference Guide</em> for
information). However, we provide some information on sizing your
pages in a multi-threaded or multi-process environment in the interest
of providing a complete discussion of the topic.
</p>
<p>
If your application is multi-threaded, or if your databases are
accessed by more than one process at a time, then page size can
influence your application's performance. The reason is that
for most access methods (Queue is the exception), DB implements
page-level locking. This means that the finest locking granularity
is at the page, not at the record.
</p>
<p>
In most cases, database pages contain multiple database
records. Further, in order to provide safe access to multiple
threads or processes, DB performs locking on pages as entries on
those pages are read or written.
</p>
<p>
As the size of your page increases relative to the size of your
database entries, the number of entries that are held on any given
page also increases. The result is that the chance of two or more
readers and/or writers wanting to access entries on any given page
also increases.
</p>
<p>
When two or more threads and/or processes want to manage data on a
page, lock contention occurs. Lock contention is resolved by one
thread (or process) waiting for another thread to give up its lock.
It is this waiting activity that is harmful to your application's
performance.
</p>
<p>
It is possible to select a page size that is so large that your
application will spend excessive, and noticeable, amounts of time
resolving lock contention. Note that this scenario is particularly
likely to occur as the amount of concurrency built into your
application increases.
</p>
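<p>
To get a feel for how page size and concurrency interact, consider a toy model. It is only an illustration under strong assumptions: accesses are independent and uniformly random across pages, per-page overhead is ignored, and every pair of accesses to the same page is counted, even though concurrent readers do not necessarily block one another. The class <code class="classname">PageContentionModel</code> is hypothetical and does not measure real lock behavior.
</p>

```java
// Toy model (assumption-laden): estimates how often concurrent accesses
// land on the same page. It does not reflect DB's actual lock manager.
final class PageContentionModel {

    // Pages needed to hold entryCount entries, ignoring page overhead.
    public static long pageCount(long entryCount, int entryBytes, int pageSize) {
        long perPage = Math.max(1, pageSize / entryBytes);
        return (entryCount + perPage - 1) / perPage;
    }

    // Probability that at least two of `accessors` uniformly random accesses
    // touch the same page (the classic birthday-problem calculation).
    public static double samePageProbability(int accessors, long pages) {
        double allDistinct = 1.0;
        for (int i = accessors - 1; i > 0; i--) {
            allDistinct *= Math.max(0.0, 1.0 - (double) i / pages);
        }
        return 1.0 - allDistinct;
    }

    public static void main(String[] args) {
        long entries = 1_000_000L; // hypothetical database size
        int entryBytes = 100;      // hypothetical average entry size
        int accessors = 8;         // hypothetical number of concurrent threads
        for (int pageSize : new int[] {512, 4096, 65536}) {
            long pages = pageCount(entries, entryBytes, pageSize);
            System.out.println("page size " + pageSize + ": " + pages
                + " pages, same-page probability "
                + samePageProbability(accessors, pages));
        }
    }
}
```

<p>
In this model, larger pages mean fewer pages overall, so the same number of threads is more likely to meet on one page. Real workloads are rarely uniform, so use your environment's lock statistics to confirm whether contention is actually a problem.
</p>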
<p>
On the other hand, if you select a page size that is too small, then
that will only make your tree deeper, which can also cause
performance penalties. The trick, therefore, is to select a
reasonable page size (one that will hold a sizeable number of
records) and then reduce the page size if you notice lock
contention.
</p>
<p>
You can examine the number of lock conflicts and deadlocks occurring
in your application by examining your database environment lock
statistics. Either retrieve these statistics programmatically from your
environment, or use the <code class="literal">db_stat</code> command line utility.
The number of unavailable locks that your application waited for is
held in the lock statistic's <code class="literal">st_lock_wait</code> field.
</p>
</div>
<div class="sect2" lang="en" xml:lang="en">
<div class="titlepage">
<div>
<div>
<h3 class="title"><a id="IOEfficiency"></a>IO Efficiency</h3>
</div>
</div>
</div>
<p>
Page size can affect how efficient DB is at moving data to and
from disk. For some applications, especially those for which the
in-memory cache cannot be large enough to hold the entire working
dataset, IO efficiency can significantly impact application performance.
</p>
<p>
Most operating systems use an internal block size to determine how much
data to move to and from disk for a single I/O operation. This block
size is usually equal to the filesystem's block size. For optimal
disk I/O efficiency, you should select a database page size that is
equal to the operating system's I/O block size.
</p>
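<p>
One way to find the block size to match is to ask the Java runtime. The sketch below assumes Java 10 or later, where <code class="methodname">FileStore.getBlockSize()</code> is available. The class name <code class="classname">BlockSizeProbe</code> and the 4096-byte fallback are illustrative assumptions, and the value reported for a given directory should be checked against your platform's own tools.
</p>

```java
import java.io.IOException;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: reports the block size of the file store that holds
// a directory, for use as a candidate database page size.
final class BlockSizeProbe {
    // Assumption: used only when the platform cannot report a block size.
    static final long FALLBACK_BLOCK_SIZE = 4096;

    public static long blockSizeFor(Path dir) {
        try {
            FileStore store = Files.getFileStore(dir);
            long size = store.getBlockSize(); // requires Java 10 or later
            return size > 0 ? size : FALLBACK_BLOCK_SIZE;
        } catch (IOException | UnsupportedOperationException | SecurityException e) {
            return FALLBACK_BLOCK_SIZE;
        }
    }

    public static void main(String[] args) {
        // Pass the directory that will hold your database files.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println("Block size for " + dir.toAbsolutePath()
            + ": " + blockSizeFor(dir) + " bytes");
    }
}
```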
<p>
Essentially, DB performs data transfers based on the database
page size. That is, it moves data to and from disk a page at a time.
For this reason, if the page size does not match the I/O block size,
then the operating system can introduce inefficiencies in how it
responds to DB's I/O requests.
</p>
<p>
For example, suppose your page size is smaller than your operating
system block size. In this case, when DB writes a page to disk
it is writing just a portion of a logical filesystem page. Any time
any application writes just a portion of a logical filesystem page, the
operating system reads in the real filesystem page, overwrites
the portion of the page written by the application, then writes
the filesystem page back to disk. The net result is significantly
more disk I/O than if the application had simply selected a page
size that was equal to the underlying filesystem block size.
</p>
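<p>
The extra I/O in this example can be estimated with simple arithmetic. The following sketch uses a deliberately simplified cost model that is an assumption, not a description of any particular operating system: I/O happens in whole blocks, pages are aligned to block boundaries, and any block that is only partially covered by a page write must be read before it is written back. The class <code class="classname">IoAmplification</code> is hypothetical.
</p>

```java
// Simplified cost model (assumption): I/O is done in whole filesystem
// blocks, and a partially covered block is read before it is rewritten.
final class IoAmplification {

    // Approximate bytes moved between memory and disk to write one page.
    public static long bytesMovedForPageWrite(int pageSize, int blockSize) {
        long blocks = (pageSize + blockSize - 1L) / blockSize; // round up
        long covered = blocks * blockSize;
        if (pageSize % blockSize == 0) {
            // The page covers whole blocks, so a plain write is enough.
            return covered;
        }
        // Partial block: read the block in, then write it back out.
        return 2 * covered;
    }

    public static void main(String[] args) {
        int blockSize = 4096; // hypothetical filesystem block size
        for (int pageSize : new int[] {1024, 4096, 8192}) {
            System.out.println("page size " + pageSize + ", block size "
                + blockSize + ": about "
                + bytesMovedForPageWrite(pageSize, blockSize)
                + " bytes moved per page write");
        }
    }
}
```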
<p>
Alternatively, if you select a page size that is larger than the
underlying filesystem block size, then the operating system may have
to read more data than is necessary to fulfill a read request.
Further, on some operating systems, requesting a single database
page may result in the operating system reading enough filesystem
blocks to satisfy the operating system's criteria for read-ahead. In
this case, the operating system will be reading significantly more
data from disk than is actually required to fulfill DB's read
request.
</p>
<div class="note" style="margin-left: 0.5in; margin-right: 0.5in;">
<h3 class="title">Note</h3>
<p>
While transactions are not discussed in this manual, a page size
other than your filesystem's block size can affect transactional
guarantees. The reason is that page sizes larger than the
filesystem's block size cause DB to write pages in block
size increments. As a result, it is possible for a partial page
to be written as the result of a transactional commit. For more
information, see <a class="ulink" href="http://download.oracle.com/docs/cd/E17076_02/html/programmer_reference/transapp_reclimit.html" target="_top">http://download.oracle.com/docs/cd/E17076_02/html/programmer_reference/transapp_reclimit.html</a>.
</p>
</div>
</div>
<div class="sect2" lang="en" xml:lang="en">
<div class="titlepage">
<div>
<div>
<h3 class="title"><a id="pagesizeAdvice"></a>Page Sizing Advice</h3>
</div>
</div>
</div>
<p>
Page sizing can be confusing at first, so here are some general
guidelines that you can use to select your page size.
</p>
<p>
In general, and given no other considerations, a page size that is equal
to your filesystem block size is ideal.
</p>
<p>
If your data is designed such that 4 database entries cannot fit on a
single page (assuming BTree), then grow your page size to accommodate
your data. Once you've abandoned matching your filesystem's block
size, the general rule is that larger page sizes are better.
</p>
<p>
The exception to this rule is if you have a great deal of
concurrency occurring in your application. In this case, the closer
you can match your page size to the ideal size needed for your
application's data, the better. Doing so will allow you to avoid
unnecessary contention for page locks.
</p>
</div>
</div>
</div>
<div class="navfooter">
<hr />
<table width="100%" summary="Navigation footer">
<tr>
<td width="40%" align="left"><a accesskey="p" href="javaindexusage.html">Prev</a> </td>
<td width="20%" align="center">
<a accesskey="u" href="baseapi.html">Up</a>
</td>
<td width="40%" align="right"> <a accesskey="n" href="cachesize.html">Next</a></td>
</tr>
<tr>
<td width="40%" align="left" valign="top">Secondary Database Example </td>
<td width="20%" align="center">
<a accesskey="h" href="index.html">Home</a>
</td>
<td width="40%" align="right" valign="top"> Selecting the Cache Size</td>
</tr>
</table>
</div>
</body>
</html>