<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/xhtml;charset=UTF-8"/>
<meta http-equiv="X-UA-Compatible" content="IE=9"/>
<title>WiredTiger: Simulating workloads with wtperf</title>
<link href="tabs.css" rel="stylesheet" type="text/css"/>
<script type="text/javascript" src="jquery.js"></script>
<script type="text/javascript" src="dynsections.js"></script>
<link href="navtree.css" rel="stylesheet" type="text/css"/>
<script type="text/javascript" src="resize.js"></script>
<script type="text/javascript" src="navtreedata.js"></script>
<script type="text/javascript" src="navtree.js"></script>
<script type="text/javascript">
  $(document).ready(initResizable);
</script>
<link href="doxygen.css" rel="stylesheet" type="text/css" />
<link href="wiredtiger.css" rel="stylesheet" type="text/css"/>
</head>
<body>
<div id="top"><!-- do not remove this div, it is closed by doxygen! -->
<div id="titlearea">
<table cellspacing="0" cellpadding="0">
 <tbody>
 <tr style="height: 56px;">
  <td id="projectlogo"><a href="http://wiredtiger.com/"><img src="LogoFinal-header.png" alt="WiredTiger" /></a></td>
  <td style="padding-left: 0.5em;">
   <div id="projectname">
   &#160;<span id="projectnumber">Version 3.2.1</span>
   </div>
   <div id="projectbrief"><!-- 3.2.1 --></div>
  </td>
 </tr>
 </tbody>
</table>
</div>
<div class="banner">
  <a href="https://github.com/wiredtiger/wiredtiger">Fork me on GitHub</a>
  <a class="last" href="http://groups.google.com/group/wiredtiger-users">Join my user group</a>
</div>
<!-- end header part -->
<!-- Generated by Doxygen 1.8.13 -->
<script type="text/javascript" src="menudata.js"></script>
<script type="text/javascript" src="menu.js"></script>
<script type="text/javascript">
$(function() {
  initMenu('',false,false,'search.php','Search');
});
</script>
<div id="main-nav"></div>
</div><!-- top -->
<div id="side-nav" class="ui-resizable side-nav-resizable">
  <div id="nav-tree">
    <div id="nav-tree-contents">
      <div id="nav-sync" class="sync"></div>
    </div>
  </div>
  <div id="splitbar" style="-moz-user-select:none;" 
       class="ui-resizable-handle">
  </div>
</div>
<script type="text/javascript">
$(document).ready(function(){initNavTree('wtperf.html','');});
</script>
<div id="doc-content">
<div class="header">
  <div class="headertitle">
<div class="title">Simulating workloads with wtperf </div>  </div>
</div><!--header-->
<div class="contents">
<div class="textblock"><p>The WiredTiger distribution includes a tool that can be used to simulate workloads in WiredTiger, in the directory <code>bench/wtperf</code>.</p>
<p>The <code>wtperf</code> utility generally has two phases: a populate phase, which creates a database and populates an object in that database, and a workload phase, which performs some set of operations on the object.</p>
<p>For example, the following configuration uses a single thread to populate a file object with 500,000 records in a 500MB cache. The workload phase consists of 8 threads running for two minutes, all reading from the file.</p>
<div class="fragment"><div class="line">conn_config=<span class="stringliteral">&quot;cache_size=500MB&quot;</span></div><div class="line">table_config=<span class="stringliteral">&quot;type=file&quot;</span></div><div class="line">icount=500000</div><div class="line">run_time=120</div><div class="line">populate_threads=1</div><div class="line">threads=((count=8,reads=1))</div></div><!-- fragment --><p>In most cases, where the workload is the only interesting phase, the populate phase can be performed once and the workload phase run repeatedly (for more information, see the wtperf <code>create</code> configuration variable).</p>
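<p>As a rough illustration of the populate-once, run-many pattern described above, the following Python sketch renders two variants of the example configuration: one that populates and runs, and one that sets <code>create=false</code> to reuse the existing database. The helper name <code>render_wtperf_config</code> and the <code>QUOTED_OPTIONS</code> tuple are hypothetical and are not part of <code>wtperf</code>; the sketch only assumes the <code>name=value</code> file layout shown above.</p>

```python
# Hypothetical helper, not part of wtperf: renders options as name=value lines.
# Assumption: only conn_config and table_config need quoting, because their
# values are passed through to WiredTiger rather than read by wtperf itself.
QUOTED_OPTIONS = ("conn_config", "table_config")


def render_wtperf_config(options):
    """Return the text of a wtperf-style configuration file."""
    lines = []
    for name, value in options.items():
        if isinstance(value, bool):
            # The option list documents booleans as true/false.
            value = "true" if value else "false"
        elif name in QUOTED_OPTIONS:
            value = f'"{value}"'
        lines.append(f"{name}={value}")
    return "\n".join(lines) + "\n"


base = {
    "conn_config": "cache_size=500MB",
    "table_config": "type=file",
    "icount": 500000,
    "run_time": 120,
    "populate_threads": 1,
    "threads": "((count=8,reads=1))",
}

# First run: populate the database, then run the workload.
populate_and_run = render_wtperf_config(base)

# Later runs: skip the populate phase and reuse the existing database.
rerun_only = render_wtperf_config({**base, "create": False})

print(rerun_only)
```

<p>Writing each string to its own file and passing it with <code>-O</code> would be one way to use the result; whether a given workload can safely be rerun against an already populated database depends on the workload.</p>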
<p>The <code>conn_config</code> configuration supports setting any WiredTiger connection configuration value. This is commonly used to configure statistics with regular reports, to obtain more information from the run:</p>
<div class="fragment"><div class="line">conn_config=<span class="stringliteral">&quot;cache_size=20G,statistics=(fast,clear),statistics_log=(wait=600)&quot;</span></div><div class="line">report_interval=5</div></div><!-- fragment --><p>Note that quoting must be used when passing values to WiredTiger configuration, as opposed to configuring the <code>wtperf</code> utility itself.</p>
<p>The <code>table_config</code> configuration supports setting any WiredTiger object creation configuration value. For example, the above test can be converted to use an LSM store instead of a B+tree store, with additional LSM configuration, by changing <code>table_config</code> to:</p>
<div class="fragment"><div class="line">table_config=<span class="stringliteral">&quot;lsm=(chunk_size=5MB),type=lsm,os_cache_dirty_max=16MB&quot;</span></div></div><!-- fragment --><p>More complex workloads can be configured by creating more threads doing inserts and updates as well as reads. For example, to configure two inserting threads and two threads doing a mixture of inserts, reads and updates:</p>
<div class="fragment"><div class="line">threads=((count=2,inserts=1),(count=2,inserts=1,reads=1,updates=1))</div></div><!-- fragment --><p>Example <code>wtperf</code> configuration files can be found in the <code>bench/wtperf/runners/</code> directory.</p>
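<p>To make the ratios in a <code>threads</code> value concrete, the following Python sketch splits the value into its workload groups and converts the <code>inserts</code>, <code>reads</code> and <code>updates</code> ratios into percentages. The function names are hypothetical, the parser is a simplified stand-in for <code>wtperf</code>'s own configuration parsing, and it assumes every entry in a group is an integer.</p>

```python
import re

# Operation ratios considered by this sketch; other keys (count, throttle, ...)
# are parsed but do not contribute to the mix.
OPS = ("inserts", "reads", "updates")


def parse_thread_groups(threads):
    """Split a threads value into one dict per parenthesized workload group."""
    groups = []
    # Match only the innermost parentheses, so the outer pair is ignored and
    # groups may be written with or without a comma between them.
    for body in re.findall(r"\(([^()]+)\)", threads):
        group = {}
        for item in body.split(","):
            key, _, value = item.partition("=")
            group[key.strip()] = int(value)
        groups.append(group)
    return groups


def operation_mix(group):
    """Return the percentage of each configured operation type in a group."""
    total = sum(group.get(op, 0) for op in OPS)
    if total == 0:
        return {}
    return {op: 100.0 * group[op] / total for op in OPS if group.get(op, 0)}


value = "threads=((count=2,inserts=1),(count=2,inserts=1,reads=1,updates=1))"
for group in parse_thread_groups(value):
    print(group["count"], "threads:", operation_mix(group))
```

<p>For the example above, the first group is reported as insert-only, and the second group divides its operations equally among inserts, reads and updates.</p>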
<p>There are also a number of command line arguments that can be passed to <code>wtperf</code>: </p><dl class="section user"><dt>-C config</dt><dd>Specify configuration strings for the <a class="el" href="group__wt.html#gacbe8d118f978f5bfc8ccb4c77c9e8813" title="Open a connection to a database. ">wiredtiger_open</a> function. This argument is additive to the <code>conn_config</code> parameter in the configuration file. </dd></dl>
<dl class="section user"><dt>-h directory</dt><dd>Specify a database home directory. The default is <code>./WT_TEST</code>. </dd></dl>
<dl class="section user"><dt>-m monitor_directory</dt><dd>Specify a directory for all monitoring related files. The default is the database home directory. </dd></dl>
<dl class="section user"><dt>-O config_file</dt><dd>Specify the configuration file to run. </dd></dl>
<dl class="section user"><dt>-o config</dt><dd>Specify configuration strings for the <code>wtperf</code> program. This argument will override settings in the configuration file. </dd></dl>
<dl class="section user"><dt>-T config</dt><dd>Specify configuration strings for the <a class="el" href="struct_w_t___s_e_s_s_i_o_n.html#a358ca4141d59c345f401c58501276bbb" title="Create a table, column group, index or file. ">WT_SESSION::create</a> function. This argument is additive to the <code>table_config</code> parameter in the configuration file.</dd></dl>
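<p>The arguments above can be assembled programmatically when <code>wtperf</code> is driven from a script. The following Python sketch builds an argument list from the documented flags only; the function name <code>wtperf_command</code>, its parameter names and the default binary path <code>bench/wtperf/wtperf</code> (relative to a WiredTiger build tree) are assumptions for illustration. Each value is passed through verbatim, and nothing is executed.</p>

```python
import shlex


def wtperf_command(config_file, home=None, monitor_dir=None, conn_extra=None,
                   table_extra=None, override=None,
                   binary="bench/wtperf/wtperf"):
    """Build an argv list for wtperf using only the documented flags."""
    argv = [binary]
    if home:
        argv += ["-h", home]          # database home directory
    if monitor_dir:
        argv += ["-m", monitor_dir]   # directory for monitoring files
    if conn_extra:
        argv += ["-C", conn_extra]    # added to conn_config
    if table_extra:
        argv += ["-T", table_extra]   # added to table_config
    if override:
        argv += ["-o", override]      # overrides the configuration file
    argv += ["-O", config_file]       # configuration file to run
    return argv


argv = wtperf_command(
    "bench/wtperf/runners/medium-btree.wtperf",
    home="WTPERF_RUN",
    override="sample_interval=5",
)

# Print a shell-quoted form of the command; pass argv to subprocess.run()
# instead if the binary is actually available.
print(shlex.join(argv))
```

<p>Building a list rather than a single string avoids shell quoting problems when a configuration string contains parentheses or commas.</p>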
<h1><a class="anchor" id="monitor"></a>
Monitoring wtperf</h1>
<p>Like all WiredTiger applications, the <code>wtperf</code> command can be configured with statistics logging.</p>
<p>In addition to statistics logging, <code>wtperf</code> can monitor performance and operation latency times. Monitoring is enabled using the <code>sample_interval</code> configuration. For example, to record information every 10 seconds, set the following on the command line or add it to the <code>wtperf</code> configuration file:</p>
<div class="fragment"><div class="line">sample_interval=10</div></div><!-- fragment --><p>Enabling monitoring causes <code>wtperf</code> to create a file <code>monitor</code> in the database home directory (or another directory as specified using the <code>-m</code> option to <code>wtperf</code>).</p>
<p>The following example shows how to run the <code>medium-btree.wtperf</code> configuration with monitoring enabled, producing the monitor data needed to generate a graph.</p>
<div class="fragment"><div class="line"><span class="preprocessor"># Change into the WiredTiger directory.</span></div><div class="line">cd wiredtiger</div><div class="line"></div><div class="line"><span class="preprocessor"># Configure and build WiredTiger if not already built.</span></div><div class="line">./configure &amp;&amp; make</div><div class="line"></div><div class="line"><span class="preprocessor"># Remove and re-create the run directory.</span></div><div class="line">rm -rf WTPERF_RUN &amp;&amp; mkdir WTPERF_RUN</div><div class="line"></div><div class="line"><span class="preprocessor"># Run the medium-btree.wtperf workload, sampling performance every 5 seconds.</span></div><div class="line">bench/wtperf/wtperf \</div><div class="line">    -h WTPERF_RUN \</div><div class="line">    -o sample_interval=5 \</div><div class="line">    -O bench/wtperf/runners/medium-btree.wtperf</div></div><!-- fragment --><h1><a class="anchor" id="config"></a>
Wtperf configuration options</h1>
<p>The following is a list of the currently available <code>wtperf</code> configuration options:</p>
<dl class="section user"><dt>async_threads (unsigned int, default=0)</dt><dd>number of async worker threads </dd></dl>
<dl class="section user"><dt>checkpoint_interval (unsigned int, default=120)</dt><dd>checkpoint every interval seconds during the workload phase. </dd></dl>
<dl class="section user"><dt>checkpoint_stress_rate (unsigned int, default=0)</dt><dd>checkpoint every rate operations during the populate phase in the populate thread(s), 0 to disable </dd></dl>
<dl class="section user"><dt>checkpoint_threads (unsigned int, default=0)</dt><dd>number of checkpoint threads </dd></dl>
<dl class="section user"><dt>conn_config (string, default="create,statistics=(fast),statistics_log=(json,wait=1)")</dt><dd>connection configuration string </dd></dl>
<dl class="section user"><dt>close_conn (boolean, default=true)</dt><dd>properly close connection at end of test. Setting to false does not sync data to disk and can result in lost data after test exits. </dd></dl>
<dl class="section user"><dt>compact (boolean, default=false)</dt><dd>post-populate compact for LSM merging activity </dd></dl>
<dl class="section user"><dt>compression (string, default="none")</dt><dd>compression extension. Allowed configuration values are: 'none', 'lz4', 'snappy', 'zlib', 'zstd' </dd></dl>
<dl class="section user"><dt>create (boolean, default=true)</dt><dd>do population phase; false to use existing database </dd></dl>
<dl class="section user"><dt>database_count (unsigned int, default=1)</dt><dd>number of WiredTiger databases to use. Each database will execute the workload using a separate home directory and complete set of worker threads </dd></dl>
<dl class="section user"><dt>drop_tables (boolean, default=false)</dt><dd>Whether to drop all tables at the end of the run, and report time taken to do the drop. </dd></dl>
<dl class="section user"><dt>in_memory (boolean, default=false)</dt><dd>Whether to create the database in-memory. </dd></dl>
<dl class="section user"><dt>icount (unsigned int, default=5000)</dt><dd>number of records to initially populate. If multiple tables are configured the count is spread evenly across all tables. </dd></dl>
<dl class="section user"><dt>idle_table_cycle (unsigned int, default=0)</dt><dd>Enable regular create and drop of idle tables, value is the maximum number of seconds a create or drop is allowed before flagging an error. Default 0 which means disabled. </dd></dl>
<dl class="section user"><dt>index (boolean, default=false)</dt><dd>Whether to create an index on the value field. </dd></dl>
<dl class="section user"><dt>insert_rmw (boolean, default=false)</dt><dd>execute a read prior to each insert in workload phase </dd></dl>
<dl class="section user"><dt>key_sz (unsigned int, default=20)</dt><dd>key size </dd></dl>
<dl class="section user"><dt>log_partial (boolean, default=false)</dt><dd>perform partial logging on first table only. </dd></dl>
<dl class="section user"><dt>log_like_table (boolean, default=false)</dt><dd>Append all modification operations to another shared table. </dd></dl>
<dl class="section user"><dt>min_throughput (unsigned int, default=0)</dt><dd>notify if any throughput measured is less than this amount. Aborts or prints warning based on min_throughput_fatal setting. Requires sample_interval to be configured </dd></dl>
<dl class="section user"><dt>min_throughput_fatal (boolean, default=false)</dt><dd>print warning (false) or abort (true) on min_throughput failure. </dd></dl>
<dl class="section user"><dt>max_latency (unsigned int, default=0)</dt><dd>notify if any latency measured exceeds this number of milliseconds. Aborts or prints warning based on max_latency_fatal setting. Requires sample_interval to be configured </dd></dl>
<dl class="section user"><dt>max_latency_fatal (boolean, default=false)</dt><dd>print warning (false) or abort (true) on max_latency failure. </dd></dl>
<dl class="section user"><dt>pareto (unsigned int, default=0)</dt><dd>use pareto distribution for random numbers. Zero to disable, otherwise a percentage indicating how aggressive the distribution should be. </dd></dl>
<dl class="section user"><dt>populate_ops_per_txn (unsigned int, default=0)</dt><dd>number of operations to group into each transaction in the populate phase, zero for auto-commit </dd></dl>
<dl class="section user"><dt>populate_threads (unsigned int, default=1)</dt><dd>number of populate threads, 1 for bulk load </dd></dl>
<dl class="section user"><dt>pre_load_data (boolean, default=false)</dt><dd>Scan all data prior to starting the workload phase to warm the cache </dd></dl>
<dl class="section user"><dt>random_range (unsigned int, default=0)</dt><dd>if non zero choose a value from within this range as the key for insert operations </dd></dl>
<dl class="section user"><dt>random_value (boolean, default=false)</dt><dd>generate random content for the value </dd></dl>
<dl class="section user"><dt>range_partition (boolean, default=false)</dt><dd>partition data by range (vs hash) </dd></dl>
<dl class="section user"><dt>readonly (boolean, default=false)</dt><dd>reopen the connection between populate and workload phases in readonly mode. Requires reopen_connection turned on (default). Requires that read be the only workload specified </dd></dl>
<dl class="section user"><dt>reopen_connection (boolean, default=true)</dt><dd>close and reopen the connection between populate and workload phases </dd></dl>
<dl class="section user"><dt>report_interval (unsigned int, default=2)</dt><dd>output throughput information every interval seconds, 0 to disable </dd></dl>
<dl class="section user"><dt>run_ops (unsigned int, default=0)</dt><dd>total read, insert and update workload operations </dd></dl>
<dl class="section user"><dt>run_time (unsigned int, default=0)</dt><dd>total workload seconds </dd></dl>
<dl class="section user"><dt>sample_interval (unsigned int, default=0)</dt><dd>performance logging every interval seconds, 0 to disable </dd></dl>
<dl class="section user"><dt>sample_rate (unsigned int, default=50)</dt><dd>how often the latency of operations is measured. One for every operation, two for every second operation, three for every third operation etc. </dd></dl>
<dl class="section user"><dt>scan_icount (unsigned int, default=0)</dt><dd>number of records in scan tables to populate </dd></dl>
<dl class="section user"><dt>scan_interval (unsigned int, default=0)</dt><dd>scan tables every interval seconds during the workload phase, 0 to disable </dd></dl>
<dl class="section user"><dt>scan_pct (unsigned int, default=10)</dt><dd>percentage of entire data set scanned, if scan_interval is enabled </dd></dl>
<dl class="section user"><dt>scan_table_count (unsigned int, default=0)</dt><dd>number of separate tables to be used for scanning. Zero indicates that tables are shared with other operations </dd></dl>
<dl class="section user"><dt>sess_config (string, default="")</dt><dd>session configuration string </dd></dl>
<dl class="section user"><dt>session_count_idle (unsigned int, default=0)</dt><dd>number of idle sessions to create. Default 0. </dd></dl>
<dl class="section user"><dt>table_config (string, default="key_format=S,value_format=S,type=lsm,exclusive=true, allocation_size=4kb,internal_page_max=64kb,leaf_page_max=4kb, split_pct=100")</dt><dd>table configuration string </dd></dl>
<dl class="section user"><dt>table_count (unsigned int, default=1)</dt><dd>number of tables to run operations over. Keys are divided evenly over the tables. Cursors are held open on all tables. Default 1, maximum 99999. </dd></dl>
<dl class="section user"><dt>table_count_idle (unsigned int, default=0)</dt><dd>number of tables to create, that won't be populated. Default 0. </dd></dl>
<dl class="section user"><dt>threads (string, default="")</dt><dd>workload configuration: each 'count' entry is the total number of threads, and the 'inserts', 'reads' and 'updates' entries are the ratios of insert, read and update operations done by each worker thread. If a 'throttle' value is provided, each thread will do a maximum of that number of operations per second. Multiple workload configurations may be specified per threads configuration; for example, a more complex threads configuration might be 'threads=((count=2,reads=1)(count=8,reads=1,inserts=2,updates=1))', which would create 2 threads doing nothing but reads and 8 threads each doing 50% inserts, 25% reads and 25% updates. Allowed configuration values are 'count', 'throttle', 'update_delta', 'reads', 'read_range', 'inserts', 'updates', 'truncate', 'truncate_pct' and 'truncate_count'. There are also behavior modifiers; the supported modifier is 'ops_per_txn'. </dd></dl>
<dl class="section user"><dt>transaction_config (string, default="")</dt><dd><a class="el" href="struct_w_t___s_e_s_s_i_o_n.html#a7e26b16b26b5870498752322fad790bf" title="Start a transaction in this session. ">WT_SESSION.begin_transaction</a> configuration string, applied during the populate phase when populate_ops_per_txn is nonzero </dd></dl>
<dl class="section user"><dt>table_name (string, default="test")</dt><dd>table name </dd></dl>
<dl class="section user"><dt>truncate_single_ops (boolean, default=false)</dt><dd>Implement truncate via cursor remove instead of session API </dd></dl>
<dl class="section user"><dt>value_sz_max (unsigned int, default=1000)</dt><dd>maximum value size when delta updates are present. Default disabled </dd></dl>
<dl class="section user"><dt>value_sz_min (unsigned int, default=1)</dt><dd>minimum value size when delta updates are present. Default disabled </dd></dl>
<dl class="section user"><dt>value_sz (unsigned int, default=100)</dt><dd>value size </dd></dl>
<dl class="section user"><dt>verbose (unsigned int, default=1)</dt><dd>verbosity </dd></dl>
<dl class="section user"><dt>warmup (unsigned int, default=0)</dt><dd>How long to run the workload phase before starting measurements </dd></dl>
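<p>Several options in the list above state requirements on other options. As a sketch, the following Python function checks a proposed set of options against only the requirements written in this list: <code>min_throughput</code> and <code>max_latency</code> need <code>sample_interval</code>, <code>readonly</code> needs <code>reopen_connection</code> and a read-only workload, and <code>table_count</code> has a maximum of 99999. The function name and message wording are hypothetical, and the checks <code>wtperf</code> itself performs may differ.</p>

```python
def check_wtperf_options(opts, thread_groups=()):
    """Return a list of documented requirements that the options violate.

    opts maps option names to values; missing options take the defaults
    given in the option list. thread_groups is a sequence of dicts such as
    {"count": 8, "reads": 1}, one per workload group.
    """
    problems = []
    sampling = opts.get("sample_interval", 0) > 0

    # Both thresholds are evaluated on samples, so sampling must be enabled.
    if opts.get("min_throughput", 0) > 0 and not sampling:
        problems.append("min_throughput requires sample_interval")
    if opts.get("max_latency", 0) > 0 and not sampling:
        problems.append("max_latency requires sample_interval")

    if opts.get("readonly", False):
        if not opts.get("reopen_connection", True):
            problems.append("readonly requires reopen_connection")
        # Assumption: inserts, updates and truncate all count as non-read work.
        writers = ("inserts", "updates", "truncate")
        if any(group.get(op, 0) for group in thread_groups for op in writers):
            problems.append("readonly requires a read-only workload")

    if opts.get("table_count", 1) > 99999:
        problems.append("table_count exceeds the maximum of 99999")

    return problems


print(check_wtperf_options({"min_throughput": 1000}))
print(check_wtperf_options({"readonly": True}, [{"count": 8, "reads": 1}]))
```

<p>An empty list means that none of the requirements covered by the sketch were violated; it does not guarantee that <code>wtperf</code> will accept the configuration.</p>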
</div></div><!-- contents -->
</div><!-- doc-content -->
<!-- start footer part -->
<div id="nav-path" class="navpath"><!-- id is needed for treeview function! -->
  <ul>
    <li class="navelem"><a class="el" href="index.html">Reference Guide</a></li><li class="navelem"><a class="el" href="programming_lang_java.html">Writing WiredTiger applications  in Java</a></li>
    <li class="footer">Copyright (c) 2008-2019 MongoDB, Inc.  All rights reserved.  Contact <a href="mailto:info@wiredtiger.com">info@wiredtiger.com</a> for more information.</li>
  </ul>
</div>
</body>
</html>