<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Cached Checks</title>
<meta name="generator" content="DocBook XSL Stylesheets V1.75.1">
<meta name="keywords" content="Supervision, Icinga, Nagios, Linux">
<link rel="home" href="index.html" title="Icinga Version 1.0.2 Documentation">
<link rel="up" href="ch06.html" title="Chapter 6. Advanced Topics">
<link rel="prev" href="dependencychecks.html" title="Predictive Dependency Checks">
<link rel="next" href="passivestatetranslation.html" title="Passive Host State Translation">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<CENTER><IMG src="../images/logofullsize.png" border="0" alt="Icinga" title="Icinga"></CENTER>
<div class="navheader">
<table width="100%" summary="Navigation header">
<tr><th colspan="3" align="center">Cached Checks</th></tr>
<tr>
<td width="20%" align="left">
<a accesskey="p" href="dependencychecks.html">Prev</a> </td>
<th width="60%" align="center">Chapter 6. Advanced Topics</th>
<td width="20%" align="right"> <a accesskey="n" href="passivestatetranslation.html">Next</a>
</td>
</tr>
</table>
<hr>
</div>
<div class="section" title="Cached Checks">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="cachedchecks"></a><a name="cached_checks"></a>Cached Checks</h2></div></div></div>
<p><span class="bold"><strong>Introduction</strong></span></p>
<p><span class="inlinemediaobject"><img src="../images/cachedchecks1.png"></span></p>
<p>The performance of Icinga's monitoring logic can be significantly improved by using cached checks.
Cached checks allow Icinga to forgo executing a host or service check command if it determines a relatively recent check
result will do instead.</p>
<p><span class="bold"><strong>For On-Demand Checks Only</strong></span></p>
<p>Regularly scheduled host and service checks will not see a performance improvement with use of cached checks. Cached
checks are only useful for improving the performance of on-demand host and service checks. Scheduled checks still matter,
though: they ensure that host and service states are updated regularly, which makes it more likely that a recent result is
available to serve as a cached check result later on.</p>
<p>For reference, on-demand host checks occur...</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem">
<p>When a service associated with the host changes state.</p>
</li>
<li class="listitem">
<p>As needed as part of the <a class="link" href="networkreachability.html" title="Determining Status and Reachability of Network Hosts">host reachability</a> logic.</p>
</li>
<li class="listitem">
<p>As needed for <a class="link" href="dependencychecks.html" title="Predictive Dependency Checks">predictive host dependency checks</a>.</p>
</li>
</ul></div>
<p>And on-demand service checks occur...</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem">
<p>As needed for <a class="link" href="dependencychecks.html" title="Predictive Dependency Checks">predictive service dependency checks</a>.</p>
</li></ul></div>
<div class="note" title="Note" style="margin-left: 0.5in; margin-right: 0.5in;"><table border="0" summary="Note">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../images/note.png"></td>
<th align="left">Note</th>
</tr>
<tr><td align="left" valign="top">
<p>Unless you make use of service dependencies, Icinga will not be able to use cached check results to improve the
performance of service checks. Don't worry about that - it's normal. Cached host checks are where the big performance
improvements lie, and everyone should see a benefit there.</p>
</td></tr>
</table></div>
<p><span class="bold"><strong>How Caching Works</strong></span></p>
<p><span class="inlinemediaobject"><img src="../images/cachedchecks.png"></span></p>
<p>When Icinga needs to perform an on-demand host or service check, it determines whether it can
use a cached check result or whether it needs to perform an actual check by executing a plugin. It does this by checking to see if
the last check of the host or service occurred within the last X seconds, where X is the cached host or service check
horizon.</p>
<p>If the last check was performed within the timeframe specified by the cached check horizon variable, Icinga will
use the result of the last host or service check and will <span class="emphasis"><em>not</em></span> execute a new check. If the host or service
has not yet been checked, or if the last check falls outside of the cached check horizon timeframe, Icinga will execute a
new host or service check by running a plugin.</p>
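<p>As a worked illustration of the rule above (the times are hypothetical and chosen only for this example), assume a cached
host check horizon of 15 seconds and a host whose last check completed at 12:00:00:</p>
<pre class="programlisting">
on-demand check requested at 12:00:10 : last result is 10 seconds old, inside the horizon  : cached result is used
on-demand check requested at 12:00:20 : last result is 20 seconds old, outside the horizon : plugin is executed
</pre>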
<p><span class="bold"><strong>What This Really Means</strong></span></p>
<p>Icinga performs on-demand checks because it needs to know the current state of a host or service <span class="emphasis"><em>at that
exact moment</em></span> in time. Utilizing cached checks allows you to make Icinga think that recent check results are
"good enough" for determining the current state of hosts, and that it doesn't need to go out and actually re-check the status of
that host or service.</p>
<p>The cached check horizon tells Icinga how recent check results must be in order to reliably reflect the current
state of a host or service. For example, with a cached check horizon of 30 seconds, you are telling Icinga that if a
host's state was checked sometime in the last 30 seconds, the result of that check should still be considered the current state
of the host.</p>
<p>The number of cached check results that Icinga can use versus the number of on-demand checks it has to actually
execute can be considered the cached check "hit" rate. By increasing the cached check horizon to equal the regular check
interval of a host, you could theoretically achieve a cache hit rate of 100%. In that case all on-demand checks of that host
would use cached check results. What a performance improvement! But is it really? Probably not.</p>
<p>The reliability of cached check result information decreases over time. Higher cache hit rates require that previous check
results are considered "valid" for longer periods of time. Things can change quickly in any network scenario, and there's no
guarantee that a server that was functioning properly 30 seconds ago isn't on fire right now. There's the tradeoff - reliability
versus speed. If you have a large cached check horizon, you risk having unreliable check result values being used in the
monitoring logic.</p>
<p>Icinga will eventually determine the correct state of all hosts and services, so even if cached check results turn out
to misrepresent the true state, Icinga will only work with incorrect information for a short period of time.
Even so, short periods of unreliable status information can be a nuisance for admins, as they may receive notifications
about problems which no longer exist.</p>
<p>There is no standard cached check horizon or cache hit rate that will be acceptable to all Icinga users. Some
people will want a short horizon timeframe and a low cache hit rate, while others will want a larger horizon timeframe and a
larger cache hit rate (with a low reliability rate). Some users may even want to disable cached checks altogether to obtain a
100% reliability rate. Testing different horizon timeframes, and their effect on the reliability of status information, is the
only way that an individual user will find the "right" value for their situation. More information on this is discussed
below.</p>
<p><span class="bold"><strong>Configuration Variables</strong></span></p>
<p>The following variables determine the timeframes in which a previous host or service check result may be used as a cached
host or service check result:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem">
<p>The <a class="link" href="configmain.html#configmain-cached_host_check_horizon">cached_host_check_horizon</a> variable controls cached
host checks.</p>
</li>
<li class="listitem">
<p>The <a class="link" href="configmain.html#configmain-cached_service_check_horizon">cached_service_check_horizon</a> variable controls
cached service checks.</p>
</li>
</ul></div>
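<p>A minimal sketch of how these two variables might appear in the main configuration file. The values are illustrative only,
not recommendations; both horizons are given in seconds, and according to the main configuration file documentation a value of
0 disables caching for that check type:</p>
<pre class="programlisting">
# Illustrative values only; tune them for your own environment.
# Reuse a host check result that is at most 15 seconds old.
cached_host_check_horizon=15

# Reuse a service check result that is at most 15 seconds old.
cached_service_check_horizon=15
</pre>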
<p><span class="bold"><strong>Optimizing Cache Effectiveness</strong></span></p>
<p>In order to make the most effective use of cached checks, you should:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem">
<p>Schedule regular checks of your hosts</p>
</li>
<li class="listitem">
<p>Use MRTG to graph statistics for 1) on-demand checks and 2) cached checks</p>
</li>
<li class="listitem">
<p>Adjust cached check horizon variables to fit your needs</p>
</li>
</ul></div>
<p>You can schedule regular checks of your hosts by specifying a value greater than 0 for the <span class="emphasis"><em>check_interval</em></span>
option in your <a class="link" href="objectdefinitions.html#objectdefinitions-host">host definitions</a>. If you do this, make sure that you set the
<span class="emphasis"><em>max_check_attempts</em></span> option to a value greater than 1, or it will cause a big performance hit. This potential
performance hit is described in detail <a class="link" href="hostchecks.html" title="Host Checks">here</a>.</p>
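<p>The two options discussed above could look like this in a host definition. This is only a fragment under stated
assumptions: the host name, the address and the <span class="emphasis"><em>generic-host</em></span> template are hypothetical,
and the template is assumed to supply the remaining required directives:</p>
<pre class="programlisting">
define host{
        use                     generic-host    ; hypothetical template providing the other required directives
        host_name               example-host    ; hypothetical host name
        address                 192.0.2.10      ; hypothetical address (documentation range)
        check_interval          5               ; greater than 0, so the host is checked on a regular schedule
        max_check_attempts      3               ; greater than 1, to avoid the performance hit described above
        }
</pre>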
<p><span class="inlinemediaobject"><img src="../images/cachedcheckgraphs.png"></span></p>
<p>A good way to determine the proper value for the cached check horizon options is to compare how many on-demand checks
Icinga has to actually run versus how many it can use cached values for. The <a class="link" href="icingastats.html" title="Using The Icingastats Utility">icingastats</a> utility can produce information on cached checks, which can then be <a class="link" href="mrtggraphs.html" title="Graphing Performance Info With MRTG">graphed with MRTG</a>. Example MRTG graphs that show cached vs. actual on-demand checks are shown
above.</p>
<p>The monitoring installation which produced the graphs above had:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem">
<p>A total of 44 hosts, all of which were checked at regular intervals</p>
</li>
<li class="listitem">
<p>An average (regularly scheduled) host check interval of 5 minutes</p>
</li>
<li class="listitem">
<p>A <a class="link" href="configmain.html#configmain-cached_host_check_horizon">cached_host_check_horizon</a> of 15 seconds</p>
</li>
</ul></div>
<p>The first MRTG graph shows how many regularly scheduled host checks have occurred compared to how many cached host checks.
In this example, an average of 53 host checks occur every five minutes. Nine of these (17%) are on-demand checks.</p>
<p>The second MRTG graph shows how many cached host checks have occurred over time. In this example an average of 2 cached
host checks occurs every five minutes.</p>
<p>Remember, cached checks are only available for on-demand checks. Based on the 5 minute averages from the graphs, we see
that Icinga is able to use cached host check results 2 out of every 9 times an on-demand check has to be run. That may
not seem like much, but these graphs represent a small monitoring environment. Consider that 2 out of 9 is 22% and you can start to
see how this could significantly help improve host check performance in large environments. That percentage could be higher if
the cached host check horizon variable value was increased, but that would reduce the reliability of the cached host state
information.</p>
<p>Once you've collected a few hours' or days' worth of MRTG graphs, you should see how many host and service checks were done by
executing plugins versus those that used cached check results. Use that information to adjust the cached check horizon variables
appropriately for your situation. Continue to monitor the MRTG graphs over time to see how changing the horizon variables
affected cached check statistics. Rinse and repeat as necessary.</p>
<a class="indexterm" name="id1999427"></a>
</div>
<div class="navfooter">
<hr>
<table width="100%" summary="Navigation footer">
<tr>
<td width="40%" align="left">
<a accesskey="p" href="dependencychecks.html">Prev</a> </td>
<td width="20%" align="center"><a accesskey="u" href="ch06.html">Up</a></td>
<td width="40%" align="right"> <a accesskey="n" href="passivestatetranslation.html">Next</a>
</td>
</tr>
<tr>
<td width="40%" align="left" valign="top">Predictive Dependency Checks </td>
<td width="20%" align="center"><a accesskey="h" href="index.html">Home</a></td>
<td width="40%" align="right" valign="top"> Passive Host State Translation</td>
</tr>
</table>
</div>
<P class="copyright">© 2009-2010 Icinga Development Team, http://www.icinga.org</P>
</body>
</html>