1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493
|
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>7.23. Service and Host Check Scheduling</title>
<link rel="stylesheet" href="../stylesheets/icinga-docs.css" type="text/css">
<meta name="generator" content="DocBook XSL Stylesheets V1.75.1">
<meta name="keywords" content="Supervision, Icinga, Nagios, Linux">
<link rel="home" href="index.html" title="Icinga Version 1.14 Documentation">
<link rel="up" href="ch07.html" title="Chapter 7. Advanced Topics">
<link rel="prev" href="passivestatetranslation.html" title="7.22. Passive Host State Translation">
<link rel="next" href="cgiincludes.html" title="7.24. Custom CGI Headers and Footers (Classic UI)">
<script src="../js/jquery-min.js" type="text/javascript"></script><script src="../js/icinga-docs.js" type="text/javascript"></script>
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<CENTER><IMG src="../images/logofullsize.png" border="0" alt="Icinga" title="Icinga"></CENTER>
<div class="navheader">
<table width="100%" summary="Navigation header">
<tr><th colspan="3" align="center">7.23. Service and Host Check Scheduling</th></tr>
<tr>
<td width="20%" align="left">
<a accesskey="p" href="passivestatetranslation.html">Prev</a> </td>
<th width="60%" align="center">Chapter 7. Advanced Topics</th>
<td width="20%" align="right"> <a accesskey="n" href="cgiincludes.html">Next</a>
</td>
</tr>
</table>
<hr>
</div>
<div class="section" title="7.23. Service and Host Check Scheduling">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="checkscheduling"></a>7.23. <a name="checkscheduling-check_scheduling"></a>Service and Host Check Scheduling</h2></div></div></div>
<div class="toc"><dl>
<dt><span class="section">7.23.1. <a href="checkscheduling.html#introduction_checkscheduling">Introduction</a></span></dt>
<dt><span class="section">7.23.2. <a href="checkscheduling.html#configurationoptions">Configuration Options</a></span></dt>
<dt><span class="section">7.23.3. <a href="checkscheduling.html#initialscheduling">Initial Scheduling</a></span></dt>
<dt><span class="section">7.23.4. <a href="checkscheduling.html#serviceintercheckdelay">Inter-Check Delay</a></span></dt>
<dt><span class="section">7.23.5. <a href="checkscheduling.html#serviceinterleaving">Service Interleaving</a></span></dt>
<dt><span class="section">7.23.6. <a href="checkscheduling.html#maxconcurrentchecks">Maximum Concurrent Service Checks</a></span></dt>
<dt><span class="section">7.23.7. <a href="checkscheduling.html#timerestraints">Time Restraints</a></span></dt>
<dt><span class="section">7.23.8. <a href="checkscheduling.html#normalscheduling">Normal Scheduling</a></span></dt>
<dt><span class="section">7.23.9. <a href="checkscheduling.html#problemscheduling">Scheduling During Problems</a></span></dt>
<dt><span class="section">7.23.10. <a href="checkscheduling.html#hostcheckscheduling">Host Checks</a></span></dt>
<dt><span class="section">7.23.11. <a href="checkscheduling.html#schedulingdelays">Scheduling Delays</a></span></dt>
<dt><span class="section">7.23.12. <a href="checkscheduling.html#schedulingexample">Scheduling Example</a></span></dt>
<dt><span class="section">7.23.13. <a href="checkscheduling.html#serviceoptions">Service Definition Options That Affect Scheduling</a></span></dt>
<dt><span class="section">7.23.14. <a href="checkscheduling.html#todo">TODO</a></span></dt>
</dl></div>
<div class="section" title="7.23.1. Introduction">
<div class="titlepage"><div><div><h3 class="title">
<a name="introduction_checkscheduling"></a>7.23.1. Introduction</h3></div></div></div>
<p>There were a lot of questions regarding how service checks are scheduled in certain situations, along with how the scheduling
differs from when the checks are actually executed and their results are processed. We'll try to go into a little more detail on how
this all works...</p>
</div>
<div class="section" title="7.23.2. Configuration Options">
<div class="titlepage"><div><div><h3 class="title">
<a name="configurationoptions"></a>7.23.2. Configuration Options</h3></div></div></div>
<p>Before we begin, there are several configuration options that affect how service checks are scheduled, executed, and processed.
For starters, each <a class="link" href="objectdefinitions.html#objectdefinitions-service">service definition</a> contains three options that determine when and
how each specific service check is scheduled and executed. Those three options are:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem">
<p><span class="emphasis"><em>check_interval</em></span></p>
</li>
<li class="listitem">
<p><span class="emphasis"><em>retry_interval</em></span></p>
</li>
<li class="listitem">
<p><span class="emphasis"><em>check_period</em></span></p>
</li>
</ul></div>
<p>There are also four configuration options in the <a class="link" href="configmain.html" title="3.2. Main Configuration File Options">main configuration file</a> that affect service
checks. These include:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem">
<p><a class="link" href="configmain.html#configmain-service_inter_check_delay_method"> <span class="emphasis"><em>service_inter_check_delay_method</em></span>
</a></p>
</li>
<li class="listitem">
<p><a class="link" href="configmain.html#configmain-service_interleave_factor"> <span class="emphasis"><em>service_interleave_factor</em></span> </a></p>
</li>
<li class="listitem">
<p><a class="link" href="configmain.html#configmain-max_concurrent_checks"> <span class="emphasis"><em>max_concurrent_checks</em></span> </a></p>
</li>
<li class="listitem">
<p><a class="link" href="configmain.html#configmain-check_result_reaper_frequency"> <span class="emphasis"><em>check_result_reaper_frequency</em></span> </a></p>
</li>
</ul></div>
<div class="note" title="Note" style="margin-left: 0.5in; margin-right: 0.5in;"><table border="0" summary="Note">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../images/note.png"></td>
<th align="left">Note</th>
</tr>
<tr><td align="left" valign="top">
<p>The last directive affects host checks as well.</p>
</td></tr>
</table></div>
<p>We'll go into more detail on how all these options affect service check scheduling as we progress. First off, let's see how
services are initially scheduled when Icinga first starts or restarts...</p>
</div>
<div class="section" title="7.23.3. Initial Scheduling">
<div class="titlepage"><div><div><h3 class="title">
<a name="initialscheduling"></a>7.23.3. Initial Scheduling</h3></div></div></div>
<p>When Icinga (re)starts, it will attempt to schedule the initial check of all services in a manner that will minimize the
load imposed on the local and remote hosts. This is done by spacing the initial service checks out, as well as interleaving them. The
spacing of service checks (also known as the inter-check delay) is used to minimize/equalize the load on the local host running
Icinga and the interleaving is used to minimize/equalize load imposed on remote hosts. Both the inter-check delay and interleave
functions are discussed below.</p>
<p>Even though service checks are initially scheduled to balance the load on both the local and remote hosts, things will eventually
give in to the ensuing chaos and be a bit random. Reasons for this include the fact that services are not all checked at the same
interval, some services take longer to execute than others, host and/or service problems can alter the timing of one or more service
checks, etc. At least we try to get things off to a good start. Hopefully the initial scheduling will keep the load on the local and
remote hosts fairly balanced as time goes by...</p>
<div class="note" title="Note" style="margin-left: 0.5in; margin-right: 0.5in;"><table border="0" summary="Note">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../images/note.png"></td>
<th align="left">Note</th>
</tr>
<tr><td align="left" valign="top">
<p>If you want to view the initial service check scheduling information, start Icinga using the <span class="bold"><strong>-s</strong></span> command line option. Doing so will display basic scheduling information (inter-check delay, interleave
factor, first and last service check time, etc) and will create a new status log that shows the exact time that all services are
initially scheduled. Because this option will overwrite the status log, you should not use it when another copy of Icinga is
running. Icinga does <span class="emphasis"><em>not</em></span> start monitoring anything when this argument is used.</p>
</td></tr>
</table></div>
</div>
<div class="section" title="7.23.4. Inter-Check Delay">
<div class="titlepage"><div><div><h3 class="title">
<a name="serviceintercheckdelay"></a>7.23.4. Inter-Check Delay</h3></div></div></div>
<p>As mentioned before, Icinga attempts to equalize the load placed on the machine that is running Icinga by equally
spacing out initial service checks. The spacing between consecutive service checks is called the inter-check delay. By giving a value to
the <a class="link" href="configmain.html#configmain-service_inter_check_delay_method">service_inter_check_delay_method</a> variable in the main config
file, you can modify how this delay is calculated. We will discuss how the "smart" calculation works, as this is the setting you will
want to use for normal operation.</p>
<p>When using the "smart" setting of the <span class="emphasis"><em>service_inter_check_delay_method</em></span> variable, Icinga will calculate
an inter-check delay value by using the following calculation:</p>
<p><span class="emphasis"><em>inter-check delay = (average check interval for all services) / (total number of services)</em></span></p>
<p>Let's take an example. Say you have 1,000 services that each have a normal check interval of 5 minutes (obviously some services
are going to be checked at different intervals, but let's look at an easy case...). The total check interal time for all services is
5,000 (1,000 * 5). That means that the average check interval for each service is 5 minutes (5,000 / 1,000). Give that information, we
realize that (on average) we need to re-check 1,000 services every 5 minutes. This means that we should use an inter-check delay of
0.005 minutes (0.3 seconds) when spacing out the initial service checks. By spacing each service check out by 0.3 seconds, we can
somewhat guarantee that Icinga is scheduling and/or executing 3 new service checks every second. By spacing the checks out evenly
over time like this, we can hope that the load on the local server that is running Icinga remains somewhat balanced.</p>
</div>
<div class="section" title="7.23.5. Service Interleaving">
<div class="titlepage"><div><div><h3 class="title">
<a name="serviceinterleaving"></a>7.23.5. Service Interleaving</h3></div></div></div>
<p>As discussed above, the inter-check delay helps to equalize the load that Icinga imposes on the local host. What about
remote hosts? Is it necessary to equalize load on remote hosts? Why? Yes, it is important and yes, Icinga can help out with this.
If you monitor a large number of services on a remote host and the checks were not spread out, the remote host might think that it was
the victim of a SYN attack if there were a lot of open connections on the same port. Plus, attempting to equalize the load on hosts is
just a nice thing to do...</p>
<p>By giving a value to the <a class="link" href="configmain.html#configmain-service_interleave_factor">service_interleave_factor</a> variable in the
main config file, you can modify how the interleave factor is calculated. We will discuss how the "smart" calculation works, as this
will probably be the setting you will want to use for normal operation. You can, however, use a pre-set interleave factor instead of
having Icinga calculate one for you. Also of note, if you use an interleave factor of 1, service check interleaving is basically
disabled.</p>
<p>When using the "smart" setting of the <span class="emphasis"><em>service_interleave_factor</em></span> variable, Icinga will calculate an
interleave factor by using the following calculation:</p>
<p><span class="emphasis"><em>interleave factor = ceil ( total number of services / total number of hosts )</em></span></p>
<p>Let's take an example. Say you have a total of 1,000 services and 150 hosts that you monitor. Icinga would calculate the
interleave factor to be 7. This means that when Icinga schedules initial service checks it will schedule the first one it finds,
skip the next 6, schedule the next one, and so on... This process will keep repeating until all service checks have been scheduled.
Since services are sorted (and thus scheduled) by the name of the host they are associated with, this will help with
minimizing/equalizing the load placed upon remote hosts.</p>
<p>The images below depict how service checks are scheduled when they are not interleaved
(<span class="emphasis"><em>service_interleave_factor</em></span>=1) and when they are interleaved with the <span class="emphasis"><em>service_interleave_factor</em></span>
variable equal to 4.</p>
<div class="informaltable">
<table border="0">
<colgroup>
<col>
<col>
</colgroup>
<tbody>
<tr>
<td><p>Noninterleaved Checks</p></td>
<td><p>Interleaved Checks</p></td>
</tr>
<tr>
<td align="left" valign="middle"><p> <span class="inlinemediaobject"><img src="../images/noninterleaved1.png" width="500"></span> </p></td>
<td align="left" valign="middle"><p> <span class="inlinemediaobject"><img src="../images/interleaved1.png" width="500"></span> </p></td>
</tr>
<tr>
<td align="left" valign="middle"><p> <span class="inlinemediaobject"><img src="../images/noninterleaved2.png" width="500"></span> </p></td>
<td align="left" valign="middle"><p> <span class="inlinemediaobject"><img src="../images/interleaved2.png" width="500"></span> </p></td>
</tr>
<tr>
<td> </td>
<td align="left" valign="middle"><p> <span class="inlinemediaobject"><img src="../images/interleaved3.png" width="500"></span> </p></td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" title="7.23.6. Maximum Concurrent Service Checks">
<div class="titlepage"><div><div><h3 class="title">
<a name="maxconcurrentchecks"></a>7.23.6. Maximum Concurrent Service Checks</h3></div></div></div>
<p>In order to prevent Icinga from consuming all of your CPU resources, you can restrict the maximum number of concurrent
service checks that can be running at any given time. This is controlled by using the <a class="link" href="configmain.html#configmain-max_concurrent_checks">max_concurrent_checks</a> option in the main config file.</p>
<p>The good thing about this setting is that you can regulate Icinga' CPU usage. The down side is that service checks may fall
behind if this value is set too low. When it comes time to execute a service check, Icinga will make sure that no more than
<span class="emphasis"><em>x</em></span> service checks are either being executed or waiting to have their results processed (where <span class="emphasis"><em>x</em></span>
is the number of checks you specified for the <span class="emphasis"><em>max_concurrent_checks</em></span> option). If that limit has been reached,
Icinga will postpone the execution of any pending checks until some of the previous checks have completed. So how does one
determine a reasonable value for the <span class="emphasis"><em>max_concurrent_checks</em></span> option?</p>
<p>First off, you need to know the following things...</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem">
<p>The inter-check delay that Icinga uses to initially schedule service checks (use the <span class="bold"><strong>-s</strong></span> command line argument to check this)</p>
</li>
<li class="listitem">
<p>The frequency (in seconds) of reaper events, as specified by the <a class="link" href="configmain.html#configmain-check_result_reaper_frequency">check_result_reaper_frequency</a> variable in the main config file.</p>
</li>
<li class="listitem">
<p>A general idea of the average time that service checks actually take to execute (most plugins timeout after 10 seconds, so the
average is probably going to be lower)</p>
</li>
</ul></div>
<p>Next, use the following calculation to determine a reasonable value for the maximum number of concurrent checks that are
allowed...</p>
<p><span class="emphasis"><em>max. concurrent checks = ceil( max( check result reaper frequency , average check execution time ) / inter-check delay
)</em></span></p>
<p>The calculated number should provide a reasonable starting point for the <span class="emphasis"><em>max_concurrent_checks</em></span> variable. You
may have to increase this value a bit if service checks are still falling behind schedule or decrease it if Icinga is hogging too
much CPU time.</p>
<p>Let's say you are monitoring 875 services, each with an average check interval of 2 minutes. That means that your inter-check
delay is going to be 0.137 seconds. If you set the check result reaper frequency to be 10 seconds, you can calculate a rough value for
the max. number of concurrent checks as follows (we'll assume that the average execution time for service checks is less than 10
seconds) ...</p>
<p><span class="emphasis"><em>max. concurrent checks = ceil( 10 / 0.137 )</em></span></p>
<p>In this case, the calculated value is going to be 73. This makes sense because (on average) Icinga are going to be
executing just over 7 new service checks per second and it only processes service check results every 10 seconds. That means at given
time there will be a just over 70 service checks that are either being executed or waiting to have their results processed. In this
case, we would probably recommend bumping the max. concurrent checks value up to 80, since there will be delays when Icinga
processes service check results and does its other work. Obviously, you're going to have test and tweak things a bit to get everything
running smoothly on your system, but hopefully this provided some general guidelines...</p>
</div>
<div class="section" title="7.23.7. Time Restraints">
<div class="titlepage"><div><div><h3 class="title">
<a name="timerestraints"></a>7.23.7. Time Restraints</h3></div></div></div>
<p>The <span class="emphasis"><em>check_period</em></span> option determines the <a class="link" href="timeperiods.html" title="5.9. Time Periods">time period</a> during which
Icinga can run checks of the service. Regardless of what status a particular service is in, if the time that it is actually
executed is not a valid time within the time period that has been specified, the check will <span class="emphasis"><em>not</em></span> be executed.
Instead, Icinga will reschedule the service check for the next valid time in the time period. If the check can be run (e.g. the
time is valid within the time period), the service check is executed.</p>
<div class="note" title="Note" style="margin-left: 0.5in; margin-right: 0.5in;"><table border="0" summary="Note">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../images/note.png"></td>
<th align="left">Note</th>
</tr>
<tr><td align="left" valign="top">
<p>Even though a service check may not be able to be executed at a given time, Icinga may still
<span class="emphasis"><em>schedule</em></span> it to be run at that time. This is most likely to happen during the initial scheduling of services,
although it may happen in other instances as well. This does <span class="emphasis"><em>not</em></span> mean that Icinga will execute the check!
When it comes time to actually <span class="emphasis"><em>execute</em></span> a service check, Icinga will verify that the check can be run at
the current time. If it cannot, Icinga will not execute the service check, but will instead just reschedule it for a later
time. Don't let this one throw you confuse you! The scheduling and execution of service checks are two distinctly different (although
related) things.</p>
</td></tr>
</table></div>
</div>
<div class="section" title="7.23.8. Normal Scheduling">
<div class="titlepage"><div><div><h3 class="title">
<a name="normalscheduling"></a>7.23.8. Normal Scheduling</h3></div></div></div>
<p>In an ideal world you wouldn't have network problems. But if that were the case, you wouldn't need a network monitoring tool.
Anyway, when things are running smoothly and a service is in an OK state, we'll call that "normal". Service checks are normally
scheduled at the frequency specified by the <span class="emphasis"><em>check_interval</em></span> option. That's it. Simple, huh?</p>
</div>
<div class="section" title="7.23.9. Scheduling During Problems">
<div class="titlepage"><div><div><h3 class="title">
<a name="problemscheduling"></a>7.23.9. Scheduling During Problems</h3></div></div></div>
<p>So what happens when there are problems with a service? Well, one of the things that happens is the service check scheduling
changes. If you've configured the <span class="emphasis"><em>max_attempts</em></span> option of the service definition to be something greater than 1,
Icinga will recheck the service before deciding that a real problem exists. While the service is being rechecked (up to
<span class="emphasis"><em>max_attempts</em></span> times) it is considered to be in a "soft" state (as described <a class="link" href="statetypes.html" title="5.8. State Types">here</a>)
and the service checks are rescheduled at a frequency determined by the <span class="emphasis"><em>retry_interval</em></span> option.</p>
<p>If Icinga rechecks the service <span class="emphasis"><em>max_attempts</em></span> times and it is still in a non-OK state, Icinga
will put the service into a "hard" state, send out notifications to contacts (if applicable), and start rescheduling future checks of
the service at a frequency determined by the <span class="emphasis"><em>check_interval</em></span> option.</p>
<p>As always, there are exceptions to the rules. When a service check results in a non-OK state, Icinga will check the host
that the service is associated with to determine whether or not is up (see the note <a class="link" href="checkscheduling.html#hostcheckscheduling" title="7.23.10. Host Checks">below</a> for
info on how this is done). If the host is not up (i.e. it is either down or unreachable), Icinga will immediately put the service
into a hard non-OK state and it will reset the current attempt number to 1. Since the service is in a hard non-OK state, the service
check will be rescheduled at the normal frequency specified by the <span class="emphasis"><em>check_interval</em></span> option instead of the
<span class="emphasis"><em>retry_interval</em></span> option.</p>
</div>
<div class="section" title="7.23.10. Host Checks">
<div class="titlepage"><div><div><h3 class="title">
<a name="hostcheckscheduling"></a>7.23.10. Host Checks</h3></div></div></div>
<p>Unlike service checks, host checks are <span class="emphasis"><em>not</em></span> scheduled on a regular basis. Instead they are run on demand, as
Icinga sees a need. This is a common question asked by users, so it needs to be clarified.</p>
<p>One instance where Icinga checks the status of a host is when a service check results in a non-OK status. Icinga
checks the host to decide whether or not the host is up, down, or unreachable. If the first host check returns a non-OK state,
Icinga will keep pounding out checks of the host until either (a) the maximum number of host checks (specified by the
<span class="emphasis"><em>max_attempts</em></span> option in the host definition) is reached or (b) a host check results in an OK state.</p>
<p>Also of note - when Icinga is check the status of a host, it holds off on doing anything else (executing new service
checks, processing other service check results, etc). This can slow things down a bit and cause pending service checks to be delayed for
a while, but it is necessary to determine the status of the host before Icinga can take any further action on the service(s) that
are having problems.</p>
</div>
<div class="section" title="7.23.11. Scheduling Delays">
<div class="titlepage"><div><div><h3 class="title">
<a name="schedulingdelays"></a>7.23.11. Scheduling Delays</h3></div></div></div>
<p>It should be noted that service check scheduling and execution is done on a best effort basis. Individual service checks are
considered to be low priority events in Icinga, so they can get delayed if high priority events need to be executed. Examples of
high priority events include log file rotations, external command checks, and check results reaper events. Additionally, host checks
will slow down the execution and processing of service checks.</p>
</div>
<div class="section" title="7.23.12. Scheduling Example">
<div class="titlepage"><div><div><h3 class="title">
<a name="schedulingexample"></a>7.23.12. Scheduling Example</h3></div></div></div>
<p>The scheduling of service checks, their execution, and the processing of their results can be a bit difficult to understand, so
let's look at a simple example. Look at the diagram below - we'll refer to it as we explain how things are done.</p>
<p><span class="inlinemediaobject"><img src="../images/checktiming.png"></span></p>
<p>First off, the <span class="bold"><strong>X</strong></span><sub>n</sub> events are check result reaper events that are scheduled
at a frequency specified by the <a class="link" href="configmain.html#configmain-check_result_reaper_frequency">check_result_reaper_frequency</a> option in
the main config file. Check result reaper events do the work of gathering and processing service check results. They serve as the core
logic for Icinga, kicking off host checks, event handlers and notifications as necessary.</p>
<p>For the example here, a service has been scheduled to be executed at time <span class="bold"><strong>A</strong></span>. However,
Icinga got behind in its event queue, so the check was not actually executed until time <span class="bold"><strong>B</strong></span>. The
service check finished executing at time <span class="bold"><strong>C</strong></span>, so the difference between points <span class="bold"><strong>C</strong></span> and <span class="bold"><strong>B</strong></span> is the actual amount of time that the check was running.</p>
<p>The results of the service check are not processed immediately after the check is done executing. Instead, the results are saved
for later processing by a check result reaper event. The next check result reaper event occurs at time <span class="bold"><strong>D</strong></span>, so that is approximately the time that the results are processed (the actual time may be later than <span class="bold"><strong>D</strong></span> since other service check results may be processed before this one).</p>
<p>At the time that the check result reaper event processes the service check results, it will reschedule the next service check and
place it into Icinga' event queue. We'll assume that the service check resulted in an OK status, so the next check at time
<span class="bold"><strong>E</strong></span> is scheduled after the originally scheduled check time by a length of time specified by the
<span class="emphasis"><em>check_interval</em></span> option. Note that the service is <span class="emphasis"><em>not</em></span> rescheduled based off the time that it was
actually executed! There is one exception to this (isn't there always?) - if the time that the service check is actually executed (point
<span class="bold"><strong>B</strong></span>) occurs after the next service check time (point <span class="bold"><strong>E</strong></span>), Icinga
will compensate by adjusting the next check time. This is done to ensure that Icinga doesn't go nuts trying to keep up with
service checks if it comes under heavy load. Besides, what's the point of scheduling something in the past...?</p>
</div>
<div class="section" title="7.23.13. Service Definition Options That Affect Scheduling">
<div class="titlepage"><div><div><h3 class="title">
<a name="serviceoptions"></a>7.23.13. Service Definition Options That Affect Scheduling</h3></div></div></div>
<p>Each service definition contains a <span class="emphasis"><em>check_interval</em></span> and <span class="emphasis"><em>retry_interval</em></span> option. Hopefully
this will clarify what these two options do, how they relate to the <span class="emphasis"><em>max_check_attempts</em></span> option in the service
definition, and how they affect the scheduling of the service.</p>
<p>First off, the <span class="emphasis"><em>check_interval</em></span> option is the interval at which the service is checked under "normal"
circumstances. "Normal" circumstances mean whenever the service is in an OK state or when its in a <a class="link" href="statetypes.html" title="5.8. State Types">hard</a> non-OK state.</p>
<p>When a service first changes from an OK state to a non-OK state, Icinga gives you the ability to temporarily slow down or
speed up the interval at which subsequent checks of that service will occur. When the service first changes state, Icinga will
perform up to <span class="emphasis"><em>max_check_attempts</em></span>-1 retries of the service check before it decides its a real problem. While the
service is being retried, it is scheduled according to the <span class="emphasis"><em>retry_interval</em></span> option, which might be faster or slower
than the normal <span class="emphasis"><em>check_interval</em></span> option. While the service is being rechecked (up to
<span class="emphasis"><em>max_check_attempts</em></span>-1 times), the service is in a <a class="link" href="statetypes.html" title="5.8. State Types">soft state</a>. If the service is
rechecked <span class="emphasis"><em>max_check_attempts</em></span>-1 times and it is still in a non-OK state, the service turns into a <a class="link" href="statetypes.html" title="5.8. State Types">hard state</a> and is subsequently rescheduled at the normal rate specified by the
<span class="emphasis"><em>check_interval</em></span> option.</p>
<p>On a side note, it you specify a value of 1 for the <span class="emphasis"><em>max_check_attempts</em></span> option, the service will not ever be
checked at the interval specified by the <span class="emphasis"><em>retry_interval</em></span> option. Instead, it immediately turns into a <a class="link" href="statetypes.html" title="5.8. State Types">hard state</a> and is subsequently rescheduled at the rate specified by the <span class="emphasis"><em>check_interval</em></span>
option.</p>
</div>
<div class="section" title="7.23.14. TODO">
<div class="titlepage"><div><div><h3 class="title">
<a name="todo"></a>7.23.14. TODO</h3></div></div></div>
<p><a name="hostintercheckdelay"></a><span class="bold"><strong>Host Check Directives</strong></span></p>
<p>Most of the above applies to host checks as well.</p>
<p>This documentation is being rewritten. Stay tuned for more information in a later beta release...</p>
<a class="indexterm" name="idm140381623736688"></a>
<a class="indexterm" name="idm140381623735168"></a>
<a class="indexterm" name="idm140381623733440"></a>
<a class="indexterm" name="idm140381623731920"></a>
<a class="indexterm" name="idm140381623730160"></a>
<a class="indexterm" name="idm140381623728640"></a>
<a class="indexterm" name="idm140381623726880"></a>
<a class="indexterm" name="idm140381623725392"></a>
<a class="indexterm" name="idm140381623723632"></a>
<a class="indexterm" name="idm140381623722112"></a>
<a class="indexterm" name="idm140381623720384"></a>
<a class="indexterm" name="idm140381623718864"></a>
<a class="indexterm" name="idm140381623717104"></a>
<a class="indexterm" name="idm140381623715520"></a>
</div>
</div>
<div class="navfooter">
<hr>
<table width="100%" summary="Navigation footer">
<tr>
<td width="40%" align="left">
<a accesskey="p" href="passivestatetranslation.html">Prev</a> </td>
<td width="20%" align="center"><a accesskey="u" href="ch07.html">Up</a></td>
<td width="40%" align="right"> <a accesskey="n" href="cgiincludes.html">Next</a>
</td>
</tr>
<tr>
<td width="40%" align="left" valign="top">7.22. Passive Host State Translation </td>
<td width="20%" align="center"><a accesskey="h" href="index.html">Home</a></td>
<td width="40%" align="right" valign="top"> 7.24. Custom CGI Headers and Footers (Classic UI)</td>
</tr>
</table>
</div>
<P class="copyright">© 1999-2009 Ethan Galstad, 2009-2017 Icinga Development Team, https://www.icinga.com</P>
</body>
</html>
|