=head1 NAME
B<developer-guide> - developer's guide to Spong
=head1 DESCRIPTION
This is the developer's guide to Spong. It documents the inner workings of the
client and server programs. It also describes the plug-in mechanism of the
B<spong-client> and B<spong-network> so that new check modules can be
developed for these programs.
=head1 PROTOCOLS
This section deals with the low level communication protocols that the clients
use to talk with the B<spong-server>. The Spong and Big Brother protocols are
almost identical. They vary only in the data format.
=head2 SPONG PROTOCOL
The B<spong-server> listens on port 1998 for status updates from clients.
After a socket has been opened, the client program sends a message with the
following format:
command host service color time[:TTL] summary (\n)
detailed status message line 1 (\n)
detailed status message line 2 (\n)
...
detailed status message line n (\n)
Where:
=over 4
=item command
The command being sent to the spong server indicating
a type of update message or a change in operating status of the client.
=item host
The fully qualified domain name of the host the message
is for.
=item service
The name of the service that the update message is
for.
=item color
The status color of the service (green - ok, yellow
- warning, red - alert).
=item time
The date/time of the update message in epoch time format
(i.e. the number of seconds since 00:00:00 UTC on January 1, 1970).
=item TTL
This optional field is the time to live, in seconds, for the status message.
Normally a status message will become stale (i.e. purple status) after 2 times
$SPONGSLEEP seconds, which is the default. See L<spong.conf/"$SPONGSLEEP">. This
field will override the default and keep the status message valid for a longer
period of time.
=item summary
The status summary message field. A short and to the point message that
summarizes the status being returned.
=item detailed status message
The remaining lines of the message, which hold the detailed information of the
status. Typically this is the output of the C<df> command, the top processes
by CPU utilization, or the detailed responses of network checks.
=back
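
As a rough illustration of the message layout above, the following Perl sketch
builds a Spong status message and shows how it could be written to the server
socket. The helper names C<format_spong_message> and C<send_spong_message>, the
host names and the sample values are assumptions made for this example; they
are not part of Spong itself.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Build a message in the layout described above. The helper name and
# argument order are illustrative only, not part of Spong.
sub format_spong_message {
    my ($cmd, $host, $service, $color, $time, $ttl, $summary, @detail) = @_;
    # The TTL is optional and is appended to the time field as time:TTL.
    my $stamp = defined $ttl ? "$time:$ttl" : $time;
    my $msg = join(' ', $cmd, $host, $service, $color, $stamp, $summary) . "\n";
    $msg .= "$_\n" for @detail;
    return $msg;
}

# Open a socket to the server, write the message and close the socket.
# Port 1998 is the spong-server update port given above.
sub send_spong_message {
    my ($server, $msg) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $server,
        PeerPort => 1998,
        Proto    => 'tcp',
        Timeout  => 10,
    ) or die "cannot connect to $server: $!\n";
    print $sock $msg;
    close $sock;
    return 1;
}

# Sample values only; the message is printed here instead of being sent.
my $msg = format_spong_message(
    'status', 'www.example.com', 'disk', 'yellow', 946684800, 900,
    'filesystems: /tmp at 92%',
    '/dev/sda1  10G  9.2G  0.8G  92%  /tmp',
);
print $msg;
```

Passing C<undef> as the TTL leaves the time field as a bare epoch value. A
call such as C<send_spong_message('spong.example.com', $msg)> would deliver the
message, assuming a server is reachable under that (made up) name.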
=head2 BIG BROTHER PROTOCOL
The B<spong-server> listens on port 1984 for Big Brother client status
updates. After a socket has been opened, the client sends a message
with the following format:
command host service color time summary (\n)
detailed status message line 1 (\n)
detailed status message line 2 (\n)
...
detailed status message line n (\n)
Where:
=over 4
=item command
The command being sent to the spong server indicating
a type of update message or a change in operating status of the client.
At present the only command defined is C<status> which indicates a
service status update message.
=item host
The fully qualified domain name of the host the message
is for.
=item service
The name of the service that the update message is
for.
=item color
The status color of the service (green - ok, yellow
- warning, red - alert).
=item time
The date/time of the update message in standard date
format (i.e. Thu Jan 1 00:00:00 UTC 1970)
=item summary
The status summary message field.
=item detailed status message
The remaining lines of the message, which hold the detailed information
of the status. Typically this is the output of the C<df> command, the top
processes by CPU utilization, or the detailed responses of network checks.
=back
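
The only difference from the Spong format is the time field. The sketch below
renders an epoch time in the date layout shown above and assembles a Big
Brother style message. The helper names C<bb_date> and C<format_bb_message>
and the sample host are assumptions for illustration, and the date layout is
taken from the example in this section rather than from a Big Brother
specification.

```perl
use strict;
use warnings;

my @DAYS   = qw(Sun Mon Tue Wed Thu Fri Sat);
my @MONTHS = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# Render an epoch time in the date layout shown above, in UTC.
sub bb_date {
    my ($epoch) = @_;
    my ($sec, $min, $hour, $mday, $mon, $year, $wday) = gmtime($epoch);
    return sprintf('%s %s %d %02d:%02d:%02d UTC %d',
        $DAYS[$wday], $MONTHS[$mon], $mday, $hour, $min, $sec, $year + 1900);
}

# Assemble a status message; "status" is the only command defined so far.
sub format_bb_message {
    my ($host, $service, $color, $epoch, $summary, @detail) = @_;
    my $msg = join(' ', 'status', $host, $service, $color,
                   bb_date($epoch), $summary) . "\n";
    $msg .= "$_\n" for @detail;
    return $msg;
}

print format_bb_message('www.example.com', 'http', 'green', 0, 'HTTP is OK');
```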
=head1 MODULES
B<spong-client>, B<spong-network>, B<spong-message> and B<spong-server> use
various routines which are coded as modules. When the programs are
initializing, they determine which modules are going to be required. The
programs then go out and load each of the modules from the library directory.
When the modules are loaded they register themselves with the plug-in
registry. The plug-in registry is the mechanism that the client programs use
to keep track of the modules in order to run them.
=head2 SERVER MODULES
B<spong-server> has a hook that gives external programs access to the
status updates coming in from Spong client programs. The hook takes the form of
Server Data modules, which are called after B<spong-server> stores the status
update in its database.
B<spong-server> passes all of the information in the update message, in addition
to the current event status duration, to the Data Module. The modules should
do any processing that they need to do in as short a time as possible. This
minimizes the resource overhead when lots of status updates arrive at the
same time.
Debugging messages and error messages can be printed by using the
&main::debug() and &main::error() functions respectively. If the module
develops a fatal error, it should terminate using the die() or croak()
functions depending on one's preference. Modules should just return upon a
successful invocation.
NEW
There are two new types of Spong Server plugin modules: predata and postdata
modules. The predata modules are called just after a message is received but
before the normal Spong processing is done. The postdata modules are called
after the Server's normal processing.
Also, in contrast to the old Spong Server data plugin modules, all incoming
messages are sent to the new predata and postdata modules, not just status
messages. Predata modules can even flag a message to tell the Spong Server to
drop the message and ignore it. This gives you a great deal more access to
and control of the Server's incoming message processing.
Modules are called in a predictable, controllable order. They are run in the
order of the sorted registry hash keys. This allows you to create modules that
have run dependencies.
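
The ordering rule can be pictured with the following self-contained sketch.
Everything in it is hypothetical: the C<%registry> hash, the numeric key
prefixes, the C<drop> field and the choice to stop at the first flagged
message are assumptions made for illustration. The real registry and message
hash are described in L<spong-server-mod-template>.

```perl
use strict;
use warnings;

# Hypothetical registry: keys are module names, values are code refs.
my %registry;
my @trace;

# Key prefixes such as "10_" are one way to force a run order, because
# modules run in the order of the sorted hash keys.
$registry{'20_log'} = sub {
    my ($msg) = @_;
    push @trace, "log $msg->{cmd}";
    return;
};
$registry{'10_filter'} = sub {
    my ($msg) = @_;
    push @trace, "filter $msg->{cmd}";
    # Hypothetical drop flag; see the module template for the real one.
    $msg->{drop} = 1 if $msg->{host} eq 'ignored.example.com';
    return;
};

# Call every module in sorted key order. This sketch stops as soon as a
# module flags the message; returns 1 if the message is kept, 0 if dropped.
sub run_predata {
    my ($msg) = @_;
    for my $name (sort keys %registry) {
        $registry{$name}->($msg);
        return 0 if $msg->{drop};
    }
    return 1;
}

my $kept    = run_predata({ cmd => 'status', host => 'www.example.com' });
my $dropped = run_predata({ cmd => 'status', host => 'ignored.example.com' });
print join("\n", @trace), "\n";
```

Because C<10_filter> sorts before C<20_log>, the filter always sees a message
first, so the logging module never runs for a message the filter has flagged.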
And there is a new API to go along with the new modules. Now parameters are
passed in a Perl hash. This simplifies the module interface to the server. The
message type is determined by the C<cmd> field. Message hash details are
in the L<spong-server-mod-template> document.
See L<spong-server-mod-template> for an example of how to code a
B<spong-server> Server Data Module.
=head2 CLIENT MODULES
Client modules define checks which are to be done on the host that the
B<spong-client> program is running on. The module's check function is called
without any parameters. Client modules are expected to issue any system
commands and parse the output in order to determine the service status.
Any threshold variables needed for warning and alert level triggers need to be
defined and placed into the F<SPONG/etc/spong.conf> file. The threshold
variables need to be uniquely named and should be named according to the type of
check being done (e.g. $DISKWARN or $DFWARN for disk checks and $CPUWARN for
CPU checks).
Once the service status and messages have been determined the module
can call the C<E<amp>main::status()> function in order to send the
information back to the spong-server. See L<"Status Function"> for more
information.
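
A client check might be laid out roughly as follows. This is a sketch under
stated assumptions, not a copy of a shipped module: the C<$DFCRIT> threshold,
the C<$HOST> variable, the C<df_output> helper with its canned data, and the
local stand-in for C<E<amp>main::status()> are all made up so that the example
runs outside of B<spong-client>. See L<spong-client-mod-template> for the real
module layout and registration code.

```perl
use strict;
use warnings;

# Hypothetical names and values: in Spong the thresholds would be
# defined in spong.conf.
our ($DFWARN, $DFCRIT) = (90, 95);
our $SPONGSERVER = 'spong.example.com';
our $HOST        = 'www.example.com';

# Stand-in for &main::status() so the sketch runs outside spong-client.
# It only records the arguments it was called with.
our @SENT;
sub status { push @SENT, [@_]; return }

# Canned output; a real module would run the system command instead.
sub df_output {
    return join "\n",
        'Filesystem 1K-blocks    Used Available Use% Mounted on',
        '/dev/sda1   10000000 5000000   5000000  50% /',
        '/dev/sda2    1000000  920000     80000  92% /tmp';
}

# The check function is called without any parameters.
sub check_df {
    my ($color, @problems) = ('green');
    my $output = df_output();
    for my $line (split /\n/, $output) {
        # Pick the usage percentage and mount point off each data line.
        next unless $line =~ /\s(\d+)%\s+(\S+)$/;
        my ($pct, $mount) = ($1, $2);
        next if $pct < $DFWARN;
        $color = 'red'    if $pct >= $DFCRIT;
        $color = 'yellow' if $color eq 'green';
        push @problems, "$mount at $pct%";
    }
    my $summary = @problems ? 'filesystems: ' . join(', ', @problems)
                            : 'filesystems are OK';
    &main::status($SPONGSERVER, $HOST, 'disk', $color, $summary, $output);
}

check_df();
print "$SENT[0][3]: $SENT[0][4]\n";
```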
=head2 NETWORK MODULES
Network modules define checks that are to be done on hosts over the network
to ensure that a network service is running. The modules are called with
the name of the host to be checked. The module is also expected
to put an alarm wrapper around the code that performs the check. This is
to prevent excessive delays due to lost communications. It is suggested
that 10 seconds be used for the alarm setting.
The module should not call the E<amp>main::status() function directly.
B<spong-network> has a mechanism for rechecking services that are reported down
on an initial check. If the recheck mechanism is engaged, "red" statuses will
be downgraded to "yellow" until a failure count threshold is reached. Then the
services will be reported as "red".
After the status condition has been determined the check function should return
three parameters:
=over 4
=item STATUS
The status color either "red", "yellow", or "green".
=item SUMMARY
A short one-line summary of the status.
=item MESSAGE
More detailed information (can be multi-lined).
=back
These parameters are the same that will be passed to the C<E<amp>main::status()>
command. See L<"Status Function"> for more information on these parameters.
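
The alarm wrapper and the three return values can be sketched as below. The
function name C<check_with_alarm> and the idea of passing the probe as a code
reference are assumptions made to keep the example self-contained; a real
module would hold the probing code inline, as in
L<spong-network-mod-template>. Treating a timeout as "red" is likewise an
assumption of this sketch.

```perl
use strict;
use warnings;

# Run a probe under an alarm and always return (STATUS, SUMMARY, MESSAGE).
# The name check_with_alarm and the probe callback are hypothetical.
sub check_with_alarm {
    my ($host, $service, $timeout, $probe) = @_;
    my @result = eval {
        # Turn the alarm signal into an exception that eval can catch.
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm($timeout);
        my @r = $probe->($host);
        alarm(0);
        @r;
    };
    my $err = $@;
    alarm(0);    # make sure no alarm is left pending
    if ($err) {
        chomp $err;
        return ('red', "$service check on $host failed", $err);
    }
    return @result;
}

# One probe that answers at once and one that hangs past the timeout.
my @ok   = check_with_alarm('www.example.com', 'demo', 10,
                            sub { ('green', 'demo is ok', 'responded') });
my @slow = check_with_alarm('www.example.com', 'demo', 1,
                            sub { sleep 5; ('green', 'demo is ok', 'late') });
print "$ok[0] / $slow[0] ($slow[2])\n";
```

The second call uses a one second alarm only to keep the demonstration short;
the text above suggests 10 seconds for real checks.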
The network modules have two support functions available,
C<E<amp>main::check_tcp()> and C<E<amp>main::check_simple()>, which can
simplify simple TCP port checks.
C<E<amp>main::check_tcp( host, port, data );>
Where the arguments are:
=over 4
=item HOST
The name or ip address of the host to be checked.
=item PORT
The name, or port number, of service to connect with.
=item DATA
The data to be sent to the host after the port is opened.
=back
The function C<E<amp>main::check_tcp()> will make a connection to
the given PORT on the HOST and send a message (DATA). It then returns what
it gets back to the caller.
C<E<amp>main::check_simple( host, port, send, check, service)>
Where the arguments are:
=over 4
=item HOST
The name or ip address of the host to be checked.
=item PORT
The name of the port to connect with.
=item SEND
The message to be sent to the host after the port is opened.
=item CHECK
A Perl regular expression used to validate the response returned
by the host.
=item SERVICE
The name of the service being checked. It is used in the summary
and detailed status messages.
=back
The function C<E<amp>main::check_simple()> is a generic TCP port checking
routine. It connects to the given port (using C<E<amp>main::check_tcp()>) and
checks that the expected response comes back. The function returns
three parameters: STATUS, SUMMARY and MESSAGE, as detailed above. The return
values of this function can be returned directly as the return values of
the check module.
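
The validation step that C<E<amp>main::check_simple()> is described as
performing can be pictured with the sketch below. It is not Spong's
implementation: the helper name C<classify_response>, the summary wording and
the choice of "red" for an unexpected response are assumptions. It only shows
how a response and a CHECK expression can be turned into the STATUS, SUMMARY
and MESSAGE triple.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a raw response into (STATUS, SUMMARY, MESSAGE).
sub classify_response {
    my ($response, $check, $service) = @_;
    if (!defined $response || $response eq '') {
        return ('red', "$service is down", 'no response from host');
    }
    if ($response =~ $check) {
        return ('green', "$service is ok", $response);
    }
    return ('red', "$service returned an unexpected response", $response);
}

# Sample responses standing in for what a TCP connection would return.
my @smtp = classify_response("220 mail.example.com ESMTP\r\n", qr/^220/, 'smtp');
my @bad  = classify_response("554 go away\r\n", qr/^220/, 'smtp');
print "$smtp[0]: $smtp[1]\n$bad[0]: $bad[1]\n";
```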
=head2 MESSAGE MODULES
Message modules are functions called to send a notification message to
a contact about a specific host or service. The messaging functions are called
with the contact's identifier for the messaging service (e.g. the pager PIN
code of a paging provider). The messaging module is expected to take care of
all of the data formatting and communications logic needed to send a
notification message to the contact.
The messaging functions have access to these global variables in order to format
a notification message:
=over 4
=item I<$color>
The status color of the message
=item I<$host>
The fully qualified domain name of the host
=item I<$time>
The date and time of the message being sent. (format is
epoch time or time())
=item I<$message>
A one line summary status line
=item I<$duration>
The current duration of the current status in seconds.
(a zero duration indicates a change in status)
=back
There are two support functions that can be used to format a message and send
it via e-mail: C<E<amp>main::email_status()> and
C<E<amp>main::email_mini_status()>. Both functions format an e-mail message to
be sent to RECIPIENT, but C<email_mini_status()> sends out a shorter mail
message which is more suitable for SMS and smaller alpha pagers.
Both functions are called as follows:
C<E<amp>main::email_status( recipient, flags )>
C<E<amp>main::email_mini_status( recipient, flags )>
Where the arguments to the functions are:
=over 4
=item RECIPIENT
One or more e-mail recipients which are placed in the "to:" line
of the mail message.
=item FLAGS
Flags to alter the formatting of the message.
=back
The only flag currently defined is 'shortsubject'. This prevents $color,
$hostname, and $summary from being placed on the "subject:" line.
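
A message module that builds its own text from the global variables listed
above might format it along these lines. The function name C<format_page>,
the wording of the text and the sample values are assumptions for
illustration. The globals are declared locally here only so that the sketch
runs on its own; inside B<spong-message> they would already be set.

```perl
use strict;
use warnings;

# Stand-ins for the globals listed above, so the sketch runs on its own.
our ($color, $host, $time, $message, $duration) =
    ('red', 'www.example.com', 946684800, 'http is down', 0);

# Hypothetical formatter for a short pager-style text.
sub format_page {
    # A zero duration indicates a change in status.
    my $state = $duration == 0
        ? "changed to $color"
        : "still $color after ${duration}s";
    return sprintf('%s %s: %s (%s UTC)', $host, $state, $message,
                   scalar gmtime($time));
}

my $first = format_page();
$duration = 300;
my $later = format_page();
print "$first\n$later\n";
```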
=head1 CREATING MODULES
Creating the actual modules is straightforward. Create your module by
following the appropriate template from below.
=over 4
=item *
L<spong-client Module Template|spong-client-mod-template>
=item *
L<spong-network Module Template|spong-network-mod-template>
=item *
L<spong-message Module Template|spong-message-mod-template>
=item *
L<spong-server Module Template|spong-server-mod-template>
=back
Then place your template module file into the appropriate directory below.
=over 4
=item *
B<spong-client> - F<LIBDIR/Spong/Client/plugins>
=item *
B<spong-network> - F<LIBDIR/Spong/Network/plugins>
=item *
B<spong-message> - F<LIBDIR/Spong/Message/plugins>
=item *
B<spong-server> - F<LIBDIR/Spong/plugins>
=back
Then test your modules by running the program with the --debug parameter to see
if it is operating properly.
=head1 Status Function
E<amp>main::status( SERVERADDR, HOST, SERVICE, COLOR, SUMMARY, MESSAGE )
The arguments to the C<E<amp>main::status()> function are:
=over 4
=item SERVERADDR
Should be I<$SPONGSERVER>.
=item HOST
The full hostname of the machine being reported.
=item SERVICE
A short name that describes the service
that you are reporting on.
=item COLOR
The color of the status being reported, either "green", "yellow", or "red".
"green" denotes an OK status: there are no problems and everything is within
normal parameters. "yellow" denotes a warning status: an abnormal situation that
may need to be looked at, or a parameter that has changed (up or down)
towards a critical level. "red" denotes an alert status: a critical situation
that has arisen and needs immediate attention, or a parameter that has changed
(up or down) to a critical level.
=item SUMMARY
A short one-line summary of the status. This should be a concise
summary of the current situation of the service. The simplest form is to
say "Service is OK" or "Service is down". Another form is to display current
information (like system uptime, number of jobs and users) and additional
text for warnings and alerts (e.g. "uptime - 123, jobs - 123, users - 123, cpu
load level is at 3.2").
If you are reporting on multiple sets of like items (like file partitions or
processes), report the names of those items that are abnormal, (i.e.
"filesystems: / at 99%, /tmp at 100%").
=item MESSAGE
This is the place to put detailed information about the status of the service.
Typically this will be the output of the system commands or function calls. For
example, it could be the top 10 jobs by CPU usage from a C<ps> command, or the
output of a C<df> command for disk checking.
There are no limitations on the contents of the field. You can include URLs
that link to another monitoring package or take you to an administration web
page for the service in question.
=back
=head1 SEE ALSO
L<spong-server>, L<spong-client>, L<spong-network>, L<spong-message>,
L<spong-server-mod-template>, L<spong-client-mod-template>,
L<spong-network-mod-template>, L<spong-message-mod-template>, L<spong.conf>
=head1 AUTHOR
Stephen L Johnson, <F<sjohnson@monsters.org>>