File: README.traps

The protocol for agents (remote or local monitor scripts)
to deliver failures to the mon server:

A trap consists of tag/value pairs, which are separated by newlines. The
first tag must be "pro", which gives the protocol version.

Tags which are understood are:

#
# MON-specific tags
# pro   protocol
# aut   auth
# typ   type (0=mon, 1=snmpv1)
# spc   specific type (TRAP_*)
# seq   sequence
# grp   group
# svc   service
# hst   host
# sta   status (opstatus)
# tsp   timestamp as time(2) value
# sum   summary output
# dtl   detail (terminated by \n.\n)
#
# SNMP-specific tags
# ent   enterprise OID
# agt   agent address
# gtp   generic trap type
# stp   enterprise-specific trap type
# tmp   sysUptime timestamp
# vbl   varbindlist (OID = value)
#

SNMP-specific tags do nothing at this time.
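
To make the layout concrete, here is a minimal Python sketch that
serializes a set of tags into trap text. It is not mon's own code. The
function name build_trap, the "=" separator between tag and value, and
all sample values are assumptions made for illustration; this README
only states that pairs are separated by newlines, that "pro" comes
first, and that "dtl" is terminated by \n.\n.

```python
def build_trap(tags):
    """Serialize a dict of trap tags into newline-separated text.

    Assumptions (not confirmed by this README): tag and value are joined
    with "=", and tags other than "pro" and "dtl" may appear in any order.
    """
    if "pro" not in tags:
        raise ValueError('the "pro" tag is required')
    lines = ["pro=%s" % tags["pro"]]          # "pro" must be the first tag
    for tag, value in tags.items():
        if tag not in ("pro", "dtl"):
            lines.append("%s=%s" % (tag, value))
    text = "\n".join(lines) + "\n"
    if "dtl" in tags:
        # Detail goes last and is terminated by a line holding only "."
        text += "dtl=%s\n.\n" % tags["dtl"]
    return text


example = build_trap({
    "pro": "1",                 # hypothetical protocol version value
    "typ": 0,                   # 0 = mon
    "grp": "pr-internet",
    "svc": "http_tp",
    "sta": "fail",              # hypothetical opstatus value
    "sum": "http_tp failed",
    "dtl": "connection refused",
})
```

The sketch does no escaping, so a value containing a newline (other
than in "dtl") would break the pair layout.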

Rather than formulating the trap PDU yourself, it's a good idea to use
Mon::Client::send_trap. See the POD for Mon::Client for more details,
or see remote.alert for an example.

If an alert for a watch or service is delivered to a mon server whose
configuration does not include that watch or service, the server uses
the default watch/service "default" to deliver the alert. If "default"
is not defined in mon.cf, the alert is logged and then discarded.
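
The fallback order described above can be sketched in Python as
follows. This is an illustration only, not the server's actual code:
the function name resolve_target and the dict-of-sets stand-in for a
parsed mon.cf are hypothetical.

```python
def resolve_target(config, group, service):
    """Return the (watch, service) pair that should handle a trap, or None.

    `config` is a hypothetical dict mapping watch names to sets of
    service names; it stands in for a parsed mon.cf.
    """
    if service in config.get(group, ()):
        return (group, service)            # trap matches a configured pair
    if "default" in config.get("default", ()):
        return ("default", "default")      # fall back to default/default
    return None                            # caller logs and discards the trap
```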

NOTE: alert/upalert stats are not handled specially for 'default' traps,
so if one unknown alert trap comes in, followed by an unknown upalert
from a different host, the alert output from mon may be confusing.
Set up a default watch and use it as a debugging aid to catch random
traps and to remind you to update your mon config file.

watch default
    service default
        period wd {Sun-Sat}
            alert some.alert
            upalert some.alert -u

See the mon.1 man page for the list of environment variables available to
monitor and alert programs. One environment variable to note in particular
is MON_TRAPINTEND. It holds a colon (:) separated watch group / service
pair naming the intended recipient when the default watch group and
service were invoked for a trap. This should give you some ability to
figure out what to do with a trap caught by "default", and could be
exploited to allow a lazy administrator to send useful information from
alerts ;)
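
As a rough sketch, an alert program attached to the default watch could
recover the intended pair like this. The variable name is spelled as in
this README (check the man page for the exact name in your version);
the helper name intended_target is hypothetical, and splitting on the
first colon assumes that group names contain no colon.

```python
import os


def intended_target(environ=os.environ):
    """Split MON_TRAPINTEND into (group, service), or return None.

    None is returned when the variable is unset, empty, or has no colon.
    """
    value = environ.get("MON_TRAPINTEND")
    if not value or ":" not in value:
        return None
    group, service = value.split(":", 1)   # split on the first colon only
    return group, service
```

An alert script might call intended_target() and include the result in
its message, so that the administrator can see which watch/service is
missing from the mon config file.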

There is a (very simple) alert script called "remote.alert" which
delivers a failure detected locally to a remote mon process. This
allows centralization of alert handling, and it allows distributed
mon processes. Pass the mon host name via -H <host> and the port via
-P <port>.

You can also use remote.alert to send a trap from one mon server to
another. This is useful for implementing a hierarchy of mon servers,
where the topmost level serves as the alert management node for the
lower leaf nodes. For example:

mon server "highlevel":

watch pr-internet
    service http_tp
        period wd {Sun-Sat}
            alert mail.alert name@address.com


mon server "lowlevel":

watch pr-internet
    service http_tp
        monitor http_tp.monitor
        interval 5m
        period wd {Sun-Sat}
            alert remote.alert -H highlevel


When the pr-internet/http_tp service fails on the mon server "lowlevel",
that server sends a trap to the mon server "highlevel", which then sends
the email alert.