
\documentclass{book}
\usepackage{index}
\makeindex
\begin{document}
\newcommand{\unknowncharacter}[1]{}
\author{Thomas Erskine}
\title{Remstats 1.00a4}
\date{Mon Sep 10 15:18:31 EDT 2001}
\maketitle
\tableofcontents
%------------------------------------ index.pod ---
\chapter{About Remstats}
\index{About Remstats}
Remstats is a system of programs to:
\begin{itemize}
\item gather data from servers and routers,
\item store and maintain the data for long periods,
\item produce graphs and web-pages tying them together, and
\item monitor the data for anomalous behaviour and issue alerts.
\end{itemize}
It's built on
RRDtool (see \textbf{http://ee-staff.ethz.ch/\~{}oetiker/webtools/rrdtool/}).
There is a proto-FAQ (see the proto-FAQ section); feel free to contribute. There's
also a to-do list (see the to-do list section) to give some idea of what might be coming.
\subsection{Where to get it}%
\index{Where to get it}
The best available version is 0.13.1. You can get it at
\textbf{http://remstats.sourceforge.net/release/src/remstats-0.13.1.tar.gz}.
The current version is 1.00a4. (This version may not be
available yet, as I will push out new documentation before the
new release, to make additions/corrections available as soon
as possible.) You can get other versions of remstats from the
source archive (see \textbf{http://remstats.sourceforge.net/release/src}). Since almost all of it is
written in perl (see \textbf{http://www.perl.org/}) scripts, there is no binary version.
\subsection{How to get started}%
\index{How to get started}
First, you should make sure that you have all the requirements (see the requirements section).
Then read the installation docs (see the installation docs section). Then read the
server installation docs (see the server installation docs section).
Then check out run-remstats (see the run-remstats section), which runs almost everything else and documents
how all the pieces work together.
Thank-you (see the Thank-you section).
%------------------------------------ releasenotes.pod ---
\chapter{Release Notes}
\index{Release Notes}
\section{Release Notes for Remstats version 1.0a3}%
\index{Release Notes for Remstats version 1.0a3}
It's always a good idea to run check-config (see the check-config section) after changing any of the
config-files, but it's also a good idea to run it after doing an upgrade,
especially when, as in this version, there are changes to the config-files.
Mostly small new features and bug-fixes, except:
\begin{itemize}
\item \textbf{Incompatible:} re-written alert-sending mechanism. It makes new methods of
sending alerts easy to write, by separating the alert-text generation
(see alerter (see the alerter section)) from the alert-sending (see alert-email (see the alert-email section) and alert-winpopup (see the alert-winpopup section)).
The new alert-destination-map (see the alert-destination-map section) config-file permits
mapping an alert-destination to different addresses depending on the time-of-day,
day-of-week, ... and its alias facility permits sending to a list of addresses
which may use different methods of sending the alert.
\item \textbf{Incompatible:} The unix-status-server's do\_df (see the unix-status-server section) now returns
\textbf{bytes} not \textbf{K-bytes}. This avoids silliness in the graphs saying that
you've got 20k kbytes free. It may have been correct, but it wasn't
intuitive. You can multiply all the old numbers by 1000 to convert RRDs.
\item \textbf{Warning:} new-config (see the new-config section) now copies the configuration files which
you are likely to change, so that your changes won't be overwritten by an
update to remstats. Unfortunately, all updates (including this one)
will overwrite config-base, so people upgrading from a previous version
of remstats should convert the following files from symlinks to config-base
into copies of those files:
\begin{itemize}
\item alerts alert-destination-map general html links tools
\end{itemize}
You can do this by running the supplied convert-config-links (see the convert-config-links section) script
\textbf{BEFORE INSTALLING THIS VERSION}.
\end{itemize}
\subsection{New Features}%
\index{New Features}
\begin{itemize}
\item the new datapage-status (see the datapage-status section) script generates datapages for each host, which will
show the current values of all rrd variables. There is a new tool in the default
tools config-file (see the tools config-file section)
which will show this page, and it's been added to the defaults generated by the
new-xxx-hosts (see the new-xxx-hosts section) programs.
\item Host templates (see the Host templates section), so that you can configure
similar hosts like:
\begin{verbatim}%
desc whatever
template their-template\end{verbatim}
Templates can also make it easier to change a setting for many hosts at once. E.g., you
could have a template, say \texttt{default-nt-status-server}, which contained:
\begin{verbatim}%
nt-status-server some-host\end{verbatim}
and configure all hosts which use \texttt{some-host} like:
\begin{verbatim}%
template default-nt-status-server\end{verbatim}
\item New availability-report (see the availability-report section) and
availability-report.cgi (see the availability-report.cgi section) and config-file
availability (see the availability section) for reporting on "availability".
\item New nt-status-server (see the nt-status-server section) and nt-status-collector (see the nt-status-collector section) and RRDs for them
(ntactivity, ntmemory, ntpaging, ntnetwork and ntlogicaldisk-*).
\item New cleanup (see the cleanup section) program to remove stale files.
\item new-snmp-hosts (see the new-snmp-hosts section) now adds RRDs other than snmpif-*.
\item New new-unix-hosts (see the new-unix-hosts section) program to add hosts which are running the
unix-status-server (see the unix-status-server section) with the appropriate rrds.
\item ought to work with perl 5.6 now. I'm not using perl 5.6 on the main
collector yet, but it seems to install correctly on a test system.
\item run-remstats (see the run-remstats section) now checks all configuration sub-directories to figure
out if anything has changed, so you ought to be able to just edit files and
the changes will get caught on the next run.
\item remstats internal instrumentation allows monitoring remstats collectors,
for now. More later. Look at the pseudo-host \_remstats\_.
\item removed old Overall Index, since I never looked at it, and wrote a new
RRD Index, which I was always wanting.
\item new nt-discover (see the nt-discover section) program to discover and add NT systems.
\textbf{Note}: this adds the new discovery config-file (see the discovery config-file section), which
must be locally configured. I won't even attempt to guess at values here.
\item The old \texttt{alertflag} entry in the html config-file (see the html config-file section) has
been replaced by three new entries: \texttt{alertflagcritical}, \texttt{alertflagerror} and
\texttt{alertflagwarn}, allowing e.g. different colors for the different levels of alert.
\item datapage.cgi (see the datapage.cgi section) now does variable substitution properly
for HTML macros. See the example datapage upss.page under /home/remstats/etc/config/datapages.
\item datapage.cgi (see the datapage.cgi section) and dataimage.cgi (see the dataimage.cgi section) have two
new commands: \texttt{alertstatus} and \texttt{alertvalue} to fetch alert statuses and values.
To be used on forthcoming status pages.
\end{itemize}
\section{Release Notes for Remstats version 0.13.1}%
\index{Release Notes for Remstats version 0.13.1}
I fixed a minor buglet in 0.13.0 which was noticed shortly after release.
I was annoyed enough with it that I made 0.13.1.
\section{Release Notes for Remstats version 0.13.0 (AKA 0.12.2)}%
\index{Release Notes for Remstats version 0.13.0 (AKA 0.12.2)}
There are lots of little improvements, which are detailed in the
Change History (see the Change History section), which I'm not going into here. The main
incompatible changes are:
\begin{itemize}
\item The configuration structure (\$main::config) now has the graphs stored
under \$main::config\{RRD\}\{\$wildrrd\}\{GRAPH\} instead of \$main::config\{GRAPH\},
so there won't be problems with having the same graph-name defined under two
different rrds. This will only affect you if you've been writing your own
code for remstats, like a new page-maker. I thought that the bug was
annoying enough, and hard enough to diagnose when it was triggered, that
the incompatibility was worth the change.
\item After typing \$main::config\{CUSTOMGRAPH\} instead of \$main::config\{CUSTOM\}
one too many times, I renamed \$main::config\{CUSTOM\} to \$main::config\{CUSTOMGRAPH\}
which is what it should have been all along. Again, this should only affect
you if you've been writing your own remstats code, like a new page-maker.
\item Changed default location for datapages to /home/remstats/etc/config/datapages, so that
all the configuration, including the datapages, is together.
\item Removed the general config-file directive \texttt{pagesas} since all the
generated pages are CGIs now. Check-config will abort if you still have it.
Just delete that line in the general config-file.
\item \texttt{use strict} in all the scripts (unless I missed some) in preparation
for perl 5.6, which doesn't like \texttt{use vars}. Shouldn't bother you unless
you've been writing remstats code, in which case, you probably know what to
do.
\item To deal with alert templates (see below), you'll need to manually fix
your config-dirs. For each one, you need to:
\begin{verbatim}%
su remstats
cd your-config-dir
cp /home/remstats/etc/config-base/alert-template-map .
mkdir alert-templates
cp /home/remstats/etc/config-base/alert-templates/* alert-templates\end{verbatim}
\item remoteping-collector has been modified to return the server-name
instead of a number to differentiate the data from different servers.
There's also a new remoteping-* wildcard RRD to make it more useful.
\end{itemize}
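
The first change in the list above may be easier to see with a small example.
The following Python sketch is only an analogy for the perl structure; the
graph name \texttt{traffic}, the RRD names and the definition strings are made
up for illustration. It shows why storing graphs under their RRD avoids the
clash that a single flat table of graph names causes.
\begin{verbatim}
# Python analogy only: remstats itself keeps this structure in perl.
# The graph name "traffic" and the definition strings are hypothetical.

# Old layout, like $main::config{GRAPH}: one flat table of graph names.
flat = {"GRAPH": {}}
flat["GRAPH"]["traffic"] = "graph defined under rrd ping"
flat["GRAPH"]["traffic"] = "graph defined under rrd snmpif-*"  # overwrites

# New layout, like $main::config{RRD}{$wildrrd}{GRAPH}: graphs are
# stored under the RRD which defines them, so equal names cannot clash.
nested = {"RRD": {}}
for wildrrd, definition in [
    ("ping", "graph defined under rrd ping"),
    ("snmpif-*", "graph defined under rrd snmpif-*"),
]:
    rrd = nested["RRD"].setdefault(wildrrd, {"GRAPH": {}})
    rrd["GRAPH"]["traffic"] = definition
\end{verbatim}
In the flat table only the last definition survives; in the nested one both
definitions are kept, each under its own RRD.
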
\subsection{Customgraphs on host-index pages}%
\index{Customgraphs on host-index pages}
The new \texttt{customgraph graphname} directive for host config-files
permits you to add a customgraph to a host-index page. (Thanks Marek.)
\subsection{graph.cgi - remstats graphs anywhere}%
\index{graph.cgi - remstats graphs anywhere}
Like it says, using the new graph.cgi (see the graph.cgi section), you can put
remstats graphs on any page you want.
\subsection{Views}%
\index{Views}
You can now define your own pages with the page-layout of your choice
using views (see the views section). (Thanks to Marek and Thorsten and Matt and
anyone else I've forgotten.) Don't forget to add view-writer (see the view-writer section) to the
list of pagemakers if you've changed the default.
\subsection{ping-* rrd}%
\index{ping-* rrd}
You can now ping different interfaces on a host separately. (Thanks Steve.)
\subsection{fileage section for unix-status-server}%
\index{fileage section for unix-status-server}
This allows you to fetch the last-modification-time for specified files. It
was written to allow remstats to monitor lock-files to check for stale locks.
There is no included rrd using it as lock-files are all over the place.
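
As a rough illustration of the stale-lock check this section was written for,
the following Python sketch compares a file's last-modification time against a
threshold. It is not the unix-status-server code: the function names
\texttt{file\_age\_seconds} and \texttt{is\_stale}, the one-hour default and
the lock-file path are all assumptions made for the example.
\begin{verbatim}
import os
import time

def file_age_seconds(path, now=None):
    """Return the number of seconds since `path` was last modified."""
    if now is None:
        now = time.time()
    return now - os.stat(path).st_mtime

def is_stale(path, max_age=3600, now=None):
    """True if `path` has not been modified within `max_age` seconds."""
    return file_age_seconds(path, now) > max_age

if __name__ == "__main__":
    # Hypothetical lock-file path, for illustration only.
    lock = "/var/lock/example.lock"
    if os.path.exists(lock):
        print(lock, "stale" if is_stale(lock) else "fresh")
\end{verbatim}
The \texttt{now} argument exists only so that the check can be exercised with a
fixed clock.
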
\subsection{port-collector can collect data from results}%
\index{port-collector can collect data from results}
The port-collector (see the port-collector section) has always been able to send a string to remote services
so that it could tell if they were working correctly. Now it can pull values
for RRDs and status-files from the results as well. I've included a sample
rrd (weathernetwork) and script (weathernetwork) to collect current weather data
for Ottawa. Look at the updated docs for
scripts config-files (see the scripts config-files section) .
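
The idea of pulling a value out of a service's response can be sketched in a
few lines of Python. This is not the port-collector's implementation or its
configuration syntax (see the scripts config-file docs for that): the function
name \texttt{extract\_value}, the sample response and the patterns are
assumptions made for illustration.
\begin{verbatim}
import re

def extract_value(response, pattern):
    """Return the first capture group of `pattern` found in `response`
    as a float, or None if the pattern does not match."""
    match = re.search(pattern, response)
    if match is None:
        return None
    return float(match.group(1))

# Made-up response text, standing in for what a remote service returns.
sample = "Ottawa: Temperature -12.5 C, Humidity 71 percent"
temperature = extract_value(sample, r"Temperature\s+(-?\d+(?:\.\d+)?)")
humidity = extract_value(sample, r"Humidity\s+(\d+)")
\end{verbatim}
A pattern which does not match yields \texttt{None}, which a caller could
treat as "no data for this run".
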
\subsection{new script - snmpif-description-updater}%
\index{new script - snmpif-description-updater}
The snmpif-description-updater (see the snmpif-description-updater section) will keep the descriptions on snmpif-*
RRDs up-to-date with whatever you've set as ifAlias for that interface.
(Thanks Steve Francis.)
\subsection{Alert Templates}%
\index{Alert Templates}
This feature allows you to customize the alert messages by addressee
or by RRD. Look at the docs in
alert-template-map (see the alert-template-map section) and
alert-templates (see the alert-templates section) .
\subsection{Autoconf-like configure}%
\index{Autoconf-like configure}
You can now do:
\begin{verbatim}%
./configure
make
make install\end{verbatim}
for the beginning of the install (see the install section).
\section{Release Notes for Remstats version 0.12.1}%
\index{Release Notes for Remstats version 0.12.1}
Ideally, this document will only have to tell you about the
great new features of remstats in this version.
Not this time.
In addition, due to various stuff (read the Change History (see the Change History section)),
this covers changes since version 0.11.1.
\subsection{Configuration File Replaced by Configuration Directory}%
\index{Configuration File Replaced by Configuration Directory}
The old "one huge configuration file" has been replaced by a directory
of files and sub-directories. (See the new configuration docs (see the new configuration docs section)
for details.) This means that most programs don't need to read and parse
everything, including stuff that they're not going to use. It also
makes it easier to find things, as you can go directly to the file that
has what you want, e.g. details on a particular host. It also made
possible the newly revamped replacements for \texttt{make-ping-hosts},
\texttt{make-port-hosts} and \texttt{make-snmp-hosts}, which will insert their
additions directly into the appropriate configuration files.
There is a new script, split-config (see the split-config section), which will take your old
config-file and a new name and generate a new config-dir from it.
On a related note, I broke the groups (see the groups section) line
out of the \texttt{general} config-file
into its own file. It's easier to see what you've got. \texttt{split-config}
will do this for you. Also, the (undocumented) [html] section will
absorb large portions of the [general] section which really belong to
web-page generation.
If you've made your own collector, you'd better look at the new
skeleton-collector for the required changes. Just change \texttt{read\_config}
to \texttt{read\_config\_dir}, with extra args. There's also documentation
on how to write your own collector (see the collector section) .
\subsection{do-remstats replaced by run-remstats}%
\index{do-remstats replaced by run-remstats}
The old \texttt{do-remstats} shell-script and all the kludgy shell-scripts
that went with it and the \texttt{watchdog} and \texttt{lockfile} scripts have
all gone away. The replacement run-remstats (see the run-remstats section) does everything they
did and does it correctly. It's also configurable, so you don't need
to modify the scripts to change, e.g., which collectors you want to run.
A new feature of \texttt{run-remstats} configurability is that you can have
it run the \texttt{ping-collector} before everything else and not bother
trying hosts that didn't answer it. You can also choose which
collectors (see the collectors section), monitors (see the monitors section) and pagemakers (see the pagemakers section) to run.
\subsection{CGI scripts and non-default config-dirs}%
\index{CGI scripts and non-default config-dirs}
At the moment, the supplied CGI scripts don't deal with non-default
config-dirs. I do consider this to be a problem, but I need to get this
release out to deal with other serious installation problems.
You can work around this by editing the installed CGI scripts and
putting in the correct definition for \$config\_dir, near the top.
\subsection{plugin-collector is gone}%
\index{plugin-collector is gone}
It was an inefficient, difficult-to-configure kludge and isn't
needed anymore with the new run-remstats.
\subsection{pre-release testing automated}%
\index{pre-release testing automated}
You won't see it, but I hope you'll all notice the improvement
in release quality.
%------------------------------------ bugs.pod ---
\section{Known Bugs for version 1.00a}%
\index{Known Bugs for version 1.00a}
\begin{itemize}
\item Alerts in general - The alerts shown by the alerts.cgi (see the alerts.cgi section) page never get expired.
If a hub is down, then you'll get alerts for everything behind it. You ought to get
only the alert for the hub.
\end{itemize}
\section{Known Bugs for version 0.12.2}%
\index{Known Bugs for version 0.12.2}
\begin{itemize}
\item Neither new-snmp-hosts, nor snmp-collector use get\_ifname, with the consequence
that neither copes well with "oddly" named interfaces, say with spaces in them. Fixed
in 0.12.3.
\end{itemize}
\section{Known Bugs for version 0.12.1}%
\index{Known Bugs for version 0.12.1}
\begin{itemize}
\item CGI scripts don't work with non-default config-dirs. I consider
this a bug, but I need to get this release out now to deal with
serious installation problems with previous releases. For now, do the
following for each config-dir:
\begin{verbatim}%
% make install-cgis CONFIGDIR=/wherever/you/put/it\end{verbatim}
\item run-remstats only checks the config-dir for change, not the
subdirectories. For now, just \texttt{touch config-dir} whenever you make a change.
\item customgraphs are completely broken. Urgh. Upgrade.
[FIXED in 0.12.2]
\item You can't have two graphs of the same name, even in different rrd
definitions. This is just flat-out wrong and will be fixed. Unfortunately,
the fix will mean even longer file-names, so I hope nobody has some old system
with the 14-character limit. [FIXED in 0.12.2]
\item graphs with descriptions can't have quotes in the description.
[FIXED in 0.12.2]
\end{itemize}%------------------------------------ todo.pod ---
\section{To-Do List for Remstats}%
\index{To-Do List for Remstats}
\section{High Priority}%
\index{High Priority}
\textbf{134 20010829 [LOW]} - make header\_bar (in htmlstuff) do the link making, if available
and fix whatever uses it not to.
\textbf{133 20010829 [LOW]} - add an option to make nt-discover update old hosts with a standard set of RRDs,
even if the hosts are already known
\textbf{132 20010824 [HIGH]} - BUG: get rid of the spikes in uptime from the unix-status-server
\textbf{131 20010824 [MED]} - make status pages for each host, group and for all hosts using
the new alertstatus and possibly alertvalue.
\textbf{130 20010823 [HIGH]} - add an $<$RRD::EXEC ...$>$ tag to rrdcgi. To be used in host index
pages (see 129).
\textbf{128 20010629 [MED,HOLD]} - custom, configuration-supplied info per rrd which is simply available
wherever it makes sense, e.g. in alerts.
- first make sure someone has a use for it.
\textbf{127 20010622 [MED]} - graph data together with historical data. This
will probably mean either populating another rrd with historical averages,
temporarily or permanently, or modifying rrdtool. The former is certainly
simpler to do, given my knowledge of the internals of rrdtool. However,
it needs to have another rrd for each period? Need to keep the same data
over some longer period, a multiple of the period of interest, as well as
the averages, from period to period.
\textbf{122 20010330 [HIGH]} - rrd prog-* which tells if a particular named process is running, using
the ps section of the unix-status-collector.
\textbf{121 20010202 [MED,HOLD]} - how about a discovery program, to find and identify hosts and
then run the appropriate new-xxx-hosts scripts to add them?
DONE 20010608 - nt-discover to find and add NT boxen
\textbf{115) 20001229 [HIGH]} - need docs on errors. Specifically, when
run-remstats kills a collector for taking too long. And where to find
the output of the killed collector.
\textbf{112) 20001212 [LOW,HOLD]} - web-based remstats configurator. Needs
to consider security, at least from the point of view that you don't
want to lose your configuration. The most important part is hosts.
A lot of the rest doesn't have to be changed, or only once.
\textbf{111) 20001212 [LOW,HOLD]} consider grafting on (at least links to)
some kind of system configuration interface. For configuring the
monitored entities, not remstats.
\textbf{110) 20001212 [LOW,HOLD]} consider problem-fixing interface. It'd be nice to
try to fix things if there is a known way to do so. A simple kludge
would be to add another method to the alert-destination-map which
deals with problems that it knows about, possibly invoking plugins for
specific alerts.
\textbf{109) 20001212 [MED]} nt-log-collector, with modules for event-logs
and ntmail logs.
\textbf{70) 20000407 [HIGH]} CGI scripts need to have a way to deal with
alternate config-files, and graph-writer needs to tell them if they
can't work it out themselves. Otherwise, people need to be told to
do multiple installs of the CGI scripts, which might be the best way.
\begin{verbatim}%
make install-cgis CONFIGDIR=config-xxx\end{verbatim}
Not that painful, but wasteful and makes upgrades messier.
- I don't like the multiple-install method, but any other method needs
a way of getting configuration information into the CGI scripts. Any
method which passes info in via the URL or form fields is out: too unsafe.
The only other method I can think of is to read a configuration file in the
same directory as the CGI script. This ought to be safe from modification,
or your web-site is waiting to be mutilated. The other part to consider is
whether any part of the info in the CGI config-file is sensitive. I.E. do
we have to protect it in some way.
- Configuration file in the same directory won't work either, you'd still
have to install the cgi's multiple times. I'm starting to think that
multiple installations may be the only safe thing to do.
\textbf{99) 20000619 [HIGH]} make unix-status-collector send the directories
that we want df for and make unix-status-server do "df /dir1 /dir2"
to get them, and pull them off one line at a time. This is to deal with
things like disconnected NFS-mounted directories hanging df when we do
just a bare "df".
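
A minimal sketch of this idea in Python, assuming a df which supports the POSIX
\texttt{-P} (portable output) and \texttt{-k} options. The function names
\texttt{parse\_df} and \texttt{df\_for} are made up for the example, and this
is not how the unix-status-server is written.
\begin{verbatim}
import subprocess

def parse_df(output):
    """Parse `df -P -k` output one line at a time and return
    {mount_point: available_kbytes}."""
    result = {}
    for line in output.splitlines()[1:]:   # skip the header line
        fields = line.split(None, 5)       # the mount point may hold spaces
        if len(fields) < 6:
            continue                       # ignore anything malformed
        result[fields[5]] = int(fields[3])
    return result

def df_for(directories):
    """Run df only on the named directories, never a bare df."""
    completed = subprocess.run(["df", "-P", "-k"] + list(directories),
                               capture_output=True, text=True)
    return parse_df(completed.stdout)
\end{verbatim}
For example, \texttt{df\_for(["/", "/var"])} would only look at those two
file-systems, so a hung mount elsewhere is never touched.
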
\textbf{86) 20000419 [HIGH]} trends analysis
\textbf{87) 20000419 [HIGH]} alerts based on trends analysis and historical
data, like one-week average and standard-deviation, ... (for Steve)
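
One simple form such an alert could take is sketched below in Python: compare
the current value with the mean and standard deviation of recent history. The
function name \texttt{deviates} and the threshold of three standard deviations
are assumptions for illustration; nothing like this exists in remstats yet.
\begin{verbatim}
from statistics import mean, stdev

def deviates(history, current, threshold=3.0):
    """True if `current` lies more than `threshold` sample standard
    deviations away from the mean of `history`."""
    if len(history) < 2:
        return False          # not enough data to estimate a deviation
    average = mean(history)
    spread = stdev(history)
    if spread == 0:
        return current != average
    return abs(current - average) > threshold * spread
\end{verbatim}
Here \texttt{history} would be, say, one week of samples fetched from the RRD,
and \texttt{current} the latest value.
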
\textbf{106) 20000922 [MEDIUM]} make a file-collector. Similar to the
log-collector, only for small, local files. Slurp the file into
memory, match patterns and pull out values. The data line in an
rrd definition would be like:
\begin{verbatim}%
source file
data VARNAME GAUGE:600:0:U FUNCTION PATTERN(WITH)PARENS\end{verbatim}
In fact, this would share so much code with the log-collector that it
might be worth combining the two. This allows collection from things
like Linux's /proc.
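
To make the proposal a little more concrete, here is a Python sketch of the
slurp, match and reduce steps. The to-do item does not say which FUNCTION
values would be supported, so the four below (\texttt{first}, \texttt{last},
\texttt{sum}, \texttt{count}) are hypothetical, as are the names
\texttt{collect} and \texttt{collect\_file}.
\begin{verbatim}
import re

# Hypothetical FUNCTION names; the real set has not been decided.
FUNCTIONS = {
    "first": lambda values: values[0],
    "last": lambda values: values[-1],
    "sum": sum,
    "count": len,
}

def collect(text, function, pattern):
    """Match `pattern` against the whole text, convert the first
    parenthesized group of every match to a float and reduce the list
    with the named function. Returns None when nothing matches."""
    values = [float(m.group(1))
              for m in re.finditer(pattern, text, re.MULTILINE)]
    if not values:
        return None
    return FUNCTIONS[function](values)

def collect_file(path, function, pattern):
    """Slurp a small local file into memory and collect from it."""
    with open(path) as handle:
        return collect(handle.read(), function, pattern)
\end{verbatim}
With something like \texttt{collect\_file("/proc/meminfo", "first",
r"\textasciicircum{}MemFree:\textbackslash{}s+(\textbackslash{}d+)")} this
would pull a single number out of a /proc file.
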
\textbf{98) 20000619 [MEDIUM]} add group index files and store hosts under
group directories. For easier application of access-controls. (for Florian)
\textbf{2) ???????? [MEDIUM-INPROGRESS]} make rrd munger, like copyrrd was supposed to be
use dump/restore and process the xml form (rrddump-munger)
what functions do we need? Make one script for each function.
\begin{itemize}
\item add a DS (less important, as we can just make a new rrd)
\item remove a DS (less important, as we can ignore it)
\item add an archive
\item extend an archive
\item change CF of an archive
\item remove an archive
\item filter data within an archive
\begin{itemize}
\item change NaN to number/max/min
\item change $>$\# to NaN/max/min
\item change $<$\# to NaN/max/min
\end{itemize}
\end{itemize}
\section{Lower Priority}%
\index{Lower Priority}
\textbf{102) 20000912 [LOW]} add see-also to host config, which will
materialize links in the host header. Config line like:
\begin{verbatim}%
seealso host:xyzzy http://www.somewhere ftp://ftphost\end{verbatim}
the special "host:" pseudo-URL gets changed to a link to the
remstats page for that host.
\textbf{103) 20000915 [MEDIUM]} make-path doesn't work with non fqdn hosts
Make it read the configuration, so it can look up the IP number in
the host config and use that if it's defined. Otherwise, default to
gethostbyaddr.
\textbf{107) 20000922 [MEDIUM]} extra status header lines for hosts, from
specified STATUS files created by the various collectors. Add
lines to host definition like:
\begin{verbatim}%
extrastatus "STATUS DESCRIPTION" STATUS-FILE-NAME\end{verbatim}
\textbf{60) 20000328 [MEDIUM]} replace route-collector with something which
scales. SNMPwalking bgp4PathAttrBest doesn't scale to large Internet
routers with 400 peers, taking over an hour to complete. (see also 61)
- look at a script to follow the output of zebra. That's a lot of
overhead though. Easy if zebra is solid.
- How difficult can it be to make a native BGP listener? I'm not clear on
the protocol, but it doesn't look too bad.
\textbf{45) 20000121 [MEDIUM]} make snmp-collector send only one packet per host
- test and make sure that we do get back whatever succeeded. I vaguely
remember that it didn't work. [Later: at least under UCD snmp under linux,
if an item isn't implemented in the MIB, you get back NOTHING. Specifically,
look for the non-unicast packet counters as well as something else; you get
nothing back. This isn't good.]
- have to re-write snmp-collector completely, which isn't that bad an idea.
This means a two-pass structure. On pass one, we construct the complete query
and then send it. On pass two, we examine all the results and format them.
\textbf{9) ???????? [MEDIUM-TESTING]} make alerts take connectivity dependence into account
- add "via" line to host section to deal with hubs and switches [DONE]
- I think it's done. See what happens next outage.
\textbf{42) 20000114 [MEDIUM]} snmp-collector mod to allow summary data collected
from a walk and then filtered as a single data-point. E.G. specify a rrd "oid"
like:
\begin{verbatim}%
walk count ifOperStatus = 1\end{verbatim}
would produce a count of the number of interfaces on that device that
were active (i.e. had a live device plugged into them). Or a similar one
would let you count BGP routes, or arp addresses, ...
- Unfortunately, from experience with the snmp-route-collector, this is
going to be slow for anything with a large number of items.
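
The filtering step itself is simple; a Python sketch follows. It assumes the
walk has already been done and its results are held in a dictionary, and the
name \texttt{walk\_count} and the sample values are made up for illustration.
(In the standard interfaces MIB, an ifOperStatus of 1 means "up" and 2 means
"down".)
\begin{verbatim}
def walk_count(walk_results, wanted):
    """Count the entries of an SNMP walk whose value equals `wanted`.
    `walk_results` maps each instance (here the interface index) to
    the value returned for it."""
    return sum(1 for value in walk_results.values() if value == wanted)

# Made-up ifOperStatus values keyed by interface index.
if_oper_status = {1: 1, 2: 2, 3: 1, 4: 1, 5: 2}
active = walk_count(if_oper_status, 1)
\end{verbatim}
The cost is in the walk, not in the counting, which is why the concern about
large tables above still applies.
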
\textbf{43) 20000114 [MEDIUM]} parallelizing the collectors, at least on a
group basis, preferably host or group.
- collectors must accept \texttt{-G} and \texttt{-H} flags to request processing of
the specified group or host, respectively. Run-remstats needs to fork
extra processes according to a config-file line, "parallel group" or
"parallel host".
- 20010831 TEE - implemented -H flags for all collectors except for the
remoteping-collector, which I'm not using anyway right now.
\textbf{51) 20000216 [LOW]} need a way to specify URL for port-http. The root page
doesn't always exist.
\textbf{37) 19991216 [LOW]} traceroute sometimes shows incorrect routing, which
confuses the topology-monitor, causing false positives
\textbf{50) 20000215 [LOW]} make inventory script. Runs uname
(for hardware and software), \texttt{ifconfig -a}, \texttt{netstat -nr}, \texttt{hostname}
and any others I can think of to collect configuration info.
Then figures out the versions of important software, e.g. run \texttt{perl -v},
\texttt{gcc -v ...} Make a subdir to put it in and make a tool definition to get it
onto the host pages.
- looks like the beginning of a discovery script.
\textbf{62) 20000329 [LOW]} make different markers for different levels
of alert on quick-index.
\textbf{69) 20000406 [LOW]} is there any use for write\_environment in
check-config?
\textbf{97) 20000616 [LOW]} make port-collector or check-config complain about
having a script with ok/warn/error/critical patterns but no send string.
The port-collector will ignore patterns unless there is a send string.
\section{On Hold}%
\index{On Hold}
Usually waiting for next major release, or trapped by something else.
(in priority order)
\textbf{40) 20000104 [MEDIUM-HOLD]} consider some form of access-control for servers
- hash-based "password"
- ssl tunneling ought to work for everything except SNMP
- what does this buy? With the various servers run under tcp\_wrappers
an attacker must either gain access to the remstats collector
machine or spoof a tcp session from them. If you've been "owned"
you've got bigger problems. If the attacker spoofs a session with
a remstats server, tcp-wrappers will insist that it must come from
one of the allowed hosts, so that's where the stolen output will go.
  This is only useful to the attacker if they have access to the
remstats collector machine or if they can sniff the traffic between
the collector and the server. The only data loss possible is with
the log-server which keeps state. (Ignoring DOS attacks which are
always a problem.)
- unless someone needs this, it's on hold
\textbf{6) ???????? [LOW-NEEDS:2-HOLD]} increase CA3 resolution
- need rrd munger (2)
\textbf{10) ???????? [LOW-INPROGRESS-HOLD]} make graph of connectivity
\textbf{13) ???????? [LOW-INPROGRESS-HOLD]} snmp trap listener to update status files
- needs filter to be useful [DONE]
- I haven't seen any useful traps so this is on hold.
\textbf{14) ???????? [LOW-NEEDS:2-HOLD]} make rrd structural changes in config file
get applied to the rrds.
- some taken care of with snmpif-setspeed, but need a more general solution
- look at new XML output of rrddump
\textbf{39) ???????? [LOW-HOLD]} make RRD dumper, to put data out in a form that can
be loaded into a database
- I don't need it, per se, but it might be easier than writing the
availability report generator.
\textbf{52) 20000215 [LOW]} make a makegraph.cgi, or whatever, that will let you
make a somewhat custom graph on the fly. makegraph.cgi by itself will list
all the hosts and let you choose one. makegraph.cgi?host=xxx will list
all the RRDs for this host and let you choose ?one?.
makegraph.cgi?host=xxx\&rrd=yyy will list the various DSs for this RRD and
let you choose the ones you want. Then you get to define any CDEFs needed
and then LINEn/AREA/STACK for each DEF or CDEF desired. And size, title,
legends...
- On hold since graph.cgi (see the graph.cgi section) will let you get at any existing graph you want.
If I find a use or need for this, I'll re-activate it.
\textbf{92) 20000518 [HIGH]} collect traffic info from cflowd (artsportms).
Make it flexible enough that it can let you choose which ports you
want (one per rrd?). Make a loader to load historical data.
- [DONE 20000524] artsportms-loader done
- I no longer have access to devices with this feature
I've also kept the stuff that used to be here, but has already been
done (see the done section).
%------------------------------------ faq.pod ---
\chapter{FAQ}
\index{FAQ}
\section{FAQ for remstats}%
\index{FAQ for remstats}
This is only a proto-FAQ, with stuff I couldn't figure out where to
document.
\begin{enumerate}
\item \textbf{snmp-collector complains that I don't have SNMP\_Session installed, but
I don't want SNMP. Do I have to collect and install it?}
No you don't. Modify the \texttt{collectors} line in the general (see the general section)
config-file to exclude snmp.
\item \textbf{I modified the RRD definition, adding a new RRA, but
remstats is ignoring it. How do I do this?}
Sorry. At the moment, remstats won't propagate any changes to the
RRD structure after creation. Some changes, like extending RRAs
and changing min/max for DSs can be done manually with rrdtool.
If you do use rrdtool manually, I recommend that you modify the
rrd definition as well, to keep them in sync. At some point in the
future, I'd like to try to do this kind of update, and it's more
likely to succeed if remstats' rrd definition matches what's in
the actual RRD.
\end{enumerate}%------------------------------------ conventions.pod ---
\section{Documentation Conventions}%
\index{Documentation Conventions}
The only documentation conventions the reader has to know about are:
\begin{itemize}
\item things inside [square brackets] are optional
\item parenthesized lists with the items separated by vertical bars,
(like | this | one) require that you choose one and only one of the
alternatives.
\end{itemize}
Everything else ought to be explicit. If it isn't, or if you don't
understand it, please bring it to the author's attention, stating
which part you don't understand. There's not a lot of point in my
writing documentation which no-one else can understand. I'd rather
do it right.
%------------------------------------ required.pod ---
\chapter{Prerequisites}
\index{Prerequisites}
\section{What you need to install remstats}%
\index{What you need to install remstats}
\begin{enumerate}
\item You'll need perl (see \textbf{http://www.perl.org/}), at least version 5.005\_03.
If you don't already have it you can get it from
CPAN (see \textbf{ftp://ftp.crc.ca/pub/packages/lang/perl/CPAN/src/stable.tar.gz})
(the Comprehensive Perl Archive Network).
\item You'll need a \texttt{C} compiler that works. :-) gcc or
egcs (see \textbf{ftp://ftp.crc.ca/pub/packages/egcs/}) will do fine and
you can find them easily in many different places.
\item Make sure you have the following perl modules installed
(most of which you can find at CPAN (see \textbf{http://www.cpan.org/modules/by-module})).
The versions are the versions I'm using, but more recent versions should work
too, unless there have been radical changes. They should be installed in
the listed order to avoid dependency problems:
\begin{itemize}
\item RRDs 1.0.16 (see \textbf{http:src/rrdtool-1.0.29.tar.gz})
- the key piece. It comes with RRDtool and does the database and graphs.
Originally from
http://ee-staff.ethz.ch/\~{}oetiker/webtools/rrdtool (see \textbf{http://ee-staff.ethz.ch/\~{}oetiker/webtools/rrdtool}).
\item Socket - should be standard if you've got the required perl
\item IO::Socket - should also be standard in the required version of perl
\item Time::HiRes 01.19 (see \textbf{http:src/Time-HiRes-01.20.tar.gz})
- used by the port-collector to determine response time. You
can comment out the \texttt{"use Time::HiRes"} line and the program
will still work, but the response-time resolution will be one second instead
of one milli-second. Originally from
CPAN (see \textbf{ftp://ftp.crc.ca/pub/packages/lang/perl/CPAN/modules/by-module/Time}).
\item SNMP\_Session 0.69 (see \textbf{http:src/SNMP\_Session-0.77.tar.gz})
- used by the snmp-collector. If you don't need SNMP, you
can leave it out, but you'll have to change the config file. (Modify the
\texttt{collectors} line to leave out \texttt{snmp}.) Originally from
ftp://ftp.switch.ch/software/sources/network/snmp/perl/ (see \textbf{ftp://ftp.switch.ch/software/sources/network/snmp/perl/}).
\item GD (see \textbf{http:src/GD-1.30.pm.tar.gz}) - used only by dataimage.cgi (see the dataimage.cgi section) to create images on the fly.
Originally from http://stein.cshl.org/WWW/software/GD/GD.html (see \textbf{http://stein.cshl.org/WWW/software/GD/GD.html}).
\end{itemize}
\item You'll also need the following programs for the \texttt{unix-status-server}.
(You can change the locations at the top of it.) You almost certainly
have most of these and can ignore any that you don't tell the
\texttt{unix-status-collector} to query. For details, look in the
unix-status-server (see the unix-status-server section) docs:
\begin{verbatim}%
uname, vmstat, df, uptime, netstat, ps, ftpcount, qmail-qstat and
qmail-qread.\end{verbatim}
\end{enumerate}%------------------------------------ install.pod ---
\chapter{Installation}
\index{Installation}
\section{How to install remstats}%
\index{How to install remstats}
READ THE RELEASE NOTES (see the RELEASE NOTES section) FIRST. This page is generic and does
\textbf{NOT} include version-specific instructions.
I know that this is not simple. I do plan to make it simpler, but it'll \textbf{never}
be "\texttt{./configure; make; make install}" because I don't know what you want
to monitor.
The two C programs (multiping (see the multiping section) and traceroute (see the traceroute section) ) now use autoconf, and the
main configure script works (from the outside) similarly to an autoconf-generated
configure. I haven't seen a need to convert it to autoconf yet.
It's mostly perl scripts and
if you have the right version of perl properly installed, it shouldn't need
anything special. The \texttt{unix-status-server} is a slight exception to this,
but the only configuration needed so far is done dynamically and is only the
location of the various required utilities.
\begin{enumerate}
\item Unpack the distribution tarball:
\begin{verbatim}%
gunzip -dc remstats.tar.gz | tar xf -\end{verbatim}
\item Create the remstats user and group, if you haven't already
(by default \texttt{remstats} and \texttt{remstats} respectively).
(See also the remstats user section.)
\item Build and install the software. If you're upgrading, you
might want to take a copy of fixup.config from the old version:
\begin{verbatim}%
sh configure\end{verbatim}
If you want to override the defaults, then run
\begin{verbatim}%
sh configure --help\end{verbatim}
for a list of what can be overridden.
[Check fixup.config to make sure it is properly set up.]
\begin{verbatim}%
make all
make install
su -c 'make install-suid'\end{verbatim}
\textbf{Note:} this step also customizes the programs and documentation
with your choice of directories, owner, ... so this documentation
should refer to your setup after you've done the install.
The \texttt{make install-suid} simply makes traceroute and multiping suid root.
They won't work most places unless run as root, one way or another. Since I
don't like to run all of remstats as root, this was the best compromise I
could come up with.
\item Fix the config-base for site-specific things. Edit the following
files in /home/remstats/etc/config-base, looking for the string "FIXME" (without the quotes).
\begin{verbatim}%
alerts general html scripts/http-proxy\end{verbatim}
I'll try to keep this list up to date, but you can make sure by doing:
\begin{verbatim}%
grep -l FIXME /home/remstats/etc/config-base/* /home/remstats/etc/config-base/*/*\end{verbatim}
\item Make a config-dir (see the config-dir section) to describe what you
want to monitor. You can do this by hand, or using the configuration
building tools. To use the tools, you'll have to make a few files
listing various kinds of hosts:
\begin{verbatim}%
cd /home/remstats/etc
/home/remstats/bin/new-config config
/home/remstats/bin/new-ping-hosts groupname1 group1-hosts-file
/home/remstats/bin/new-ping-hosts groupname2 group2-hosts-file
...
/home/remstats/bin/new-port-hosts groupname3 port-hosts-file
/home/remstats/bin/new-snmp-hosts groupname4 SNMP-community-string snmp-hosts-file\end{verbatim}
After you've installed the unix-status-server (see the unix-status-server section) on some hosts, you can also
use:
\begin{verbatim}%
/home/remstats/bin/new-unix-hosts groupname5 unix-hosts-file\end{verbatim}
If you have any Windows NT hosts that you want to monitor, after you
have installed the nt-status-server (see the nt-status-server section) , you can run nt-discover (see the nt-discover section) to
find and add the NT hosts for a given NT domain.
If you're going to use the log-collector, you'll have to build the
rrd entries for each by hand. There doesn't seem to be much standard
in where log-files go, let alone what's in them.
\item Arrange for cron to run run-remstats (see the run-remstats section) at an appropriate interval.
For a five-minute interval, something like the following will do:
\begin{verbatim}%
0,5,10,15,20,25,30,35,40,45,50,55 * * * * /home/remstats/bin/run-remstats\end{verbatim}
This checks the configuration, collects the new data, updates the rrds,
runs the monitors to compute statuses and updates the web-pages. Note: it
does \textbf{not} re-write the web-pages for every iteration; it only does so when
configuration files have changed, as the web-pages will show new data by
themselves.
\item [optional] Arrange for cron to run \texttt{do-traceroutes} at an appropriate
interval. You could run it in the wee hours of each morning like:
\begin{verbatim}%
5 3 * * * /home/remstats/bin/do-traceroutes\end{verbatim}
This information isn't currently used, but I'm planning to make use of it.
\item [optional] Arrange for cron to run snmpif-description-updater (see the snmpif-description-updater section)
periodically, if you have any snmpif-* RRDs whose descriptions you're likely
to change. Say every day, like:
\begin{verbatim}%
0 3 * * * /home/remstats/bin/snmpif-description-updater\end{verbatim}
\item Arrange for cron to run cleanup (see the cleanup section) every now and then to remove old
un-needed files, like:
\begin{verbatim}%
0 3 * * * /home/remstats/bin/cleanup\end{verbatim}
This removes no-longer-needed files, like old host graphs, traceroute results,
log-files, ...
\item You'll need to set up your web-server (see the web-server section) to allow
CGI scripts in the remstats html tree and make sure that you're not
allowing everyone in.
\item Make a symlink in the html directory from whichever index
you prefer to index.cgi.
\item You'll want to look at the server installation docs (see the server installation docs section)
if you're going to be running any of the remote servers (
log-server (see the log-server section) , nt-status-server (see the nt-status-server section) , remoteping-server (see the remoteping-server section) , and
unix-status-server (see the unix-status-server section) ).
\end{enumerate}
Enjoy your pretty pictures and I hope that you find them useful.
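
The \texttt{grep -l FIXME} check from the config-base step can also be scripted. The following is only a minimal sketch, not part of remstats: the function name is hypothetical, and unlike the \texttt{grep} command it searches every level below the directory it is given rather than just two.

```python
from pathlib import Path


def files_with_fixme(config_base):
    """Return the sorted paths of files under config_base that still contain 'FIXME'.

    Hypothetical helper, not shipped with remstats.
    """
    hits = []
    for path in sorted(Path(config_base).rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(errors="replace")
        except OSError:
            continue  # unreadable file: skip it rather than abort the scan
        if "FIXME" in text:
            hits.append(str(path))
    return hits
```

Calling \texttt{files\_with\_fixme("/home/remstats/etc/config-base")} would list the files that still need site-specific editing, assuming the default install location.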
%------------------------------------ install-user.pod ---
\section{The remstats user}%
\index{The remstats user}
You \textbf{must} choose a userid to run the remstats processes under.
By default, it will be the user \texttt{remstats}, but you'll have to
create it manually, as I'm not going to risk damaging someone's
/etc/passwd file. Many operating-systems have a script called
\texttt{useradd} or \texttt{adduser} or some variant on that.
\textbf{NOTE:} Don't run the remstats programs except as the remstats user.
Many of the programs write extra files you won't know about unless
you read the source, and when you do run them as the remstats user,
it won't be able to modify the files that were created by the other
user. This will probably cause the program to die, with a meaningful
error message I hope, and you'll have to modify the owner by hand, as
root. If you need to do this, go back to the source directory and
do:
\begin{verbatim}%
% su -c 'make install-owner'\end{verbatim}
The remstats user must be able to write files within the remstats
directory trees rooted at /home/remstats, /home/remstats/data and /home/remstats/html.
The collection/update processes will also create files under
/home/remstats/tmp and /home/remstats/data. The pagemakers write files under
/home/remstats/html. It's simplest to have all the remstats files and
directories (except multiping (see the multiping section) and traceroute (see the traceroute section) ) owned by
the remstats user.
You must also ensure that the CGI scripts (and almost every web-page
remstats creates is a CGI script) get run by the remstats user. The
CGI scripts read files under /home/remstats/data and /home/remstats/datapage.
(See also the web-server installation section.)
%------------------------------------ cookies.pod ---
\section{Magic Cookies}%
\index{Magic Cookies}
There are various places in the configuration file where you can have
text substituted for you. It's a (very-limited) macro facility. Currently,
it only works in graphs, scripts and tools. The cookies are always UPPERCASE and
the name is surrounded by "\#\#", so that a request for "color1" would look like
"\#\#COLOR1\#\#", without the quotes.
Here they are:
\subsection{Colours: }%
\index{Colours: }
You're not required to use these, but you'll really regret not doing so
when you decide to change colors later.
\begin{itemize}
\item COLOR1, COLOR2, ... COLOR6 and also COLOR1a, ...
- generic colours for graphs
\item PROBLEMCOLOR - an alarming colour
\item TOTALCOLOR - the full amount of something
\item USEDCOLOR - how much is used of something
\end{itemize}
\subsection{Other stuff:}%
\index{Other stuff:}
\begin{itemize}
\item DB - the full path and file-name of the current rrd
\item DATADIR - where the data files live
\item GRAPH - the name of the current graph
\item GRAPHTIME - the name of the current time-span
\item HOST - the name of this host
\item IP - the IP number of the host
\item IPORHOST - the IP number of the host, if it's defined in the host's
config-file, or else its name.
\item HTMLURL - where the html and graphs live in web-space
\item RRD - the name of the current rrd, without the .rrd extension or
file-name fixing
\item SHORTHOST - the name of the host before the first dot,
unless it is a generic name like www or ftp or mail
\item THUMBHEIGHT, THUMBWIDTH - how large to ask for the thumbnail
to be. Not how large the resulting gif is.
\item WEBMASTER - who is responsible for this web presence
\item WILDPART - the instance part of a wild rrd. If if-*
is being used by if-le0, then WILDPART will be le0.
\end{itemize}%------------------------------------ private.pod ---
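
To illustrate how a cookie such as \#\#COLOR1\#\# could be expanded, here is a minimal sketch. It is not the remstats implementation: the function name and the table of values are hypothetical, and leaving unknown cookies untouched is an assumption about the desired behaviour.

```python
import re

# A cookie is a name wrapped in "##". Names are uppercase, but the list
# above also includes variants such as COLOR1a, so a trailing lowercase
# letter or digit is accepted here.
COOKIE = re.compile(r"##([A-Z][A-Za-z0-9]*)##")


def expand_cookies(text, values):
    """Replace each ##NAME## in text with values[NAME].

    Cookies with no entry in values are left as they are (an assumption).
    """
    def replace(match):
        name = match.group(1)
        return str(values.get(name, match.group(0)))

    return COOKIE.sub(replace, text)
```

For example, \texttt{expand\_cookies("LINE1:in\#\#COLOR1\#\#", \{"COLOR1": "\#ff0000"\})} substitutes the colour value where the cookie stood.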
\section{private.pl - configuration-supplied functions}%
\index{private.pl - configuration-supplied functions}
There are several places where you can have functions you choose invoked
by remstats:
\begin{itemize}
\item updater (see the updater section) permits functions to be applied to incoming data via the
rrd definition (see the rrd definition section) .
\item datapage.cgi (see the datapage.cgi section) and
dataimage.cgi (see the dataimage.cgi section) permit functions to be applied on
any \texttt{eval} line.
\end{itemize}
Rather than me continually adding functions or you having to hand-modify
your copy of remstats whenever I make new releases, I've supplied an
almost empty perl file called \texttt{private.pl} which will be installed in the
/home/remstats/lib if you don't have one, but will never be modified by me after
that.
%------------------------------------ install-servers.pod ---
\section{Server Installation}%
\index{Server Installation}
One of the interesting things about remstats (I think), is
the remote servers (see the remote servers section) . To install them, you'll need to do a few
things on each host which will run the servers:
\begin{itemize}
\item Add entries for the servers in \texttt{/etc/services}, like
this:
\begin{verbatim}%
unix-status 1957/tcp # remstats unix-status server
log-server 1958/tcp # remstats log server\end{verbatim}
You can run them on different ports, but these are the
defaults and you'd have to change run-remstats (see the run-remstats section)
to add the appropriate switches.
\item [Optional] Unless you're going to run the servers as
root (unnecessary), you'll need to create the user that
the servers will run as.
The only reasons I can think of for running the servers as
root are if you need to run them on a low-numbered port
($<$1024), or if you need to read a log-file which isn't
readable by the remstats user, or if you want to run
multiping non-suid. On Linux and Solaris, you can do:
\begin{verbatim}%
groupadd remstats
useradd -g remstats -d /home/remstats remstats\end{verbatim}
\item Modify \texttt{/etc/inetd.conf} to get the servers invoked,
like this:
\begin{verbatim}%
unix-status stream tcp nowait remstats /home/remstats/unix-status-server unix-status-server
log-server stream tcp nowait remstats /home/remstats/log-server log-server logfile1 logfile2\end{verbatim}
Or if you're using
tcp\_wrappers (see \textbf{ftp://ftp.porcupine.org/pub/security/tcp\_wrappers\_7.6.BLURB}),
which you should be:
\begin{verbatim}%
unix-status stream tcp nowait remstats /path/to/tcpd /home/remstats/unix-status-server
log-server stream tcp nowait remstats /path/to/tcpd /home/remstats/log-server logfile1 logfile2\end{verbatim}
And remember to update \texttt{/etc/hosts.allow} to allow your remstats host access.
\item Tell inetd to re-read its config-file:
\begin{verbatim}%
kill -HUP pid-of-inetd\end{verbatim}
\item Copy the remstats servers to the machines which will run them:
\begin{verbatim}%
rcp unix-status-server log-server remoteping-server multiping host:/home/remstats\end{verbatim}
\end{itemize}
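
Once inetd has been told about the servers, it can be useful to confirm from the remstats host that the ports answer. The following is a minimal sketch which only tests that a TCP connection can be opened; it does not speak the server protocol, and the function name is hypothetical. The default ports 1957 and 1958 are the ones given above.

```python
import socket


def port_answers(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened.

    This only checks reachability; it sends nothing to the server.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A call such as \texttt{port\_answers("somehost", 1957)} returning false suggests checking \texttt{/etc/services}, \texttt{/etc/inetd.conf} and \texttt{/etc/hosts.allow} on that host.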
\section{The nt-status-server}%
\index{The nt-status-server}
This one is a bit different to install. I've only done it under the
ActiveState Perl (see \textbf{http://www.activestate.com/Products/ActivePerl/index.html}) under
Windows NT 4.0. Installing the ActiveState Perl is straightforward; if you got here,
you'll have no trouble with that. Installing it as a service is not as simple as I
intended. You'll have to get the SRVANY and INSTSRV programs from the NT Resource Kit,
and follow their instructions. The program that SRVANY will be running is, of course,
perl (usually \texttt{C:$\backslash$Perl$\backslash$bin$\backslash$perl.exe}), and the argument string something like:
\begin{verbatim}%
c:\wherever\you\put\nt-status-server -s -t10.111.12.13\end{verbatim}
You'll have to replace \texttt{c:$\backslash$wherever$\backslash$you$\backslash$put} with the path to nt-status-server (see the nt-status-server section) ,
and \texttt{10.111.12.13} with the IP number of the host running the nt-status-collector (see the nt-status-collector section) .
%------------------------------------ install-webserver.pod ---
\section{Getting your web-server ready for remstats}%
\index{Getting your web-server ready for remstats}
\subsection{Choosing userid for remstats}%
\index{Choosing userid for remstats}
Almost all the remstats web-pages are generated by some kind of CGI
script. Many of them will read additional files not available under the
html directory tree. In order to provide access to these files, you need
to make sure that the scripts get run as the remstats user. The simplest
way to do this is to run a separate instance of the web-server software
as the remstats user. You may have other methods of accomplishing this,
depending on the web-server you're using. (See also
remstats user (see the remstats user section) .)
\subsection{Running CGI scripts under the remstats tree}%
\index{Running CGI scripts under the remstats tree}
You also may need to tell your web-server that \texttt{xxx.cgi} means that this
file is a CGI script and needs to be run, instead of just displayed. With
the apache (see \textbf{http://httpd.apache.org}) web-server, you could add the following
lines to the \texttt{httpd.conf} file:
\begin{verbatim}%
<Directory /home/remstats/html>
Options FollowSymlinks ExecCGI
AddHandler cgi-script .cgi
</Directory>\end{verbatim}
\subsection{Restricting access to CGI scripts}%
\index{Restricting access to CGI scripts}
There are a few things you should do before telling others about remstats.
Remstats comes with a few CGI scripts which you probably don't want to make
publicly available and two that you certainly don't. \texttt{ping.cgi},
\texttt{traceroute.cgi} and \texttt{whois.cgi} should probably be restricted to your
own organization, unless you don't mind letting anyone on the Internet run
pings, traceroutes and whois queries from your domain. Restricted to your
domain, you only have to worry about your own people.
However, \texttt{alert.cgi} and \texttt{log-event.cgi} are a different kettle of fish.
They will permit anyone who can run them to quench alerts and log comments
about them. You will probably want to be a bit more restrictive about
who you let run these.
Using the apache (see \textbf{http://httpd.apache.org/}) web-server, you can restrict
the use of these CGIs using a \texttt{.htaccess} file something like this:
\begin{verbatim}%
# Note that this example uses the private network 192.168.0.0.
# Stuff to make Apache expire the files to get them refreshed
ExpiresActive on
# images every 5 minutes, when the data gets updated
ExpiresByType image/gif M300
ExpiresByType image/png M300
# html every 5 minutes as well
ExpiresByType text/html M300\end{verbatim}
\begin{verbatim}%
# What to allow
Options ExecCGI FollowSymlinks Indexes\end{verbatim}
\begin{verbatim}%
<Files ~ "^(whois.cgi|traceroute.cgi|ping.cgi)$">
order deny,allow
deny from all
allow from 192.168. 127.0.0.1
</Files>\end{verbatim}
\begin{verbatim}%
<Files ~ "^(alert.cgi|log-event.cgi)$">
order deny,allow
deny from all
allow from 192.168.20.1 192.168.23.3
</Files>\end{verbatim}
\begin{verbatim}%
# How they're allowed in
order deny,allow
allow from all\end{verbatim}
I won't claim the IP\#-based access-control is completely safe, but it's
easy and keeps out casual browsers. If you \textbf{really} need to keep
this information safe, use a secure web-server, say apache with mod\_ssl.
If that's not good enough, you ought to consider whether this stuff
really belongs on a network at all.
%------------------------------------ configuration.pod ---
\chapter{The Configuration Directory}
\index{The Configuration Directory}
The run-time configuration of remstats is done through a directory-tree of files.
The current tree structure is:
\begin{verbatim}%
configdir
+--- alerts (see the alerts section)
+--- alert-destination-map (see the alert-destination-map section)
+--- alert-template-map (see the alert-template-map section)
+--- alert-templates (see the alert-templates section)
| +--- template1
| +--- template2
| ...
+--- archives (see the archives section)
+--- availability (see the availability section)
+--- colors (see the colors section)
+--- customgraphs (see the customgraphs section)
| +--- graph1
| +--- graph2
| ...
+--- datapages (see the datapages section)
| +--- datapage1.page
| +--- datapage2.page
| ...
+--- general (see the general section)
+--- groups (see the groups section)
+--- hosts (see the hosts section)
| +--- host1
| +--- host2
| ...
+--- host-templates (see the host-templates section)
| +--- host-template1
| +--- host-template2
| ...
+--- html (see the html section)
+--- links (see the links section)
+--- oids (see the oids section)
+--- remotepings (see the remotepings section)
+--- rrds (see the rrds section)
| +--- rrd1
| +--- rrd2
| ...
+--- scripts (see the scripts section)
| +--- script1
| +--- script2
| ...
+--- times (see the times section)
+--- tools (see the tools section)
+--- views (see the views section)
| +--- view1
| +--- view2
| ...
+--- view-templates (see the view-templates section)
+--- template1
+--- template2
...\end{verbatim}
(You can look at the base configuration directory if
you want, but you should also read through this so you know the
significance of what you see.)
Almost all the configuration files allow both blank lines and comment-lines.
A comment-line \textbf{begins} with a \textbf{\#} and the whole line is ignored by
remstats. Inline comments are \textbf{not} permitted as the '\textbf{\#}' is used
in some places for other purposes. The only files which don't permit comments
are the \texttt{view-templates}, which are html, and the last part of \texttt{datapages},
which are also html.
\texttt{alert-templates}, \texttt{customgraphs}, \texttt{datapages}, \texttt{scripts}, \texttt{rrds}, \texttt{hosts},
\texttt{host-templates}, \texttt{views} and
\texttt{view-templates} are sub-directories with the files within describing one
of that kind of entity. E.g., a file in the \texttt{hosts} sub-directory is
named for the host and contains that host's configuration.
\textbf{NOTE}: within the sub-directories, files with names beginning with 'IGNORE-'
or ending with '\~{}', will be ignored.
There are also a few tools (see the tools section) to help
you make and update your config-file, although not all parts of it.
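
The file-naming and comment rules above can be sketched as follows. This is an illustration only, not the remstats parser: the function names are hypothetical, and treating a line as a comment only when the \# is its very first character is an assumption about what "begins" means.

```python
from pathlib import Path


def is_ignored(filename):
    """True for names remstats is documented to skip in the sub-directories."""
    return filename.startswith("IGNORE-") or filename.endswith("~")


def entity_files(subdir):
    """Return the files in a config sub-directory that would be read, sorted."""
    return sorted(
        p for p in Path(subdir).iterdir()
        if p.is_file() and not is_ignored(p.name)
    )


def significant_lines(text):
    """Drop blank lines and comment-lines; keep everything else unchanged.

    A '#' later in a line is kept, since inline comments are not permitted.
    """
    lines = []
    for line in text.splitlines():
        if not line.strip():
            continue  # blank line
        if line.startswith("#"):
            continue  # comment-line (assumed: '#' in the first column)
        lines.append(line)
    return lines
```

Keeping the \# inside a line intact matters because, as noted above, it is used for other purposes in some config-files.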
%------------------------------------ configfile-alerts.pod ---
\section{Configuration - Alerts}%
\index{Configuration - Alerts}
The alerts config-file is used by the alert-monitor (see the alert-monitor section) to decide who
is sent alerts about problems. The rrds (see the rrds section) and hosts (see the hosts section)
config-files decide \texttt{when} an alert needs to be raised, and these lines tell \texttt{who}
gets the alert.
Each line is in seven parts, most of which are patterns, e.g.:
\begin{verbatim}%
warn * MISC UPTIME 0 0 uptime-alerts
error silverlock.dgim.crc.ca * * 600 900 test-alerts
error news.crc.ca * * 600 900 news-alerts
critical * * * 600 900 critical-alerts\end{verbatim}
The first "word" is the status, as decided by the alert-monitor (see the alert-monitor section) .
The second word is a regex to match against the hostname. The third is
a regex to match against the rrdname. The fourth is a regex for the
variablename. The fifth is the minimum time for the alert condition
to be present before an alert can be triggered. The sixth is the interval
after sending an alert before another will be sent. The seventh is
an \texttt{alert-destination} as specified in the
alert-destination-map (see the alert-destination-map section) .
Note: The seventh used to be an alert program and there was an eighth which was
an address, of a form appropriate to the alert program. This has been
rolled into the \texttt{alert-destination-map} to make it more flexible.
If the current condition matches the status, host, rrd, and variable,
then alert-monitor has to look at the times. If this is a new
condition (i.e. it was in OK status previously), then an alert won't
be triggered until after the minimum time has passed. This avoids transient
problems being reported. If you want these to be reported, then set it to zero.
If this is an old alert, then an alert won't be
triggered until the interval time has passed since the previous alert.
If the interval is 0 (zero) then there will only be one alert at the
start-time.
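
The matching and timing rules described above can be sketched like this. It is not the alert-monitor code, and several details are assumptions: the names are hypothetical, the status is compared case-insensitively, a bare \texttt{*} is treated as "match anything", other patterns are applied as unanchored regexes, and both times are taken to be in seconds.

```python
import re
from typing import NamedTuple


class AlertRule(NamedTuple):
    status: str
    host: str
    rrd: str
    var: str
    min_time: int      # how long the condition must last before the first alert
    interval: int      # gap between repeated alerts; 0 means alert only once
    destination: str


def parse_rule(line):
    """Split one alerts line into its seven parts."""
    fields = line.split()
    if len(fields) != 7:
        raise ValueError("expected 7 fields, got %d" % len(fields))
    status, host, rrd, var, min_time, interval, destination = fields
    return AlertRule(status, host, rrd, var, int(min_time), int(interval), destination)


def pattern_matches(pattern, value):
    """'*' matches everything (assumption); anything else is a regex search."""
    if pattern == "*":
        return True
    return re.search(pattern, value) is not None


def rule_matches(rule, status, host, rrd, var):
    return (rule.status.lower() == status.lower()
            and pattern_matches(rule.host, host)
            and pattern_matches(rule.rrd, rrd)
            and pattern_matches(rule.var, var))


def should_alert(rule, now, started, last_alert):
    """Decide whether to send an alert at time `now`.

    started is when the condition began; last_alert is when the previous
    alert for it was sent, or None if none has been sent yet.
    """
    if last_alert is None:
        return now - started >= rule.min_time
    if rule.interval == 0:
        return False  # only one alert, at the start
    return now - last_alert >= rule.interval
```

With the rule \texttt{error news.crc.ca * * 600 900 news-alerts}, a condition that began 100 seconds ago produces no alert yet, while one that began 700 seconds ago does.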
%------------------------------------ configfile-alert-destination-map.pod ---
\section{Configuration - alert-destination-map}%
\index{Configuration - alert-destination-map}
This config-file specifies the mapping from an abstract alert destination,
specified in the alerts config-file (see the alerts config-file section) , to the address(es) to
send it to. The alerter (see the alerter section) attempts to match the abstract alert destination
against each of the \texttt{map} lines and sends alerts to any which match.
There are three kinds of lines in this config-file: \texttt{map}, \texttt{alias}, and \texttt{method}.
\texttt{Map} lines map from an abstract alert destination, listed in the alerts
config-file, to a less abstract alias, listed here. The \texttt{alias} lines allow a
crude list capability and also permit the use of different methods to deliver the
alert. \texttt{Method} lines tell what program to run with what arguments in order to
deliver to that type of address.
\subsection{Map Lines }%
\index{Map Lines }
A map line looks like:
\begin{verbatim}%
map DEST TIME DOW DOM MON ALIAS\end{verbatim}
Where:
\begin{itemize}
\item DEST is an abstract alert destination, listed in the alerts config-file
\item TIME is a time-of-day specification, comma-separated time-ranges or '*'
meaning all times. A range looks like HHMM-HHMM.
\item DOW is a day-of-the-week spec, a comma-separated list of weekdays, in
numeric form (0=sunday, 1=monday, ...) or '*' for all weekdays.
\item DOM is a day-of-the-month spec. It's a comma-separated list of day-ranges,
where a range is a day or DD-DD, or '*' for all days.
\item MON is a month spec. It's a comma-separated list of month-ranges, in
numeric form, like MM or MM-MM or '*' for all months.
\item ALIAS is the alias that this DESTination maps to during the specified
time-period. It's defined in an alias line.
\end{itemize}
This permits different DESTinations to be sent to different people at different times,
depending on who's on duty.
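
As an illustration of how a map line's time fields could be evaluated, here is a minimal sketch. It is not the alerter's code: the function names are hypothetical, range ends are assumed to be inclusive, and DEST is compared for exact equality.

```python
from datetime import datetime


def _in_ranges(spec, value):
    """True if value falls in a comma-separated list of N or N-N ranges, or spec is '*'."""
    if spec == "*":
        return True
    for part in spec.split(","):
        low, _, high = part.partition("-")
        if int(low) <= value <= int(high or low):
            return True
    return False


def map_line_applies(line, dest, when):
    """Does this 'map DEST TIME DOW DOM MON ALIAS' line apply to dest at `when`?"""
    fields = line.split()
    if len(fields) != 7 or fields[0] != "map":
        raise ValueError("not a map line: %r" % line)
    _, line_dest, time_spec, dow, dom, mon, _alias = fields
    if line_dest != dest:
        return False
    hhmm = when.hour * 100 + when.minute
    # Python numbers Monday as 0; the map file uses 0=sunday, 1=monday, ...
    weekday = (when.weekday() + 1) % 7
    return (_in_ranges(time_spec, hhmm)
            and _in_ranges(dow, weekday)
            and _in_ranges(dom, when.day)
            and _in_ranges(mon, when.month))
```

Comparing times as HHMM integers keeps the range test simple: 15:18 becomes 1518, which lies inside 1400-2159.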
\subsection{Alias Lines}%
\index{Alias Lines}
An alias line looks like:
\begin{verbatim}%
alias ALIAS METHOD:ADDR ...\end{verbatim}
Where:
\begin{itemize}
\item ALIAS is the alias being defined
\item METHOD is an alert-delivery method (see methods below)
\item ADDR is an address which is valid for that method
\end{itemize}
This indirection permits delivery of the same alert via multiple methods, in case
one or more of the methods isn't available, as well as to different people.
\subsection{Method Lines}%
\index{Method Lines}
A method line looks like:
\begin{verbatim}%
method METHOD COMMAND-LINE\end{verbatim}
Where:
\begin{itemize}
\item METHOD is the method being defined
\item COMMAND-LINE is the program to run with any arguments it requires.
It will be passed the alert message on stdin and the address
to send it to at the end of the COMMAND-LINE.
\end{itemize}
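
The calling convention for a method (message on stdin, address appended to the COMMAND-LINE) can be sketched as follows. This is an illustration only: the function names are hypothetical, and running the program directly rather than through a shell is an assumption, not a statement about how the alerter does it.

```python
import shlex
import subprocess


def delivery_argv(command_line, address):
    """Build the argument list: the method's COMMAND-LINE followed by the address."""
    return shlex.split(command_line) + [address]


def deliver(command_line, address, message):
    """Run the delivery program with the alert message on stdin.

    Returns the program's exit status.
    """
    result = subprocess.run(
        delivery_argv(command_line, address),
        input=message,
        text=True,
        check=False,
    )
    return result.returncode
```

A custom delivery program written against this convention reads the message from stdin and takes the address as its last argument.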
\subsection{An Example}%
\index{An Example}
We have three guys on different shifts who manage network operations (Tom, Dick and Harry)
during the week. On the week-end Frank is on call. We also want to email a copy to
an email address which collects all the alerts.
We want to send the alerts to whoever is working at the time. Say the abstract destination
specified in the alerts config-file (see the alerts config-file section) is \texttt{alerts}. We might use lines
like those below:
\begin{verbatim}%
map alerts 0600-1359 1,2,3,4,5 * * tom
map alerts 1400-2159 1,2,3,4,5 * * dick
map alerts 0000-0559,2200-2359 1,2,3,4,5 * * harry
map alerts * 0,6 * * frank\end{verbatim}
\begin{verbatim}%
alias tom email:tom@our.com email:alert-history@our.com winpopup:console
alias dick email:dick@our.com email:alert-history@our.com winpopup:console
alias harry email:harry@our.com email:alert-history@our.com winpopup:console
alias frank page:555-1234 email:alert-history@our.com winpopup:console\end{verbatim}
\begin{verbatim}%
method email /home/remstats/bin/alert-email
method winpopup /home/remstats/bin/alert-winpopup
method page /home/remstats/bin/alert-page\end{verbatim}
Note: the hypothetical \texttt{page} method isn't provided with remstats. There are lots of different
programs to send pages. Look at alerter (see the alerter section) if you want to add your own methods; it's easy.
%------------------------------------ configfile-alert-template-map.pod ---
\section{Configuration - alert-template-map}%
\index{Configuration - alert-template-map}
The \texttt{alert-template-map} tells which
alert-template (see the alert-template section) to use for which addressee or
RRD. The lines look like:
\begin{verbatim}%
address regex template\end{verbatim}
or
\begin{verbatim}%
rrd rrd template\end{verbatim}
or
\begin{verbatim}%
rrd rrd:variable template\end{verbatim}
Addresses are checked first. This is to allow special mapping for devices
like pagers which can't display a lot of information. If none of the
special addresses match, then RRDs are checked, first with variables
then without. An RRD can be an RRD instance, like \texttt{port-ftp}, or the
wild RRD, e.g. \texttt{port-*}.
The \texttt{template} is the name of the template file in the \texttt{alert-templates}
config-dir.
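
The lookup order (addresses first, then RRDs with a variable, then RRDs alone) can be sketched like this. It is not the remstats code: the function name is hypothetical, wild RRDs are matched with shell-style wildcards as an assumption, and falling back to the DEFAULT template when nothing matches follows the description of the \texttt{alert-templates} directory.

```python
import re
from fnmatch import fnmatchcase


def choose_template(map_lines, address, rrd, variable):
    """Pick a template name for an alert from alert-template-map lines."""
    address_rules, var_rules, rrd_rules = [], [], []
    for line in map_lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # blank, comment or malformed line
        kind, pattern, template = fields
        if kind == "address":
            address_rules.append((pattern, template))
        elif kind == "rrd" and ":" in pattern:
            var_rules.append((pattern, template))
        elif kind == "rrd":
            rrd_rules.append((pattern, template))

    # 1. special addresses, e.g. pagers that cannot show much text
    for pattern, template in address_rules:
        if re.search(pattern, address):
            return template
    # 2. rrd:variable entries
    for pattern, template in var_rules:
        rrd_pattern, _, var = pattern.partition(":")
        if fnmatchcase(rrd, rrd_pattern) and var == variable:
            return template
    # 3. rrd entries without a variable
    for pattern, template in rrd_rules:
        if fnmatchcase(rrd, pattern):
            return template
    return "DEFAULT"
```

Checking addresses before RRDs means a pager address gets its short template even when a more detailed template exists for the RRD that raised the alert.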
%------------------------------------ configfile-alert-templates.pod ---
\section{Configuration - alert-templates}%
\index{Configuration - alert-templates}
The \texttt{alert-templates} directory contains the alert message templates. They
are just text files with cookies (see the cookies section) in them. The cookies available are
slightly different than the standard list, but they work the same way. You
put \#\#COOKIENAME\#\# wherever you want to see the value of the 'cookiename'
variable. The ones available for alerts are:
\begin{itemize}
\item HOST - the host for the alert
\item REALRRD - the RRD instance for the alert
\item FIXEDRRD - the RRD instance with the character-set translated a bit
for file-names and message-id's
\item VAR - the variable name
\item STATUS - the alert status (OK, WARN, ERROR, CRITICAL)
\item VALUE - the value of the variable that caused this alert
\item RELATION and THRESHOLD - the alert is triggered when the VALUE is
no longer in RELATION to the THRESHOLD value.
\item START - when the alert was first noticed
\item DURATION - how long the alert has been in this STATUS
\item HOSTDESC - the description line for this host
\item RRDDESC - the description for this instance of the RRD
\item NOW - the current time as a unix timestamp
\item NOWTEXT - the current time for email headers
\item ALERTHOST - the hostname of the host sending the alert
\item TOWHO - the addressee for this alert
\end{itemize}
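As a rough illustration of how the cookie substitution behaves, here is a minimal Python sketch. It is not the remstats implementation; the function name is hypothetical, and leaving unknown cookies untouched is an assumption.

```python
import re


def fill_template(text, values):
    """Replace each ##NAME## cookie in text with values["NAME"].

    Cookies with no entry in values are left as they are (an assumption).
    """
    def replace(match):
        name = match.group(1)
        return str(values.get(name, match.group(0)))

    return re.sub(r"##([A-Z]+)##", replace, text)


# Example values are invented for the illustration.
message = fill_template(
    "##HOST## ##REALRRD## ##VAR## is ##STATUS## (value ##VALUE##)",
    {"HOST": "www.example.com", "REALRRD": "port-http", "VAR": "status",
     "STATUS": "WARN", "VALUE": 2},
)
```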
There are three special files in the \texttt{alert-templates} directory,
which \textbf{must} exist:
\begin{itemize}
\item DEFAULT - which contains the default template to be used when no
other matches the alert-template-map (see the alert-template-map section) .
\item HEADERS - which supplies the headers for each message, with
the same substitutions as the rest of the template files. Make very sure
that the HEADERS file ends with or contains an empty line or your
message will be interpreted as part of the headers and will
undoubtedly look wrong. The alert-email (see the alert-email section) script does not check this.
\item FOOTER - supplies a standard ending for each message.
\end{itemize}%------------------------------------ configfile-archives.pod ---
\section{Configuration - Archives}%
\index{Configuration - Archives}
The archives file names various data-retention periods. RRDtool (see \textbf{http://ee-staff.ethz.ch/\~{}oetiker/webtools/rrdtool})
calls them RRAs. Each line is in two pieces: an archive name and an RRA
specification, exactly as documented in the rrdcreate manpage. Unfortunately,
modifying this after the rrd has been created isn't one of the things that RRDtool
does. I've got an rrd munger on my to-do list (see the to-do list section) , but it's still
not done.
For example:
\begin{verbatim}%
day-avg AVERAGE:0.1:1:600
week-avg AVERAGE:0.1:7:300
month-avg AVERAGE:0.1:30:300
3month-avg AVERAGE:0.1:90:300
year-avg AVERAGE:0.1:365:300\end{verbatim}
\begin{verbatim}%
day-min MIN:0.1:1:600
week-min MIN:0.1:7:300
month-min MIN:0.1:30:300
3month-min MIN:0.1:90:300
year-min MIN:0.1:365:300\end{verbatim}
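To see how much history each of these archives holds, multiply the number of steps per row by the number of rows and by the step of the rrd. The Python sketch below does that arithmetic; the function name is hypothetical, and the 300-second step is an assumption taken from the sample rrd definition elsewhere in these docs.

```python
def rra_span_seconds(rra_spec, step_seconds):
    """Return how many seconds of history an RRA such as 'AVERAGE:0.1:7:300' holds.

    The fields are CF:xff:steps:rows, as in the rrdcreate manpage.
    """
    cf, xff, steps, rows = rra_spec.split(":")
    return int(steps) * int(rows) * step_seconds


STEP = 300  # assumed step, in seconds

day_span = rra_span_seconds("AVERAGE:0.1:1:600", STEP)    # 1 * 600 * 300
week_span = rra_span_seconds("AVERAGE:0.1:7:300", STEP)   # 7 * 300 * 300
```

With a 300-second step, day-avg keeps 180000 seconds (50 hours) of data and week-avg keeps 630000 seconds (a little over 7 days).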
%------------------------------------ configfile-availability.pod ---
\section{Configuration - Availability}%
\index{Configuration - Availability}
There are two types of availability definitions: for an RRD or for an
RRD on a particular host. The RRD may also be a wildcard RRD, like
"df-*" or an instance of an RRD, like "df-/home". The definitions
look like:
\begin{verbatim}%
rrd RRDNAME VARNAME CF RELATION THRESHOLD\end{verbatim}
and
\begin{verbatim}%
host HOSTNAME RRDNAME VARNAME CF RELATION THRESHOLD\end{verbatim}
The CF is one of LAST, MIN, MAX or AVERAGE, with rrdtool's usual
meaning. The RELATION can be any one of: $<$, $<$=, $>$, $>$=, or =.
The THRESHOLD is the number that the value of VARNAME is compared with; the
value must stand in the given RELATION to it.
As an example, take the following definition:
\begin{verbatim}%
rrd ping rcvd MIN > 0\end{verbatim}
This means that the variable 'rcvd' in the ping rrd must be greater
than zero for it to be considered "available". All time intervals
where it isn't, or for which no data is available, are considered
"unavailable".
There are also two other record types: colors and thresholds. A colors
record looks like:
\begin{verbatim}%
colors COLOR1 ...\end{verbatim}
A thresholds line looks like:
\begin{verbatim}%
thresholds NUMBER ...\end{verbatim}
and must have the same number of values as the colors line. Only one of
each. Here's an example to make the use clear (I hope):
\begin{verbatim}%
colors avail1 avail2 avail3 avail4
thresholds 99 98 95 90\end{verbatim}
The \texttt{colors} line above requires that the colors 'avail1', ... be defined in the
colors (see the colors section) config-file. The \texttt{thresholds} line above specifies
that an availability of 99\% or above is shown in the 'avail1' color, one
from 98\% up to 99\% in the 'avail2' color, and so on.
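The two ideas in this section, computing an availability figure and picking a color for it, can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, the samples are assumed to cover equal time intervals, and returning no color below the lowest threshold is an assumption.

```python
import operator

RELATIONS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
             ">=": operator.ge, "=": operator.eq}


def availability_percent(samples, relation, threshold):
    """samples holds one value per time interval, or None where there is no data.

    Intervals with no data count as unavailable, as described above.
    """
    if not samples:
        return 0.0  # assumption: no intervals at all means nothing was available
    test = RELATIONS[relation]
    available = sum(1 for v in samples if v is not None and test(v, threshold))
    return 100.0 * available / len(samples)


def availability_color(percent, colors, thresholds):
    """Return the first color whose threshold is met; thresholds must be descending."""
    for color, threshold in zip(colors, thresholds):
        if percent >= threshold:
            return color
    return None  # below every threshold (assumption: no color applies)


colors = ["avail1", "avail2", "avail3", "avail4"]
thresholds = [99, 98, 95, 90]
percent = availability_percent([3, 5, 0, None], ">", 0)
```

Here two of the four intervals satisfy the relation, so the availability is 50 percent, which falls below every threshold in the example.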
%------------------------------------ configfile-colors.pod ---
\section{Configuration - Colors}%
\index{Configuration - Colors}
The \texttt{colors} config-file just gives names to colors so you don't have
to remember or decipher hex numbers. Each line is a color
name and a six-digit hex number giving the RGB values. The color
"down" is special and is used in the ping index to color the
background of hosts which are down.
For example:
\begin{verbatim}%
totalcolor 00a000
usedcolor a00000
downcolor a07777\end{verbatim}
%------------------------------------ configfile-customgraphs.pod ---
\section{Configuration - Customgraphs}%
\index{Configuration - Customgraphs}
The \texttt{customgraphs} directory contains files which define graphs which
aren't associated with a specific host. They are very much like a graph
defined on an rrd (see the rrd section) . They do have a times
line preceding the graph definition, to set the time-periods for that
graph. They are linked in under the \texttt{Custom Index}.
Note that the graph definition itself must be indented, like rrd graphs.
As an example:
\texttt{times day yesterday week 3month year}
\begin{verbatim}%
    --upper-limit 100 --lower-limit 0 --rigid
    --vertical-label 'CPU %'
    --title 'CPU Usage (##GRAPHTIME##)'
    DEF:silverlockuser=##DATADIR##/silverlock.dgim.crc.ca/cpu.rrd:user:AVERAGE
    DEF:silverlocksys=##DATADIR##/silverlock.dgim.crc.ca/cpu.rrd:system:AVERAGE
    DEF:loisuser=##DATADIR##/lois.dgim.crc.ca/cpu.rrd:user:AVERAGE
    DEF:loissys=##DATADIR##/lois.dgim.crc.ca/cpu.rrd:system:AVERAGE
    CDEF:silverlock=silverlockuser,silverlocksys,+
    CDEF:lois=loisuser,loissys,+
    'LINE2:silverlock###COLOR1##:silverlock'
    'LINE2:lois###COLOR2##:lois'\end{verbatim}
%------------------------------------ configfile-general.pod ---
\section{Configuration - General}%
\index{Configuration - General}
This is the miscellaneous config-file, but there are some critical pieces here:
\begin{itemize}
\item \texttt{datadir} (REQUIRED) -
The data for a given host is stored under \texttt{datadir/hostname}.
There are also other status files stored in this directory.
\item \texttt{staletime} (UNUSED) -
How long before we count a status as stale. (seconds)
\item \texttt{minuptime} -
How long a host must be up before it stops being flagged as recently up.
(seconds)
\item \texttt{keepalerts} (UNUSED) -
How long to keep records of alerts after the condition no longer exists.
(seconds)
\item \texttt{uptimealert} -
If this is set, the alert-monitor (see the alert-monitor section) will cause a warning level condition on the
fake rrd MISC for the fake variable UPTIME, for any host whose uptime
is less than this value. Whether this will trigger an alert depends on
the alerts (see the alerts section) file.
\item \texttt{pinger} -
If defined, this names the ping-collector (see the ping-collector section) to be used before all the
other collectors (see the collectors section) . (Unless you write your own, you put \texttt{ping-collector}
here.) If you don't include this line, then you'll want to make sure
that the \texttt{ping-collector} is listed in the \texttt{collectors} line (below).
\item \texttt{collectors} -
This line tells run-remstats (see the run-remstats section) which collectors to run. The default
list is all of them, so you can gain some benefit by paring this line
down to those you are using, but remember to update it if you add new
rrds (see the rrds section) that need other collectors later. \textbf{Note:} you
list the names of the collectors without the '\texttt{-collector}' on the
end. E.G. the \texttt{ping-collector} would be included as just '\texttt{ping}'.
\item \texttt{monitors} -
This line tells run-remstats (see the run-remstats section) which monitors (see the monitors section) to run. The default is
all of them.
\item \texttt{pagemakers} -
This tells run-remstats (see the run-remstats section) which pagemakers (see the pagemakers section) to run at the end, if the
config-dir has changed. The default is all of them.
\item \texttt{max-port-patterns} -
This tells the port-collector (see the port-collector section) how many parenthesised patterns there can
be, at most, in \texttt{valuepattern}s or \texttt{infopattern}s. The default is 10.
\item \texttt{watchdogtimer} -
This sets the limit that run-remstats (see the run-remstats section) will apply to each of the programs
that it runs, so that, e.g., a hanging collector will not hang the whole
remstats cycle.
\item \texttt{keeplogs} -
This tells how long cleanup (see the cleanup section) will permit old files to hang around, in seconds.
\end{itemize}%------------------------------------ configfile-groups.pod ---
\section{Configuration - Groups}%
\index{Configuration - Groups}
If any name has spaces, it must be 'quoted'
or "quoted". Any groups not listed here \textsl{will not be linked into the HTML
index pages generated by html-writer}, but pages will still be created for them.
If there is a file in the \texttt{htmldir} with the same name as a group-name, but
with the spaces replaced by '\_' and all the letters lower-case, followed by
'.html', then the group-names in html pages will be linked to those pages.
E.G. for a group named \texttt{"Local Routers"}, if there is a file called
\texttt{local\_routers.html}, the name of the group will be made into a link to that
file.
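The file-name rule above amounts to the following small Python sketch. The function name is hypothetical and this is not the code used by html-writer.

```python
def group_page_name(group_name):
    """Turn a group name into the page name looked for in htmldir."""
    # Spaces become underscores, letters become lower-case, then add '.html'.
    return group_name.replace(" ", "_").lower() + ".html"


page = group_page_name("Local Routers")
```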
%------------------------------------ configfile-hosts.pod ---
\section{Configuration - Hosts}%
\index{Configuration - Hosts}
The \texttt{hosts} files are what the whole configuration has been working
toward. Here we tell which hosts we're interested in and what we want to
monitor. Here's a sample host file called \texttt{clark.dgim.crc.ca}:
\begin{verbatim}%
desc DNS and Web
ip 142.92.39.18
aliases ns1.crc.ca
via 142.92.32.10
group Servers
contact Thomas Erskine <thomas.erskine@crc.ca>
tools ping traceroute telnet http clark-special:special
rrd ping
rrd cpu
noalert cpu user
community xyzzy
rrd load
nograph load users
rrd if-le0
alert if-le0 ierr < 1000 5000 10000
alert if-le0 in WARN
rrd df-/var
rrd df-/tmp
rrd port-http critical
rrd port-ssh
rrd port-whois
noavailability port-whois status
noavailability port-whois response
rrd port-domain critical\end{verbatim}
The name of the file (\texttt{clark.dgim.crc.ca}) is the host that you're
interested in. The name should be a fully-qualified-domain-name, but anything
which perl's gethostbyname can resolve should work.
The \texttt{ip} line saves the IP number from having to be looked up and
could be used to deal with hosts which aren't in the DNS. If you want the
IP number to be looked up each time, you can leave this line out.
The \texttt{desc} line gives this host a description which graph-writer (see the graph-writer section)
will put on pages about this host.
The \texttt{aliases} line tells remstats about other names for this host. This
is mainly for the \texttt{ping-collector} to allow it to tell for sure when
it has got a response from this host.
The \texttt{via} line is used by the topology-monitor (see the topology-monitor section) to specify networking
gear (like hubs and switches) which are in the path to the host, but won't
show up in a traceroute.
The \texttt{group} line is required and tells which group this host
belongs to. Remember, you defined all the groups back in the
general (see the general section) file?
The \texttt{contact} line tells who to contact for this host. If a line in
the alerts config-file (see the alerts config-file section) refers to a
recipient called \texttt{CONTACT}, the value of the host's contact line
will be substituted.
The \texttt{tools} line tells which tools (defined in the
tools config-file (see the tools config-file section) )
you want to appear for this host. E.G. if a host doesn't have a
web-server, there's no point in providing a link to connect to it.
To accommodate host-specific tools, a toolname can be given as
\texttt{real-tool-name:display-name}. This means that the tool will be defined
in the \texttt{tools} config-file as \texttt{real-tool-name}, but will be displayed
as \texttt{display-name}.
The rrd (see the rrd section) lines tell which rrds to collect for this host.
If the rrd was defined as a wildcard, it will have the instance
specified here. In the example, three of the wildcard lines refer
to \texttt{if-le0}, \texttt{df-/var} and \texttt{df-/tmp}. The first is looking at the data
for network interface le0 and the others are getting data on the /var and
/tmp file-systems, respectively.
The first \texttt{alert} line is setting the alert thresholds for the \texttt{ierr}
variable of \texttt{if-le0} to 1000, 5000 and 10000. If this host file was from the same configuration as the previous
\texttt{rrd} sample, the alert here would override the one in the \texttt{rrd} file.
There is also a \texttt{noalert} line, which cancels an alert set in the
rrd without setting a replacement alert. The alert line for a host
must specify the rrd as well, but is otherwise the same as an alert on an
rrd.
The second \texttt{alert} line is specifying the status (\texttt{WARN}) for missing data
for the \texttt{in} variable.
There can also be descriptions for rrds. If you append to an
\texttt{rrd} line something like \texttt{desc='xyzzy'}, then you'll see
that description on pages dealing with it. I added this for labelling
network interfaces, but you can use it for anything you want.
The \texttt{community} specifies the SNMP community string to use for this
host to fetch SNMP data. If the host config-file doesn't specify any RRDs
collected by the snmp-collector, you don't need to specify a community.
If this host uses any rrds collected by the snmp-collector, it can also
specify a port to use like:
\begin{verbatim}%
snmpport 3401\end{verbatim}
If the RRD itself specifies a port, then the RRD-specified port will be
used instead, for that RRD.
If you don't want a particular graph for this host, you can include a
\texttt{nograph} line. It looks like:
\begin{verbatim}%
nograph rrdname graphname\end{verbatim}
There can also be a \texttt{statusfile} line, looking like:
\begin{verbatim}%
statusfile NNN\end{verbatim}
with \texttt{NNN} replaced by the name of a status file from that host's data directory.
This permits the main index pages to show the status of an un-pingable host as the
status of something else, like the reachability of its web-server (STATUS-port-http).
The \texttt{noavailability} lines tell the availability-report (see the availability-report section) program not to report on
certain rrd/variable combinations. In this case, we don't want to see availability stats
on the whois server. Maybe it's too embarrassing?
%------------------------------------ configfile-host-templates.pod ---
\section{Configuration - Host Templates}%
\index{Configuration - Host Templates}
These config-files simply hold the same kind of lines as a
host config-file (see the host config-file section) . By adding a line like:
\begin{verbatim}%
template some-host-template\end{verbatim}
to a host config-file, you achieve the same effect as adding all
the lines contained within the template file. If you have many
hosts which are similar, this can be a useful way of keeping the
configuration consistent.
It can also be used to parameterize things. As an example, if you
are using the nt-status-server (see the nt-status-server section) and are only running it on a single
host which is providing information on various other NT hosts, you
might make a template, say \texttt{default-nt-status-server} like:
\begin{verbatim}%
nt-status-server my.nt.status.server\end{verbatim}
and replace the \texttt{nt-status-server} lines in those hosts with:
\begin{verbatim}%
template default-nt-status-server\end{verbatim}
Then if you want to change which machine is running the \texttt{nt-status-server},
you'd just have to change the template.
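The effect of a \texttt{template} line can be pictured with the short Python sketch below. It is an illustration under stated assumptions, not the remstats code: the function name is hypothetical, templates are held in a dictionary rather than read from files, and templates which themselves contain \texttt{template} lines are not expanded.

```python
def expand_templates(host_lines, templates):
    """Replace each 'template NAME' line with the lines of that template.

    templates maps a template name to its list of lines (a stand-in for
    reading the template file). Expansion is one level deep only.
    """
    expanded = []
    for line in host_lines:
        words = line.split()
        if len(words) == 2 and words[0] == "template":
            expanded.extend(templates[words[1]])
        else:
            expanded.append(line)
    return expanded


templates = {
    "default-nt-status-server": ["nt-status-server my.nt.status.server"],
}
host_lines = ["desc NT file server", "template default-nt-status-server", "rrd ping"]
result = expand_templates(host_lines, templates)
```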
%------------------------------------ configfile-html.pod ---
\section{Configuration - HTML}%
\index{Configuration - HTML}
The \texttt{html} file defines stuff related to web-page generation. There
are several different kinds of information.
\subsection{Locations }%
\index{Locations }
These things define where things are, like URLs. They are:
\begin{itemize}
\item \texttt{htmldir} (REQUIRED) -
The html stuff for a given host is stored under \texttt{htmldir/hostname}.
\item \texttt{htmlurl} (REQUIRED) -
How to refer to the \texttt{htmldir} in a URL.
\item \texttt{viewdir} -
Where to store the views, in case you don't want them under the \texttt{htmldir}.
\item \texttt{viewurl} -
How to refer to the \texttt{viewdir} in a URL.
\item \texttt{webmaster} (REQUIRED) -
Who's in charge of these web-pages, an email address to get stuffed into
mailto URLs.
\item \texttt{logourl} -
Where the logo for this site is.
\item \texttt{homeurl} -
Where home is for this site.
\item \texttt{topurl} -
Where top goes for this site.
\item \texttt{rrdcgi} -
Where to find the rrdcgi program. I like to link it and rrdtool into
\texttt{/usr/local/bin}, for ease of use.
\item \texttt{motdfile} -
Where to find the Message-Of-The-Day file. This is used to add in
announcements at the top of the index pages, except the host index.
\end{itemize}
\subsection{"How-To's" }%
\index{"How-To's" }
\begin{itemize}
\item \texttt{thumbnail} -
How big the graph portion of a thumbnail image is to be (WIDTHxHEIGHT)
\item \texttt{metadata} -
Where to store CERN-style meta-data, to set expiry times for the gifs.
(METADIR METASUFFIX)
\item \texttt{background} -
What the background should look like. It's mostly obsolete, because
you can get most of the same effects by editing the \texttt{default.css}
style file instead.
\item \texttt{htmlrefresh} -
How often to cause the web pages to refresh themselves. (seconds)
\item \texttt{upstatus}, \texttt{upunstablestatus}, \texttt{downunstablestatus}, \texttt{downstatus},
\texttt{okstatus}, \texttt{warnstatus}, \texttt{errorstatus}, \texttt{criticalstatus} -
HTML to display for various statuses. The defaults use $<$span style="xxx"$>$ tags.
\item \texttt{viewindices} -
Should view-writer (see the view-writer section) write the index links at the top of view pages?
(yes or no)
\item \texttt{showinterfaces} -
Should graph-writer (see the graph-writer section) show interfaces on a host page?
(yes or no)
\item \texttt{keepimages} -
How long cleanup (see the cleanup section) will permit old images to hang around, in seconds.
\item \texttt{default-tools} -
What tools to show for a host which doesn't specify any.
\end{itemize}
\subsection{Markers }%
\index{Markers }
This group supplies html to wrap various things in the generated web-pages.
\begin{itemize}
\item indexprefix, indexsuffix - for the items on the \texttt{Indices} line
of the header
\item groupprefix, groupsuffix - for the group names on the various indices
\item hostprefix, hostsuffix - for the host names on the various indices
\item toolprefix, toolsuffix - for the tool names on the toolbar
\item linkprefix, linksuffix - for the links in the footer
\item outofrangeprefix, outofrangesuffix - for the current value on the
availability pages when it has gone outside the specified bounds. (See
availability-report (see the availability-report section) .)
\end{itemize}
\subsection{Labels }%
\index{Labels }
If you translate the labels, the generated web-pages will be translated too. This
doesn't cover error messages or debugging messages.
The currently available ones, with their defaults, are:
\begin{verbatim}%
alertreport Alert Report
comment Comment
contact Contact
customindex Custom Index
description Description
groupindex Group Index
hardware Hardware
hostindex Host Index
indices Indices
ipnumber IP #
lastupdateon This page last updated on
links Links
logreport Log Report
operatingsystem Operating System
overallindex Overall Index
pingindex Ping Index
quickindex Quick Index
status Status
tools Tools
uptime Uptime
viewindex View Index\end{verbatim}
There are also:
\begin{itemize}
\item \texttt{uptimeflag} - shows on some index pages when a host has
been up for less than \texttt{minuptime} (defined in the
general (see the general section) file.)
\item \texttt{alertflagwarn}, \texttt{alertflagerror} and \texttt{alertflagcritical} -
give HTML to be inserted in the quick index for hosts which have
alerts active.
\end{itemize}%------------------------------------ configfile-links.pod ---
\section{Configuration - Links}%
\index{Configuration - Links}
The links config-file supplies links that you want to put with the standard
links at the bottom of the web pages. They're in two pieces: the text
to be shown and the URL to link to.
An example:
\begin{verbatim}%
SourceWorks http://www.sourceworks.com/\end{verbatim}
%------------------------------------ configfile-oids.pod ---
\section{Configuration - OIDs}%
\index{Configuration - OIDs}
[These are for SNMP and you can ignore this config-file if you're not interested.]
The SNMP implementation in the snmp-collector is primitive and only knows
about OIDs (Object IDs) by their number. Since I'm not interested in bringing
in a full MIB compiler to deal with the MIBs, this section lets you specify
names for the OID numbers you're interested in using later. The lines look
like:
\begin{verbatim}%
CiscoCpuLoad 1.3.6.1.4.1.9.2.1.58.0\end{verbatim}
for a non-hypothetical example, if you happen to have Cisco routers. If you
have the ucd-snmp package, their snmptranslate program comes in handy for
pulling out the appropriate numbers without the bother of tracking through
the wretched MIBs.
%------------------------------------ configfile-remotepings.pod ---
\section{Configuration - remotepings}%
\index{Configuration - remotepings}
The \texttt{remotepings} file simply lists all the machines which are
running the remoteping-server. Or at least all the machines
that you want to query.
%------------------------------------ configfile-rrds.pod ---
\section{Configuration - RRDs}%
\index{Configuration - RRDs}
These files are the most complicated. Here's an example, again
taken from the \texttt{if-} rrd supplied with remstats.
\begin{verbatim}%
source unix-status
step 300
data in=interface_packets_in:* COUNTER:600:0:U
data ierr=interface_errors_in:* COUNTER:600:0:U
data out=interface_packets_out:* COUNTER:600:0:U
data oerr=interface_errors_out:* COUNTER:600:0:U
data coll=interface_collisions:* COUNTER:600:0:U
alert in < 100
alert out < 100
alert in nodata WARN
archives day-avg week-avg month-avg 3month-avg year-avg
times day yesterday week 3month year
graph if-* desc='Interface data for ##RRD##'
    --title 'Interface ##RRD## ##GRAPHTIME##'
    --lower-limit 0
    --vertical-label 'packets'
    DEF:in=##DB##:in:AVERAGE
    DEF:out=##DB##:out:AVERAGE
    DEF:ierr=##DB##:ierr:AVERAGE
    DEF:oerr=##DB##:oerr:AVERAGE
    'LINE1:in###COLOR1##:Input Packets'
    'LINE1:out###COLOR2##:Output Packets'
    'LINE1:ierr###COLOR3##:Input Errors'
    'LINE1:oerr###COLOR4##:Output Errors'\end{verbatim}
This example shows most things that can be done, except multiple graphs on
the same rrd, which is as simple as adding another graph line and its
definition.
First, the rrd name is special, in this case. Any rrd file which ends in a '-'
is assumed to be for a wildcard rrd, in this case \texttt{if-*}. This avoids
problems with file-systems which are overly fussy about which characters can
be in file-names.
This rrd definition
will match any rrd beginning with 'if-' specified in a host config-file.
Wildcard rrds are necessary when a given host may have more than one of
whatever the rrd is referring to, in this case network interfaces. The
network interface name will replace the '*' in the rrd line in the host config-file.
It will also be available in the \texttt{\#\#WILDPART\#\#} magic cookie (see the magic cookie section) .
The \texttt{source unix-status} means that this RRD gets its data from the
unix-status-collector (see the unix-status-collector section) .
The \texttt{step} line sets the step value for the rrd. This is the expected
frequency of data updates. (See the manpage for
rrdcreate.) N.B. Setting this is required, but changing it for some RRDs won't
change how often the collectors run. If you have a significant number of RRDs
which require different update periods, you've got a choice. If it's
not very "expensive" to do those queries every time, then just ignore any
complaints from run-remstats about updates failing. Otherwise it gets messy:
you've got to set up three separate config-dirs. Two of them, one for each
update period, run only collectors and are run out of cron at the appropriate
intervals. The third runs the monitors and pagemakers.
The \texttt{data} lines define various DS elements for this RRD. [See the
manpage for rrdcreate.] The first part is the DS name, with an extension.
The collectors produce long names and may have instance-names added to the
variable name, in this case to tell which interface this data is for. So
the first part looks like \texttt{dsname=variable:instance}. The
\texttt{dsname} is used for the RRD DS name and the \texttt{variable:instance}
part is used to tell updater which collector information applies to this DS.
The rest of the line is straight from rrdcreate's description of DS.
It's also possible to invoke configuration-supplied private functions (see the private functions section)
on the incoming raw data. The \texttt{data} line would look like:
\begin{verbatim}%
data xyzzy=&function(variablename) ...\end{verbatim}
It's your responsibility to make sure that \texttt{function} is available and that it
works.
The \texttt{alert} lines are setting the thresholds for alerts, in this
case for the variables \texttt{in} and \texttt{out}. They must
specify, in order: the variable-name, the relation ($<$, =, $>$, delta$<$ and delta$>$)
and a space-separated list of thresholds. Since these ones only
provide one number each, they can only have OK or WARN statuses. If the
variables \texttt{in} or \texttt{out} have values less than ($<$) 100, they
are considered to be OK. Otherwise they're elevated to WARN status.
What will happen when they go into WARN status depends on the
alerts (see the alerts section) file. These alerts will apply to any host
which uses this rrd, unless it overrides it.
The last alert specifies that missing data for the variable \texttt{in} will be
considered to be status \texttt{WARN}, for purposes of generating alerts.
The full description of the alerts is kept in the docs for alert-monitor (see the alert-monitor section)
as it is the program which implements them.
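As a rough picture of how a threshold list might translate into a status, here is a Python sketch. It is based on an assumption which the text above does not spell out: that each further threshold adds one more level, in the order WARN, ERROR, CRITICAL. It handles only the plain less-than and greater-than relations, and the function name is hypothetical. The alert-monitor docs remain the authority.

```python
import operator

RELATIONS = {"<": operator.lt, ">": operator.gt}
STATUSES = ["OK", "WARN", "ERROR", "CRITICAL"]


def alert_status(value, relation, thresholds):
    """Return the status for value, given a relation and up to three thresholds.

    The value is OK while it is in the relation to the first threshold.
    Each threshold it fails to satisfy raises the status by one level
    (assumed interpretation of a multi-threshold alert line).
    """
    test = RELATIONS[relation]
    for level, threshold in enumerate(thresholds):
        if test(value, threshold):
            return STATUSES[level]
    return STATUSES[len(thresholds)]


single = alert_status(150, "<", [100])               # one threshold: OK or WARN only
triple = alert_status(7000, "<", [1000, 5000, 10000])
```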
The \texttt{archives} line tells how to keep the data for this rrd, using
the names defined in the archives (see the archives section) file.
There can be multiple \texttt{graph} lines describing as many graphs from
the data in this rrd as you want. The graph-name must be wildcarded if the
rrd is. A \texttt{graph} line is followed by its definition which must be
indented. The definition is straight from rrdgraph with the
magic cookie (see the magic cookie section) substitution. If you want a
description, you can add:
\begin{verbatim}%
desc='whatever you want'\end{verbatim}
or
\begin{verbatim}%
desc="whatever you want"\end{verbatim}
to the \texttt{graph} line. This is used to set the alt text on the web-page.
\subsection{Collector-specific Stuff}%
\index{Collector-specific Stuff}
An rrd collected by the port-collector (see the port-collector section) may specify that this particular
service is critical, by simply including the word "critical" at the end of
line. This will cause the status to be elevated to CRITICAL status if
the status ever reaches ERROR level.
An rrd collected by the log-collector (see the log-collector section) will have extra stuff
on each data line after the DS information. The extra stuff will be
the function and pattern needed by log-collector to pass to the
log-server (see the log-server section) to get that variable's data.
An RRD collected by the snmp-collector (see the snmp-collector section) needs to specify which OIDs to fetch.
They are specified by name in the RRD with a line like:
\begin{verbatim}%
oid APCUpsAdvInputLineVoltage\end{verbatim}
which refers to a name defined earlier in the oids (see the oids section) file.
An RRD collected by the snmp-collector may also specify an SNMP port to use
with a line like:
\begin{verbatim}%
port 3401\end{verbatim}
%------------------------------------ configfile-scripts.pod ---
\section{Configuration - Scripts}%
\index{Configuration - Scripts}
The \texttt{script XXX} files describe how to query a given port for
its status and are used by the port-collector (see the port-collector section) . They look like:
\begin{verbatim}%
send GET / HTTP/1\.0\n\n
timeout 5
port 80
infopattern ^Server:\s+(.*)$
valuepattern ^Content-length:\s*(\d+)
ok ^HTTP/\d\.\d 200
warn ^HTTP/\d\.\d [45]\d\d\end{verbatim}
This example is taken from the supplied \texttt{config-base} and queries an
HTTP server for its root page. First, it sends the "send" text, which
in this case is a minimal HTTP request, and waits no more than 5 seconds.
After the port is closed from the remote end, or the timeout expires,
any text which was returned is examined by the various tests. In this
case, if the web-server sends back a line beginning something like
"HTTP/1.1 200", the port will be marked as "OK". Similarly, there are
"warn", "error" and "critical" statuses possible.
The \texttt{port} is optional and \texttt{getservbyname} will be called on the
script name, if port isn't specified. This also lets you have multiple
scripts for the same port, using different names for the script.
The \texttt{infopattern} is optional, and supplies a pattern which will be matched
against each line in the result. If there is a match, files will be created
in the data directory for that host called \texttt{INFOn-rrdname}, where \texttt{n} will
be in the range 1..9 and \texttt{rrdname} will be the name of this rrd, converted to
a file-name. The files will contain matches for parenthesised items in the
regular expression. E.G. in the example above, a file will be created called
\texttt{INFO1-http} which will contain whatever the web-server said its
type and version was.
Similarly, the \texttt{valuepattern} is also optional, but the matches will be returned
as collected items called \texttt{value1} through \texttt{value9}. In the example, this
would cause the collector to return a line like:
\begin{verbatim}%
hostname timestamp value1 1022\end{verbatim}
An RRD definition could use this by including a line like:
\begin{verbatim}%
data pagesize=value1 GAUGE:600:0:10000\end{verbatim}
For a working example, look at the RRD definition for weathernetwork.
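To make the pattern handling concrete, here is a Python sketch which applies the patterns from the HTTP example to a canned response. It is not the port-collector itself: the function name, the dictionary form of the script, the sample response, and the choice to test ok before warn are all assumptions.

```python
import re


def evaluate_response(response, script):
    """Apply a script's patterns to the text returned by a port.

    script is a dict with optional "infopattern", "valuepattern", "ok" and
    "warn" regular expressions (a hypothetical in-memory form of the file).
    """
    lines = response.splitlines()
    result = {"status": None, "info": [], "values": []}
    # Each line is matched; parenthesised items are collected in order.
    for line in lines:
        for key, target in (("infopattern", "info"), ("valuepattern", "values")):
            pattern = script.get(key)
            if pattern:
                match = re.search(pattern, line)
                if match:
                    result[target].extend(match.groups())
    # The first status whose pattern matches any line wins (assumed order).
    for level in ("ok", "warn"):
        pattern = script.get(level)
        if pattern and any(re.search(pattern, line) for line in lines):
            result["status"] = level.upper()
            break
    return result


http_script = {
    "infopattern": r"^Server:\s+(.*)$",
    "valuepattern": r"^Content-length:\s*(\d+)",
    "ok": r"^HTTP/\d\.\d 200",
    "warn": r"^HTTP/\d\.\d [45]\d\d",
}
sample = "HTTP/1.0 200 OK\r\nServer: Apache/1.3.20\r\nContent-length: 1022\r\n\r\n<html></html>\n"
outcome = evaluate_response(sample, http_script)
```

In this run the first entry of info corresponds to what would be written to the INFO1 file, and the first entry of values to what would be returned as value1.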
%------------------------------------ configfile-times.pod ---
\section{Configuration - Times}%
\index{Configuration - Times}
The \texttt{times} file specifies time intervals for which graphs will be made.
I suppose it should be renamed graphtimes or something, but I've got other
things to do. Each line is in three pieces: a time name, a start time and
an end time. The times are relative to the current time and so will always
be non-positive.
The currently defined times are:
\begin{verbatim}%
thumb -60*60*2 0
day -60*60*24 0
week -60*60*24*7 0
month -60*60*24*30 0
3month -60*60*24*30*3 0
year -60*60*24*365 0
yesterday -60*60*24*2 -60*60*24
lastweek -60*60*24*7*2 -60*60*24*7\end{verbatim}
\section{Note: }%
\index{Note: }
The times \texttt{thumb} and \texttt{day} are special. The graph-writer (see the graph-writer section) expects
them to exist and to have certain meanings. The \texttt{thumb} time is a short
interval which is used to make the ping thumbnail graphs for the ping index.
The \texttt{day} time is the default time interval. The higher-level pages will
use the \texttt{day} graph as a link to the other time intervals.
%------------------------------------ configfile-tools.pod ---
\section{Configuration - Tools}%
\index{Configuration - Tools}
The \texttt{tools} file is only used by graph-writer (see the graph-writer section) to create toolbars. Each
line is in two pieces: a tool-name and a URL to link to for this tool.
The URL can have magic cookies (see the magic cookies section) in it to substitute
in things like hostname. Currently, the only cookies which will get
substituted here are \texttt{HOST}, \texttt{IP} and \texttt{HTMLURL}. If you think of
other useful ones, please tell me (see the tell me section) .
%------------------------------------ configfile-views.pod ---
\section{Views - your own selection of graphs on one page}%
\index{Views - your own selection of graphs on one page}
The views config-dir contains files, one per view, describing a collection
of things that you want to see on one page. There are three kinds:
\begin{itemize}
\item \textbf{simple} - you specify which graphs or customgraphs you want, using
lines like:
\begin{verbatim}%
graph hostname rrdname graphname
customgraph customgraphname\end{verbatim}
You can have as many lines as you want, and can mix graphs and customgraphs.
The order you list them in is the order they will appear in the resultant
page.
\item \textbf{template} - you specify a view template to use to generate the page.
See the docs on view template (see the view template section) for explanations.
\item \textbf{datapage} - you specify a datapage to use to generate the page.
See the docs on datapage.cgi (see the datapage.cgi section) for explanations.
\end{itemize}
You can also specify a \texttt{description} line, like:
\begin{verbatim}%
desc This is what I'm talking about\end{verbatim}
Where the pages are generated is dependent on the \texttt{viewdir} and \texttt{viewurl}
directives in the html config-file (see the html config-file section) . The view pages may
have the usual indices on them, if the html config-file (see the html config-file section)
includes:
\begin{verbatim}%
viewindices yes\end{verbatim}
but by default leave them off.
%------------------------------------ configfile-view-templates.pod ---
\section{View Templates}%
\index{View Templates}
View templates give you complete control over page layout in a view. They are
complete HTML pages with embedded magic cookies, which are substituted for
during view generation (by view-writer (see the view-writer section) ). The resulting page will
be an \texttt{rrdcgi} CGI script. The magic cookies are:
\begin{itemize}
\item $<$VIEW::GRAPH hostname rrdname graphname [graphtime]$>$ -
This inserts a graph definition for \texttt{rrdcgi}. The graphtime is from the
times config-file (see the times config-file section) , and is optional.
\item $<$VIEW::CUSTOMGRAPH customgraphname [graphtime]$>$ - This inserts a customgraph
definition for \texttt{rrdcgi}. The graphtime is from the
times config-file (see the times config-file section) , and is optional.
\item $<$VIEW::INCLUDE filename$>$ - This inserts the contents of the named file when
the view-page is displayed. (Using the \texttt{rrdcgi} cookie $<$RRD::INCLUDE filename$>$.)
\item $<$VIEW::HEADER title here$>$ - Inserts a standard remstats header.
\item $<$VIEW::FOOTER$>$ - Inserts a standard remstats footer.
\item $<$VIEW::STATUS host status-file$>$ - inserts the contents of the named status-file when
the view-page is displayed. (Using the \texttt{rrdcgi} cookie $<$RRD::INCLUDE filename$>$.)
\end{itemize}
You can also include \texttt{rrdcgi} magic cookies.
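If you want to check a template before handing it to view-writer, a short script can list the \texttt{VIEW::} cookies it contains. The Python sketch below only finds the cookies and splits their arguments; it does not expand them the way view-writer does. The function name and the sample template (including its host and graph names) are invented for the example.

```python
import re

# Matches cookies such as <VIEW::GRAPH host rrd graph day>.
COOKIE = re.compile(r"<VIEW::([A-Z]+)([^>]*)>")

def find_view_cookies(template):
    """Return (cookie-name, argument-list) pairs in template order."""
    found = []
    for match in COOKIE.finditer(template):
        found.append((match.group(1), match.group(2).split()))
    return found

# Hypothetical template, for illustration only.
page = """<VIEW::HEADER Router overview>
<VIEW::GRAPH gw.example.com ping ping day>
<VIEW::CUSTOMGRAPH backbone>
<VIEW::FOOTER>"""
for name, args in find_view_cookies(page):
    print(name, args)
```

A \texttt{GRAPH} cookie with three arguments has no graphtime, and one with four has it as the last argument.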
%------------------------------------ config-tools.pod ---
\chapter{Configuration Tools}
\index{Configuration Tools}
These tools are intended to help you build the hosts part of your
configuration file. They take a file (or files) of hostnames and
emit host config-files for them. There are currently no config generators
for the log-collector, the remoteping-collector or the unix-status-collector.
\begin{itemize}
\item split-config (see the split-config section) - converts old config-files to new config-dirs (see the config-dirs section)
\item new-config (see the new-config section) - makes a new config-dir populated by symlinks to config-base
\item new-ping-hosts (see the new-ping-hosts section) - adds hosts with a ping rrd
\item new-port-hosts (see the new-port-hosts section) - adds hosts which are collected by the port-collector (see the port-collector section)
\item new-snmp-hosts (see the new-snmp-hosts section) - adds hosts which are collected by the snmp-collector (see the snmp-collector section)
\item new-unix-hosts (see the new-unix-hosts section) - adds hosts which are running the unix-status-server (see the unix-status-server section)
\item nt-discover (see the nt-discover section) - finds and adds Windows NT hosts
\item snmp-showif (see the snmp-showif section) - shows interfaces from SNMP
\item snmp-get (see the snmp-get section) - for testing if you can get a particular OID
\end{itemize}%------------------------------------ split-config.pod ---
\section{split-config - convert a config-file to a config-dir}%
\index{split-config - convert a config-file to a config-dir}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
split-config version 1.4
usage: ../split-config [options] configfile configdir
where options are:
-d enable debugging output
-h show this help
\end{verbatim}
\subsection{Description: }%
\index{Description: }
This program converts an old-style config-file (everything in
one file) into the new-style config-dir (a directory containing files and
sub-directories). For the layout of the new config-dir, see the configuration docs.
It also deals with the other configuration changes which came with version 0.12.1:
\begin{itemize}
\item the [html (see the html section) ] group is now documented and \texttt{split-config}
will create it in the likely case that it's missing.
\item All the web-page creation stuff that was in the [general (see the general section) ]
section has been moved to the html config-file (see the html config-file section) .
\item The [groups] section has been separated out into the groups (see the groups section)
file.
\end{itemize}%------------------------------------ new-config.pod ---
\section{new-config - make a new config-dir}%
\index{new-config - make a new config-dir}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
new-config version 1.9
usage: new-config [options] new-config-dir
where options are:
-d ddd set debug level to 'ddd'
-f fff use 'fff' as config-base [/home/remstats/etc/config-base]
-h show this text
\end{verbatim}
\subsection{Description: }%
\index{Description: }
\texttt{new-config} makes a new config-dir and populates it with symlinks
to \texttt{/home/remstats/etc/config-base}. It makes the \texttt{scripts} and \texttt{rrds}
subdirectories as symlinks too. It makes the \texttt{customgraphs} and
\texttt{hosts} subdirectories as real directories. If you disagree
with which ones it chooses as symlinks, look at the top of the
program at @links and @subdirs.
You are likely to be modifying the following files, so they are
copied instead of being links:
\begin{verbatim}%
alerts alert-destination-map general html links tools\end{verbatim}
%------------------------------------ new-ping-hosts.pod ---
\section{new-ping-hosts - add ping RRD to host definition}%
\index{new-ping-hosts - add ping RRD to host definition}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
new-ping-hosts version 1.9
usage: new-ping-hosts [options] group [hostfile ...]
where options are:
-d nnn enable debugging output at level 'nnn'
-f fff use 'fff' as config-dir [/home/remstats/etc/config]
-h show this help
\end{verbatim}
\subsection{Description: }%
\index{Description: }
You supply a list of files containing hostnames, one per line.
If there is no hostfile supplied, then it will read from stdin.
If there is no host config-file for that host, \texttt{new-ping-hosts}
will write entries like:
\begin{verbatim}%
# hosts/www.example.com
desc gggg host
group gggg
ip 123.456.789.123
tools ping traceroute
rrd ping\end{verbatim}
for that host. If that host already exists, it will simply add:
\begin{verbatim}%
rrd ping\end{verbatim}
to the end of the hosts file. It doesn't check if the host already
has a ping rrd.
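The create-or-append behaviour described above can be sketched in a few lines of Python. This is not the code of \texttt{new-ping-hosts}; the function name is invented, the IP address is passed in rather than looked up, and the entries written simply mirror the example shown earlier.

```python
import os
import tempfile

def add_ping_host(hosts_dir, host, group, ip):
    """Create a host file with a ping rrd, or append 'rrd ping' to an existing one."""
    path = os.path.join(hosts_dir, host)
    if os.path.exists(path):
        # Existing host: append without checking for an earlier ping rrd.
        with open(path, "a") as fh:
            fh.write("rrd ping\n")
    else:
        with open(path, "w") as fh:
            fh.write(f"desc {group} host\n")
            fh.write(f"group {group}\n")
            fh.write(f"ip {ip}\n")
            fh.write("tools ping traceroute\n")
            fh.write("rrd ping\n")
    return path

# Running it twice on the same host shows the duplicate 'rrd ping' line.
with tempfile.TemporaryDirectory() as hosts_dir:
    add_ping_host(hosts_dir, "www.example.com", "gggg", "192.0.2.10")
    path = add_ping_host(hosts_dir, "www.example.com", "gggg", "192.0.2.10")
    with open(path) as fh:
        print(fh.read(), end="")
```

Because nothing checks for an existing ping rrd, running the tool twice over the same host list leaves a duplicate line that you have to remove by hand.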
%------------------------------------ new-port-hosts.pod ---
\section{new-port-hosts - add RRDs for services}%
\index{new-port-hosts - add RRDs for services}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
new-port-hosts version 1.9
usage: new-port-hosts [options] group [hostfile ...]
where options are:
-d enable debugging output
-f fff use 'fff' for config-dir [/home/remstats/etc/config]
-h show this help
\end{verbatim}
\subsection{Description: }%
\index{Description: }
You supply a file or files of hostnames, one per line, or let
it read from stdin.
You can use this to add RRDs to a host describing
the various services running on a server, or at least the ones that the
port-collector (see the port-collector section) knows how to talk to. It's actually a very limited
port-scanner, and will attempt to connect to each of the
services. For each one that answers, \texttt{new-port-hosts}
will write an entry for the corresponding rrd.
If the host has no file, \texttt{new-port-hosts} will add one with appropriate
header info and a ping rrd. Otherwise, it will just add the port-based
rrd's to the end of the host file.
%------------------------------------ new-snmp-hosts.pod ---
\section{new-snmp-hosts - add RRDs collected by snmp-collector}%
\index{new-snmp-hosts - add RRDs collected by snmp-collector}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
new-snmp-hosts version 1.12
usage: new-snmp-hosts [options] group community-string [hostfile ...]
where options are:
-d enable debugging output
-f fff use 'fff' for config-dir [/home/remstats/etc/config]
-h show this help
\end{verbatim}
\subsection{Description: }%
\index{Description: }
If you don't supply a file of hostnames, then it will read from stdin.
You can use \texttt{new-snmp-hosts} to add RRDs collected by the snmp-collector.
It works by looking for a single OID for each RRD. If the OID exists
on a given host (i.e. it returns data), then the RRD which uses that
data is added to the host. There is currently no way to configure it
except for modifying the code. Complain if you'd actually use such a thing.
It's up to you to discard any you don't want.
If there is no hosts file, it will create one with default
header info. If there is one, it will just append the rrds.
%------------------------------------ new-unix-hosts.pod ---
\section{new-unix-hosts - add rrds collected by the unix-status-collector}%
\index{new-unix-hosts - add rrds collected by the unix-status-collector}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
new-unix-hosts version 1.6
usage: new-unix-hosts [options] group [hostfile ...]
where options are:
-d enable debugging output
-f fff use 'fff' for config-dir [/home/remstats/etc/config]
-h show this help
-p ppp use port 'ppp' [1957]
-t ttt use 'ttt' for timeout [10]
\end{verbatim}
\subsection{Description: }%
\index{Description: }
If you don't supply a file of hostnames, then it will read from stdin.
This will add all the remstats distributed rrds collected by the
unix-status-collector (see the unix-status-collector section) , except for those using the \texttt{PS} section
of the collector.
It's up to you to discard any you don't want.
If there is no existing host file for a given host, it will create one
with default header info. If there is one, it will just append the
new rrds.
%------------------------------------ nt-discover.pod ---
\section{nt-discover - find and add new NT hosts}%
\index{nt-discover - find and add new NT hosts}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
nt-discover version 1.4
usage: ../nt-discover [options]
where options are:
-d nnn enable debugging output at level 'nnn'
-f fff use 'fff' for the config-dir [/home/remstats/etc/config]
-h show this help
-s update status files even if host data-dir found
-t ttt use 'ttt' as timeout (in seconds) [10]
\end{verbatim}
\subsection{Description: }%
\index{Description: }
Note: \texttt{Nt-discover} is not called \texttt{new-nt-hosts} because it is a different kind of
program. Instead of you providing a list of hosts to add, it finds them itself.
Using information supplied in the discovery config-file (see the discovery config-file section) , \texttt{nt-discover}
will contact a host running the nt-status-server (see the nt-status-server section) , and run three separate queries:
\begin{itemize}
\item \texttt{NET-VIEW} will give a list of hosts to check
\item \texttt{USRSTAT} will give a list of NT users (currently unused, but may be interesting)
\item \texttt{SRVINFO} (for each of the hosts found in the first step) will give some more details
on each host and write new host config-files for each new one. It will not update an existing
host config-file.
\end{itemize}%------------------------------------ snmp-showif.pod ---
\section{snmp-showif - display interfaces from SNMP}%
\index{snmp-showif - display interfaces from SNMP}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
snmp-showif version 1.5
usage: ../snmp-showif [options] host community
where options are:
-d enable debugging output
-h show this help
\end{verbatim}
\subsection{Description: }%
\index{Description: }
This utility can be used to get a list of interfaces on an SNMP queryable
host. You can use it to figure out which interfaces you want to add,
and what names remstats uses for them.
It will show ifIndex, ifDescr, ifSpeed, ifType, ifName, ifInOctets,
ifOutOctets, ifOperStatus and ifAlias (if it exists).
%------------------------------------ snmpif-description-updater.pod ---
\section{snmpif-description-updater }%
\index{snmpif-description-updater }
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
snmpif-description-updater version 1.5
usage: snmpif-description-updater [options]
where options are:
-d ddd set debugging level to 'ddd'
-f fff use 'fff' for config-dir [/home/remstats/etc/config]
-h show this help message
\end{verbatim}
\subsection{Description: }%
\index{Description: }
This script updates host config-files, for those which contain \texttt{snmpif-*}
RRDs, by fetching the ifAlias OID via SNMP and re-writing the configuration
file if any of the descriptions have changed. I'd suggest running it every
now and then, say once a day, unless you're making very frequent changes to
the descriptions.
If somebody has gear which uses an OID other than ifAlias to store the
description, then I'll have to consider making this more general, but
it'll do for now.
%------------------------------------ servers.pod ---
\chapter{Servers}
\index{Servers}
\section{Servers }%
\index{Servers }
There are currently four servers:
\begin{itemize}
\item unix-status-server (see the unix-status-server section) is queried by the unix-status-collector (see the unix-status-collector section)
for various information it obtains by running various
commonly-available programs: \texttt{df}, \texttt{vmstat}, \texttt{uptime}, \texttt{netstat},
\texttt{uname}, \texttt{ps} ...
\item log-server (see the log-server section) (queried by the log-collector (see the log-collector section) )
reads the unread portion of the specified log-file and returns the requested
statistics.
\item remoteping-server (see the remoteping-server section) is contacted by the remoteping-collector (see the remoteping-collector section) which
supplies a list of hosts. The server runs multiping (see the multiping section) against them and
returns results for each, similar to the results obtained from the
ping-collector (see the ping-collector section) .
\item nt-status-server (see the nt-status-server section) - provides access to information from Windows NT workstations and servers.
\end{itemize}%------------------------------------ log-server.pod ---
\section{log-server - providing remote access to log information}%
\index{log-server - providing remote access to log information}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
log-server version 1.8
usage: ../log-server [options] logfile ...
where options are:
-d nnn enable debugging output at level 'nnn'
-p ppp set the prefix for context-files to 'ppp' [log-server-]
-h show this help
\end{verbatim}
The log-server must be supplied with at least one log-file
to serve.
\subsection{Description: }%
\index{Description: }
The log-server is queried by the log-collector (see the log-collector section) using a
"protocol" described in the log-collector documentation.
It will provide information from any of the log-files on its
command-line, but no others. It is recommended that you use
the tcp\_wrappers or some other form of access-control to
limit access to this server. The information may or may not
be sensitive, according to which log-files you are serving,
but letting anyone query it will mean that you will lose some
data, unless you're sure that they will only query it in test
mode.
The log-server will store context for each log-file
that is served, by default in \texttt{/var/tmp/log-server-XXX},
where \texttt{XXX} is replaced by a munged version of the log-file
name. If you want this stored somewhere else, use the \texttt{-p}
switch or change the program.
\subsection{Notes: }%
\index{Notes: }
Don't forget to list all the log-files that you want to serve on
the command-line. If there are too many for your inetd, make a
tiny shell script with the \texttt{log-server} invocation and run that
from inetd.
For details on installation, you'd better look at the
server installation docs (see the server installation docs section) .
%------------------------------------ nt-status-server.pod ---
\section{nt-status-server - allow remote gathering of Windows NT data}%
\index{nt-status-server - allow remote gathering of Windows NT data}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
usage: nt-status-server [options]
where options are:
-a show all available performance counters
-d ddd enable debugging at level 'ddd' [$main::debug]
-h show this help message
-i [ppp sss] install this as an NT service, using 'ppp' as
perl and 'sss' as this script. Defaults to
perl=C:\Perl\bin\perl.exe
script=wherever-it's-invoked-from
-p ppp run server on port 'ppp' [1957]
-s run stand-alone, i.e. not as a service
-u un-install this service
N.B.: Just running this script will cause it to run as a service,
and when it stops, it will properly stop as a service.\end{verbatim}
\subsection{Description: }%
\index{Description: }
The nt-status-server allows the nt-status-collector (see the nt-status-collector section)
to get data from a remote machine running some flavour of NT and possibly
Windows 2000. It runs \texttt{SRVINFO} from the NT Resource Kit, to find the
version of NT and examines the NT performance counters for other information.
For details on installation, look at the server installation docs (see the server installation docs section) .
\subsection{Protocol: }%
\index{Protocol: }
The nt-status-collector (see the nt-status-collector section) connects to the \texttt{nt-status-server} and sends
a series of commands, ending with 'GO'. Then the server sends back the data
it obtained and closes the connection.
The commands are often the names of programs to run (in UPPERCASE) and
the ones known currently are:
\begin{itemize}
\item SRVINFO - runs SRVINFO and returns the version of NT
\item PERFCOUNTERS - examines the NT performance counters and returns
information about memory, disk, processes, ...
\item PULIST - runs PULIST (from the NT ResKit) and shows counts for all
the running processes.
\item MSDRPT - runs WINMSDP to show (currently) memory total and free.
\item USRSTAT - runs USRSTAT (from the NT ResKit) and shows when the various
users in the specified NT domain last logged in, and which domain-controller
authorized them.
\item NET-VIEW - runs "NET VIEW" to list the computers currently visible.
\item TIME - compares local and remote times
\end{itemize}
If you want to see what it returns, you can simply start it up and telnet to it.
\subsection{Installation }%
\index{Installation }
You'll have to use SRVANY to run it as a service until I figure out why the service
code doesn't work. Note that I've had to run the service under the local system account
to get it to be able to access most interesting info.
\subsection{Bugs }%
\index{Bugs }
\begin{itemize}
\item It is intended that it will eventually install itself as an NT service, and most
of the code is there, but it doesn't currently work. Patches gratefully accepted.
For now you have to invoke it with the \texttt{-s} switch to have it run stand-alone, or use
SRVANY to provide the NT service stuff (which also requires the \texttt{-s} flag).
\item Not only is it currently single-threaded (i.e. won't accept more than one connection
at a time), but if a second connection comes in the server won't answer any more requests
and will have to be re-started. I've added code to nt-status-collector (see the nt-status-collector section) so that it
won't run if there is another instance running already. This won't help if you're using
telnet to test the nt-status-server, so be prepared to restart nt-status-server if it
gets wedged because of this.
\end{itemize}%------------------------------------ remoteping-server.pod ---
\section{remoteping-server - allow remote collection of ping data}%
\index{remoteping-server - allow remote collection of ping data}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
remoteping-server version 1.5
usage: ../remoteping-server [options]
where options are:
-d nnn enable debugging output at level 'nnn'
-h show this help
\end{verbatim}
\subsection{Description: }%
\index{Description: }
The remoteping-server allows the remoteping-collector (see the remoteping-collector section) to obtain ping
data remotely. Like the ping-collector (see the ping-collector section) , it uses multiping (see the multiping section)
to get ping data, but it can be queried from a remote site.
I'm looking for volunteers; please look at
this note (see the this note section) .
%------------------------------------ unix-status-server.pod ---
\section{unix-status-server - allow remote gathering of unix data}%
\index{unix-status-server - allow remote gathering of unix data}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
unix-status-server version 1.26
usage: ../unix-status-server [options]
where options are:
-d nnn enable debugging output at level 'nnn'
-h show this help
-r include remotely-mounted file-systems
-t tst do tests 'tst', a comma-separated list of:
vmstat, df, uptime, netstat, uname, ps, proc,
ftpcount, netstat-tcpstates, fileage and qmailq
\end{verbatim}
\subsection{Description: }%
\index{Description: }
The unix-status-server allows the unix-status-collector (see the unix-status-collector section)
to get data from a remote machine running some flavour of unix. It
runs several different programs on request (uname, vmstat, df, uptime,
netstat, ps, ftpcount, qmail-qstat, and qmail-qread).
\subsection{Protocol: }%
\index{Protocol: }
The unix-status-collector (see the unix-status-collector section) connects to the \texttt{unix-status-server} and sends
a series of commands, ending with 'GO'. Then the server sends back the data
it obtained by running the requested programs and closes the connection.
The commands are usually the names of programs to run (in UPPERCASE) and
the ones known currently are:
\begin{itemize}
\item UNAME runs uname and returns:
machine, os\_name, os\_release, os\_version
\item VMSTAT runs vmstat and returns variables relating to memory usage
depending on the operating system
\item DF runs df and for each file-system returns:
dfsize:FSNAME, dfused:FSNAME, dfpercent:FSNAME
inodessize:FSNAME, inodesused:FSNAME, inodespercent:FSNAME
\item UPTIME runs uptime and returns:
uptime (a timestamp in seconds), users, load1, load5, load15
\item NETSTAT runs netstat and, for each interface, returns:
interface\_packets\_in:IFNAME, interface\_errors\_in:IFNAME,
interface\_packets\_out:IFNAME, interface\_errors\_out:IFNAME,
interface\_collisions:IFNAME
\item PS runs ps and returns various numbers pulled out of the output
(see below)
\item FTPCOUNT runs ftpcount (from wuftpd) to find out which groups the
ftp-server's users fall into
\item QMAILQ runs qmail-qstat and qmail-qread and returns:
qmail\_qsize, qmail\_qbacklog, qmail\_qlocal, qmail\_qsite,
qmail\_qremote
\item FILEAGE returns timestamps for the ages of specified files
(see below)
\item TIME returns the difference in time-stamps between the host running the unix-status-server
and the querying host. It must be given the querying host's timestamp following
the \texttt{TIME} directive. The two variables returned are time and timediff.
\end{itemize}
If you want to see what it returns, you can simply invoke the \texttt{unix-status-server}
as a local script and type commands at it.
\subsection{Programs: }%
\index{Programs: }
\begin{itemize}
\item \texttt{/usr/local/bin/uname} or \texttt{/usr/bin/uname}
It's cosmetic, for web-page header info, but sometimes it's
really useful too.
\item \texttt{/usr/bin/vmstat} or \texttt{/usr/ucb/vmstat}
for scanrate, interrupts, context-switches and cpu-time, and
of course free memory and swap
\item \texttt{/usr/local/bin/df} or \texttt{/usr/xpg4/bin/df} or \texttt{/bin/df}
the gnu df. I need the -P flag (for Posix, but it makes df put
its info on one line), and the -i flag for inode info.
\item \texttt{/usr/local/bin/uptime} or \texttt{/usr/bin/uptime} or
\texttt{/usr/ucb/uptime}
the gnu uptime has a format I know how to parse. Others sometimes
invent new ways to be cute about the display that I don't always
recognise.
\item \texttt{/usr/bin/netstat} or \texttt{/usr/ucb/netstat}
to get network interface info
\item \texttt{/usr/bin/ps} or \texttt{/bin/ps}
for counting the number of running instances of a named process.
\item \texttt{/usr/local/bin/ftpcount} (part of wu-ftpd distribution) shows the number of ftp clients of wu-ftpd
from each access group.
\item \texttt{/var/qmail/bin/qmail-qstat} and \texttt{/var/qmail/bin/qmail-qread}
If you have qmail (see \textbf{http://www.qmail.org/}), these will
let you get information about the queue size, which you can't
find from the logs. Otherwise, ignore them.
\end{itemize}
\subsection{PS Usage:}%
\index{PS Usage:}
With extended commands, of which \texttt{PS} is the first, you also specify
what you want to look for with extra commands, in addition to the \texttt{PS}
command. A command looks like:
\begin{verbatim}%
varname PS func pattern\end{verbatim}
The \texttt{varname} is used to create a variable-name for the returned data.
The name will be \texttt{ps:varname}. \texttt{Func} is one of \texttt{count}, \texttt{sum},
\texttt{last}, \texttt{average}, \texttt{min}, \texttt{max}. \texttt{Pattern} is a perl-style
regular-expression, the simplest form of which is just a string.
For an example, if we wanted to know how many web-servers were running
over time, we might use (very sloppily):
\begin{verbatim}%
webservers PS count httpd\end{verbatim}
[You probably want a better regular expression.]
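To see how a whole request might be put together, the Python sketch below builds the text a client would send: some plain commands, the extended \texttt{PS} lines, and the closing \texttt{GO}. The ordering of the lines and the newline terminator are assumptions on my part, as is the function name; only the shape of the individual lines comes from the description above.

```python
def build_request(commands, ps_checks):
    """Build request text: plain commands, PS extended lines, then GO."""
    lines = list(commands)
    for varname, func, pattern in ps_checks:
        # Extended command form: varname PS func pattern
        lines.append(f"{varname} PS {func} {pattern}")
    lines.append("GO")
    return "\n".join(lines) + "\n"

request = build_request(["UNAME", "UPTIME", "PS"],
                        [("webservers", "count", "httpd")])
print(request, end="")
```

You could paste the printed text into a locally-invoked \texttt{unix-status-server} to see what comes back.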
\subsection{FILEAGE Usage:}%
\index{FILEAGE Usage:}
For the \texttt{FILEAGE} command, you have to specify an extended command
that looks like:
\begin{verbatim}%
varname FILEAGE /path/to/file\end{verbatim}
This will produce a timestamp for the last-modification time of \texttt{/path/to/file}.
\subsection{Notes: }%
\index{Notes: }
With older versions of \texttt{vmstat}
(ones that mash fields together), it will give up on \texttt{vmstat} and not return
memory and CPU info. It also requires a version of \texttt{df} that will accept the
\texttt{-P} and \texttt{-i} flags. The \texttt{-P} flag forces the output
for a file-system to stay on one line (easier for me to parse) and the
\texttt{-i} returns info about inodes. If the \texttt{-i} flag is missing,
you won't get any inode data. You also won't get any inode data if
the file-system doesn't have inodes. (Duh :-).
For details on installation, look at the server installation docs (see the server installation docs section) .
%------------------------------------ collectors.pod ---
\chapter{Collectors}
\index{Collectors}
\section{Collectors Data Format}%
\index{Collectors Data Format}
All the collectors produce data on stdout in the same standard form:
\begin{verbatim}%
host timestamp variable value\end{verbatim}
If the \texttt{variable} is for something like network interfaces,
where the host can have several of them, the data will look like:
\begin{verbatim}%
host timestamp variable:instance value\end{verbatim}
Having all the collectors using a standard form permits a single
updating program, updater (see the updater section) , to process the data from them all, and
also means that I can write a new collector without needing to change
the updater.
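For anyone writing tools that sit between a collector and the updater, the standard form is easy to take apart. The following Python sketch splits one line into its fields; the \texttt{Sample} tuple and the function name are made up for the example, and it assumes the timestamp is an integer number of seconds and that the value is everything after the third field.

```python
from collections import namedtuple

Sample = namedtuple("Sample", "host timestamp variable instance value")

def parse_collector_line(line):
    """Split 'host timestamp variable[:instance] value' into its fields."""
    host, timestamp, variable, value = line.split(None, 3)
    # Only the first ':' separates the variable from its instance.
    variable, _, instance = variable.partition(":")
    return Sample(host, int(timestamp), variable, instance or None, value.strip())

print(parse_collector_line("gw.example.com 1000000000 interface_packets_in:eth0 12345"))
print(parse_collector_line("www.example.com 1000000000 value1 1022"))
```

The value is left as a string, since not every collected item is numeric.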
\section{Remstats supplied collectors}%
\index{Remstats supplied collectors}
\begin{itemize}
\item log-collector (see the log-collector section) - gets info from remote
log-files
\item ping-collector (see the ping-collector section) - pings hosts
\item port-collector (see the port-collector section) - checks on remote services
\item remoteping-collector (see the remoteping-collector section) - pings hosts from somewhere else
\item unix-status-collector (see the unix-status-collector section) - gets info from unix hosts
\item snmp-collector (see the snmp-collector section) - gets info via SNMP
\item snmp-route-collector (see the snmp-route-collector section) - counts routes available from BGP peers
\end{itemize}
The usual invocation of a collector (via run-remstats (see the run-remstats section) ) is:
\begin{verbatim}%
xxx-collector | updater xxx\end{verbatim}
\section{How to write your own collector}%
\index{How to write your own collector}
There are a few requirements:
\begin{enumerate}
\item it must write its results to stdout in
standard form (see the top of this file.)
\item it must be placed in the same directory with the rest of
the collectors, and must be called XXX-collector, replacing "XXX"
with whatever it's collecting.
\item it must take (or at least ignore), the same arguments that
the other collectors do, specifically, the \texttt{-f}, \texttt{-u} and \texttt{-F} flags,
and the \texttt{-G} and \texttt{-H} flags, if I ever get around to implementing
them (see todo (see the todo section) ).
\item you must add it to the list of collectors (the \texttt{collectors}
line in the general config-file (see the general config-file section) ).
\item you must define rrd(s) specifying "source XXX" to use the data
from this collector.
\item you must add "rrd YYY" to the appropriate
host config-files (see the host config-files section) .
\end{enumerate}
There is a \texttt{skeleton-collector.pl} file supplied with the
distribution, which does everything except collect data. You should
be able to plug your code into its \texttt{collect\_host} routine and have a
collector, if you don't mind writing in perl.
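If you would rather not write in perl, the first and third requirements can be met in any language. The Python sketch below is a hypothetical stand-in, not a port of \texttt{skeleton-collector.pl}: it accepts the \texttt{-d}, \texttt{-f}, \texttt{-u} and \texttt{-F} flags, quietly ignores anything else, and produces lines in the standard form. The \texttt{collect\_host} body and the host list are placeholders; a real collector would take its hosts from the config-dir.

```python
import argparse
import time

def collect_host(host):
    """Hypothetical stand-in for real collection; returns {variable: value}."""
    return {"answer": 42}

def collect_lines(argv, hosts):
    """Parse the usual collector flags and return standard-form lines."""
    parser = argparse.ArgumentParser(prog="example-collector")
    parser.add_argument("-d", metavar="nnn", type=int, default=0)  # debug level
    parser.add_argument("-f", metavar="fff")                       # config-dir
    parser.add_argument("-F", action="store_true")                 # force collection
    parser.add_argument("-u", action="store_true")                 # ignore uphosts file
    # Flags this sketch does not know about are ignored, not rejected.
    args, _ignored = parser.parse_known_args(argv)
    now = int(time.time())
    lines = []
    for host in hosts:
        for variable, value in collect_host(host).items():
            lines.append(f"{host} {now} {variable} {value}")
    return lines

for line in collect_lines(["-F"], ["www.example.com"]):
    print(line)
```

The remaining requirements (naming, placement, the \texttt{collectors} line and the rrd definitions) are configuration matters and are not covered by the sketch.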
%------------------------------------ log-collector.pod ---
\section{log-collector - get stats from remote log-files}%
\index{log-collector - get stats from remote log-files}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
log-collector version 1.12
usage: ../log-collector [options]
where options are:
-d nnn enable debugging output at level 'nnn'
-f fff use config-dir 'fff'[/home/remstats/etc/config]
-F force collection, even if it's not time
-h show this help
-H HHH only do hosts from 'HHH', a comma-separated list
-p ppp contact log-server on port 'ppp' [1958]
-t ttt timeout each port attempt after 'ttt' seconds [10]
-u ignore uphosts file
\end{verbatim}
\subsection{Description: }%
\index{Description: }
The log-collector gets data from remote instances of the log-server (see the log-server section) . This way the
whole log-file doesn't have to be transferred. The "protocol",
if it deserves that name, is very simple. The collector sends a request,
which looks like (you can type it in via telnet):
\begin{verbatim}%
LOGFILE /wherever/the/logfile.is
varname function pattern
...
GO\end{verbatim}
The directives are all in \textbf{UPPERCASE}. They are \texttt{LOGFILE},
\texttt{GO}, \texttt{DEBUG} and \texttt{TEST}. The \texttt{LOGFILE} directive
tells the \texttt{log-server} which file to read. The \texttt{GO}
directive starts the request. \texttt{DEBUG}
causes some extra remote debugging output, and \texttt{TEST} makes the
\texttt{log-server} operate in test mode. In test mode it doesn't update the
last-read position for that log-file, so you won't lose any data when testing.
The other lines are telling the \texttt{log-server} what data to collect.
The first "word" is the variable name to be returned. The next is the function
to be applied (from \texttt{count}, \texttt{sum}, \texttt{average}, \texttt{min}, \texttt{max},
\texttt{first} and \texttt{last}). The rest of the line is a perl-style regex.
Except for the count function, the regex must contain a (parenthesized)
number, to which the function will be applied.
For example, the line:
\begin{verbatim}%
rootlogins count ROOT LOGIN\end{verbatim}
would return data for a variable called rootlogins. The value would be the count
of the records in the specified logfile which had the string 'ROOT LOGIN' in them.
The pattern can be much more complicated, for example (from the httpdlog rrd):
\begin{verbatim}%
bytes sum \sHTTP/\d\.\d"\s+2\d\d\s+(\d+)\end{verbatim}
This looks through a standard web-server log-file and extracts the bytes transferred
and adds them up to produce the total number of bytes transferred in that sample
period.
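To make the meaning of the functions concrete, the following Python sketch
applies one of them to a list of log lines. It is only an illustration of
the description above, not the \texttt{log-server} code: the name
\texttt{apply\_function} is hypothetical, returning nothing when no record
matches is an assumption, and Python regexes are used in place of perl
ones (the patterns shown in this section behave the same in both).

\begin{verbatim}
import re


def apply_function(function, pattern, lines):
    """Apply one of the documented functions to the lines matching pattern."""
    regex = re.compile(pattern)
    if function == "count":
        # count needs no parenthesized number, only a match.
        return sum(1 for line in lines if regex.search(line))
    # Every other function works on the first parenthesized number.
    values = [float(m.group(1))
              for m in (regex.search(line) for line in lines) if m]
    if not values:
        return None  # assumption: no matching records means no value
    reducers = {
        "sum": sum,
        "average": lambda v: sum(v) / len(v),
        "min": min,
        "max": max,
        "first": lambda v: v[0],
        "last": lambda v: v[-1],
    }
    return reducers[function](values)
\end{verbatim}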
\subsection{How to make RRDs that use the log-collector}%
\index{How to make RRDs that use the log-collector}
It's easiest to explain by example. Look at the beginning of the rrd \texttt{httpdlog},
copied here:
\begin{verbatim}%
source log
step 300
data requests GAUGE:600:0:U count (GET|POST)
data success GAUGE:600:0:U count \sHTTP/\d\.\d"\s+2\d\d
data bytes GAUGE:600:0:U sum \sHTTP/\d\.\d"\s+2\d\d\s+(\d+)\end{verbatim}
To form the requests to be sent to the log-server, the log-collector takes the
DS name, e.g. \texttt{success}, and the rest of the line after the DS definition,
\texttt{count $\backslash$sHTTP/$\backslash$d$\backslash$.$\backslash$d"$\backslash$s+2$\backslash$d$\backslash$d}, combines the two and sends:
\begin{verbatim}%
success count \sHTTP/\d\.\d"\s+2\d\d\end{verbatim}
Note that the pattern can include magic cookies (see the magic cookies section) as of remstats version 0.12.2.
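The transformation from \texttt{data} lines to a request can be sketched in
a few lines of Python. This is an illustration only: the function name
\texttt{build\_log\_request} is hypothetical, and ending each line with a
plain newline is an assumption about the protocol, not something stated
here.

\begin{verbatim}
def build_log_request(logfile, data_lines):
    """Turn rrd 'data' lines into the text of a log-server request."""
    request = ["LOGFILE " + logfile]
    for line in data_lines:
        # Fields: 'data', DS name, DS definition, then function and pattern.
        _keyword, ds_name, _ds_def, rest = line.split(None, 3)
        request.append(ds_name + " " + rest)
    request.append("GO")
    return "\n".join(request) + "\n"
\end{verbatim}

Splitting on the first three runs of whitespace keeps any spaces inside the
pattern intact.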
%------------------------------------ nt-status-collector.pod ---
\section{nt-status-collector - stats from Windows NT hosts}%
\index{nt-status-collector - stats from Windows NT hosts}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
nt-status-collector version 1.25
usage: ../nt-status-collector [options]
where options are:
-d nnn enable debugging output at level 'nnn'
-f fff use 'fff' for config-dir [/home/remstats/etc/config]
-F force collection, even if it's not time
-h show this help
-H HHH only try hosts from 'HHH', a comma-separated list
-p ppp connect to server on port 'ppp' [1957]
-t ttt set timeout to 'ttt' [25]
-u ignore uphosts file
\end{verbatim}
\subsection{Description: }%
\index{Description: }
The \texttt{nt-status-collector} gets its data from the nt-status-server (see the nt-status-server section) .
It sends a query consisting mostly of the sections of the server
that it wants to run. It can also request that processes with
specific names be counted and that information returned too.
A query for all the sections might look like:
\begin{verbatim}%
SRVINFO
PERFCOUNTERS
PULIST
MSDRPT
USRSTAT ntdomain
NET-VIEW
GO\end{verbatim}
Note that \texttt{PULIST}, \texttt{MSDRPT}, \texttt{USRSTAT} and \texttt{NET-VIEW} aren't currently
used by anything, but they may be useful for something. Also, USRSTAT wants
an NT domain-name with the query.
%------------------------------------ ping-collector.pod ---
\section{ping-collector - get reachability of hosts}%
\index{ping-collector - get reachability of hosts}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
ping-collector version 1.16
usage: ../ping-collector [options]
where options are:
-d nnn enable debugging output at level 'nnn'
-f fff use 'fff' for config-dir [/home/remstats/etc/config]
-F force collection even if it's not time
-h show this help
-H HHH only try hosts from 'HHH', a comma-separated list
-u for compatibility with run-remstats; ignored
\end{verbatim}
\subsection{Description: }%
\index{Description: }
ping-collector uses multiping (see the multiping section) to get numbers on how reachable the
hosts are. Each host is sent 10 pings (ICMP echo-request) and the
number of responses and the min, max and average RTT (Round-Trip
Time) are logged, giving the variables ping-sent, ping-rcvd,
pingrtt-min, pingrtt-avg and pingrtt-max.
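The five variables can be derived from the per-ping results as in this
Python sketch. It is not how \texttt{ping-collector} or multiping is
written; the function name \texttt{summarize\_pings} and the input
representation (one RTT per ping, with \texttt{None} for a lost ping) are
assumptions made for the example.

\begin{verbatim}
def summarize_pings(rtts):
    """rtts: one entry per ping sent; None marks a ping with no reply."""
    replies = [r for r in rtts if r is not None]
    stats = {"ping-sent": len(rtts), "ping-rcvd": len(replies)}
    if replies:
        # RTT figures only exist when at least one reply came back.
        stats["pingrtt-min"] = min(replies)
        stats["pingrtt-avg"] = sum(replies) / len(replies)
        stats["pingrtt-max"] = max(replies)
    return stats
\end{verbatim}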
%------------------------------------ multiping.pod ---
\section{multiping }%
\index{multiping }
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
usage: ../multiping/multiping [-Rdfnqrtv] [-c count] [-i wait] [-l preload]
[-p pattern] [-s packetsize] host
Options:
R: ICMP record route
d: set SO_DEBUG socket option
f: 'flood' mode
n: force addresses to be displayed in numeric format
r: set SO_DONTROUTE socket option
t: show results in tabular form
v: verbose mode for ICMP stuff
\end{verbatim}
\subsection{Description: }%
\index{Description: }
Multiping runs pings in parallel, which permits you to ping lots of hosts
quickly. I wrote a program using Net::Ping to see how bad a simple-minded
approach was. With .025s between pings in my program and 1s in multiping,
my program took 17 minutes and multiping took 37 seconds. Maybe someone
will not be able to get multiping going and will re-write it in perl. Not
me. I'd be happy to include it if someone wants to send me a copy.
%------------------------------------ port-collector.pod ---
\section{port-collector - get service status}%
\index{port-collector - get service status}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
port-collector version 1.15
usage: ../port-collector [options]
where options are:
-d nnn enable debugging output at level 'nnn'
-f fff use 'fff' for config-dir [/home/remstats/etc/config]
-F force collection even if it's not time
-h show this help
-H HHH only try hosts from 'HHH', a comma-separated list
-t ttt set default timeout to 'ttt' [5]
-u ignore uphosts file
\end{verbatim}
\subsection{Description: }%
\index{Description: }
The \texttt{port-collector} gets data about services running on
specified TCP ports. It will attempt to connect to the specified
port on the host, and optionally send a string to it. It will
then examine the response, if any, for certain patterns. This
permits it to query almost any text-based protocol. For information
on how to set up a new service, look in the
scripts (see the scripts section) directory in the configuration directory.
The rrd specification can also override the port specified by the script,
by appending a colon and the port-number. For example, for a web-server
on port 8000, you could specify the rrd as:
\begin{verbatim}%
rrd port-http:8000\end{verbatim}
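Separating the rrd name from an optional port override is a one-line split,
sketched here in Python. The function name \texttt{split\_port\_rrd} is
hypothetical and this is not the \texttt{port-collector} code.

\begin{verbatim}
def split_port_rrd(rrd):
    """Split 'port-http:8000' into ('port-http', 8000)."""
    name, sep, port = rrd.partition(":")
    # Without a colon there is no override; the script's port applies.
    return name, (int(port) if sep else None)
\end{verbatim}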
According to whether it can connect to the service and what response
it gets back, it sets a status to one of OK, WARN, ERROR, CRITICAL.
These are arbitrary levels, except that OK means normal, and their
meanings are determined by the configuration file. The
\texttt{port-collector} will also log how long it took the service to
respond. These numbers are not intended for benchmarking, but only
for determining the health of the service.
\subsection{Returned Data}%
\index{Returned Data}
The variables returned by the port-collector are:
\begin{verbatim}%
port-PORTNAME - containing the status of the port, calculated by
the script for this port
port-PORTNAME-response - containing the response-time for the query\end{verbatim}
\subsection{Other data from the port-collector}%
\index{Other data from the port-collector}
The main RRD for the port-collector is \texttt{port-*}, but it is possible to
define other RRDs. If you want to collect information from the results
elicited by the send string, you can provide either or both of the
\texttt{infopattern} or \texttt{valuepattern} in the script associated with the RRD.
The script must be named the same as the rrd. Look in the
scripts configfile docs (see the scripts configfile docs section) for details on scripts.
A matching \texttt{valuepattern} will cause the port-collector to return
variables named \texttt{RRDNAME:value\#} with "\texttt{\#}" replaced by a single digit,
corresponding to the number of the parenthesized part of the pattern that
was matched.
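The naming rule for \texttt{valuepattern} results can be illustrated with
a short Python sketch. The function name \texttt{value\_variables} and the
sample response in the usage note are made up for the example, and reading
"a single digit" as a limit of nine groups is an assumption.

\begin{verbatim}
import re


def value_variables(rrdname, valuepattern, response):
    """Name each parenthesized match RRDNAME:value1, RRDNAME:value2, ..."""
    match = re.search(valuepattern, response)
    if match is None:
        return {}
    # A single digit is assumed to limit this to groups 1 through 9.
    return {"%s:value%d" % (rrdname, number): text
            for number, text in enumerate(match.groups()[:9], start=1)}
\end{verbatim}

For a hypothetical rrd \texttt{port-queue} whose service answers
\texttt{12 messages, 3456 bytes}, a pattern with two parenthesized numbers
would give \texttt{port-queue:value1} and \texttt{port-queue:value2}.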
A matching \texttt{infopattern} will cause the port-collector to create status
files for this host called INFO1-RRDNAME. None of the existing pagemakers (see the pagemakers section)
use these status files, but the view-writer (see the view-writer section) could do so if a
view template (see the view template section) referred to them.
%------------------------------------ remoteping-collector.pod ---
\section{remoteping-collector - reachability from other sites}%
\index{remoteping-collector - reachability from other sites}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
remoteping-collector version 1.10
usage: ../remoteping-collector [options]
where options are:
-d nnn enable debugging output at level 'nnn'
-f fff use 'fff' for config-dir [/home/remstats/etc/config]
-h show this help
-p ppp use port 'ppp' instead of the default [1959]
-t ttt use 'ttt' for timeout [60]
-u for run-remstats compatibility
\end{verbatim}
\subsection{Description: }%
\index{Description: }
The \texttt{remoteping-collector} is intended to gather ping statistics
(see ping-collector (see the ping-collector section) ) from remote sites. It works by contacting
instances of the remoteping-server (see the remoteping-server section) running on other machines. This way it will be
possible to monitor the same list of hosts from multiple points
on the network and determine several things:
\begin{itemize}
\item general health of parts of the network of interest to
the co-operating parties, and
\item if certain parts of the network are performing better or
worse than others.
\end{itemize}
\subsection{Note: }%
\index{Note: }
Note the use of the future tense in the previous paragraph. I'm
looking for volunteers to run the remoteping-server (see the remoteping-server section) and let me
have access to it. In return, I'll let you look at the
stats that this process gathers. I'm planning to monitor ISPs and
other sites across Canada, as well as commonly accessed sites around
the world, to determine how we're doing in network
performance, here in Canada. If you want to volunteer, please hit
the mailto URL below and ignore the bounce from
my spam protection (see \textbf{http://silverlock.dgim.crc.ca/\~{}terskine/qmail/tms.html}).
%------------------------------------ snmp-collector.pod ---
\section{snmp-collector - get data via SNMP}%
\index{snmp-collector - get data via SNMP}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
snmp-collector version 1.15
usage: snmp-collector [options]
where options are:
-c ccc use 'ccc' for the read community string; overrides host
-d nnn enable debugging output at level 'nnn'
-f fff use 'fff' for config-dir [/home/remstats/etc/config]
-F force collection even if it's not time
-h show this help
-H HHH only try hosts from 'HHH', a comma-separated list
-u ignore uphosts file
\end{verbatim}
\subsection{Description: }%
\index{Description: }
The snmp-collector collects data available via SNMP. There are
some things that are hardcoded in, but it's mostly configurable.
It will attempt to query for the following, if available:
\begin{itemize}
\item \textbf{sysDescr} - tells what kind of a device this is
\item \textbf{sysUptime} - how long it has been up
\end{itemize}
and use them in the host index page.
The \texttt{snmpif-*} rrd is wired into the snmp-collector (for now) and
causes it to fetch the following for each interface:
\begin{itemize}
\item \textbf{ifType} - interface type
\item \textbf{ifOperStatus} - operational status
\item \textbf{ifSpeed} - interface speed
\item \textbf{ifInErrors} - input errors
\item \textbf{ifOutErrors} - output errors
\item \textbf{ifInOctets} - input octets (aka bytes)
\item \textbf{ifOutOctets} - output octets (aka bytes)
\item \textbf{ifInUcastPkts} - input unicast packets
\item \textbf{ifOutUcastPkts} - output unicast packets
\item \textbf{ifInNUcastPkts} - input non-unicast
(broadcast and multicast) packets
\item \textbf{ifOutNUcastPkts} - output non-unicast
(broadcast and multicast) packets
\end{itemize}
The sysDescr and sysUptime are saved for the host display and the
ifType and ifSpeed are combined to give a hardware description
for the interface. Crude, but portable.
For other SNMP data, you'll need to look at the
[oids] (see the [oids] section) file in the
configuration directory. The rrd will need to contain \texttt{oid}
lines specifying names assigned in the oids section (see the oids section) .
If the host doesn't have one, the rrd will also need to specify
a \texttt{community}. Here's an example:
\begin{verbatim}%
[rrd snmpmem]
source snmp
step 300
data freemem=ciscofreemem GAUGE:600:0:U
data totalmem=ciscototalmem GAUGE:600:0:U
archives day-avg week-avg month-avg year-avg
times day yesterday week month year
oid CiscoFreeMem
oid CiscoTotalMem\end{verbatim}
This rrd definition will fetch the amount of free memory and total memory
available on a Cisco router. Since it's querying a Cisco-specific MIB,
it's not useful on other gear.
%------------------------------------ snmp-route-collector.pod ---
\section{snmp-route-collector }%
\index{snmp-route-collector }
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
snmp-route-collector version 1.12
usage: snmp-route-collector [options]
where options are:
-c ccc use 'ccc' for the read community string; overrides host
-d nnn enable debugging output at level 'nnn'
-f fff use 'fff' for config-dir [/home/remstats/etc/config]
-F force collection even if it's not time
-h show this help
-H HHH only try hosts from 'HHH', a comma-separated list
-u ignore uphosts file
\end{verbatim}
\subsection{Description: }%
\index{Description: }
This is a specialized form of SNMP collector which walks a part of the BGP4
MIB, specifically \texttt{bgp4PathAttrBest}, to count routes available from a given
BGP peer host. It notes both the total number of routes and how many of them
are the best route to that destination.
Unfortunately, it doesn't scale. On routers with large numbers of peers
it can take a \textbf{long} time to troll through all the routes. This needs to
be replaced.
%------------------------------------ unix-status-collector.pod ---
\section{unix-status-collector - stats from unix hosts}%
\index{unix-status-collector - stats from unix hosts}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
unix-status-collector version 1.16
usage: unix-status-collector [options]
where options are:
-d nnn enable debugging output at level 'nnn'
-f fff use 'fff' for config-dir [/home/remstats/etc/config]
-F force collection even it it's not time yet
-h show this help
-H HHH only try hosts from 'HHH', a comma-separated list
-p ppp connect to server on port 'ppp' [1957]
-t ttt set timeout to 'ttt' [10]
-u ignore uphosts file
\end{verbatim}
\subsection{Description: }%
\index{Description: }
The unix-status-collector gets its data from the unix-status-server (see the unix-status-server section) .
For a detailed explanation of what the various directives mean,
see its documentation, as it's the implementor.
It sends a query consisting mostly of the sections of the server
that it wants to run. It can also request that processes with
specific names be counted and that information returned too.
A query for all the sections might look like:
\begin{verbatim}%
UNAME
UPTIME
TIME 986485967
VMSTAT
DF
NETSTAT
QMAILQ
PS
webservers ps count httpd
FILEAGE
test fileage /var/spool/locks/lockfile
PROC
swaptot proc /proc/meminfo ^SwapTotal:\s+(\d+)
GO\end{verbatim}
The \texttt{webservers ps count httpd} line requests that the ps section
count the number of processes called \texttt{httpd} and return
that as a variable called \texttt{webservers}.
The \texttt{test fileage /var/spool/locks/lockfile} line requests the last
modification time of the file \texttt{/var/spool/locks/lockfile}, which is
returned in seconds.
The \texttt{swaptot ...} line looks for the total swap size in \texttt{/proc/meminfo}.
The best way to see what it will produce is to run it manually.
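To show how a \texttt{proc} line breaks down, here is a Python sketch that
splits such a line and applies its pattern to the text of the named file.
It is an illustration of the query syntax only; the real work is done by
the unix-status-server, and the function name \texttt{answer\_proc\_line}
and the returned (name, value) pair are assumptions, not the server's
actual output format.

\begin{verbatim}
import re


def answer_proc_line(request_line, file_text):
    """Handle a line such as 'swaptot proc /proc/meminfo PATTERN'."""
    varname, section, _path, pattern = request_line.split(None, 3)
    if section != "proc":
        raise ValueError("not a proc line: " + request_line)
    for line in file_text.splitlines():
        match = re.search(pattern, line)
        if match:
            # The parenthesized part of the pattern is the value.
            return varname, match.group(1)
    return varname, None
\end{verbatim}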
%------------------------------------ updater.pod ---
\chapter{updater}
\index{updater}
\section{updater - add new data to RRDs}%
\index{updater - add new data to RRDs}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
updater version 1.9
usage: updater [options] collector
where options are:
-d nnn enable debugging output at level 'nnn'
-f fff use 'fff' for config-dir [/home/remstats/etc/config]
-h show this help
\end{verbatim}
\subsection{Description: }%
\index{Description: }
It reads collector output in standard form from stdin and updates the
appropriate RRDs. It wants to know which collector the information came
from to avoid looking for information that won't be there.
%------------------------------------ monitors.pod ---
\chapter{Monitors}
\index{Monitors}
\section{Monitors }%
\index{Monitors }
Currently, there are three monitors:
\begin{itemize}
\item ping-monitor (see the ping-monitor section) - determines reachability of hosts
\item alert-monitor (see the alert-monitor section) - figures out status of various values
specified in the rrds (see the rrds section) and hosts (see the hosts section) config-files
\item topology-monitor (see the topology-monitor section) - to analyze changing routes to your monitored hosts
\end{itemize}%------------------------------------ alert-monitor.pod ---
\section{alert-monitor - a status evaluator and alert trigger}%
\index{alert-monitor - a status evaluator and alert trigger}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
alert-monitor version 1.20
usage: ../alert-monitor [options]
where options are:
-d nnn enable debugging output at level 'nnn'
-f fff use config-dir 'fff'[/home/remstats/etc/config]
-h show this help
-s sss search 'sss' data samples for values [5]
-u generate alerts for hosts unreachable through a down host
\end{verbatim}
\subsection{Description: }%
\index{Description: }
The \texttt{alert-monitor} compares the current value of variables specified in
the alerts file (see the alerts file section) in the configuration directory with
threshold values and sets the status of those variables accordingly.
It saves the current status of variables in \texttt{/home/remstats/data/ALERTS}.
What value corresponds to what status level is set in the
rrd definition (see the rrd definition section) or sometimes the
host definition (see the host definition section) . This way an rrd
definition will specify generally reasonable levels, but they can be
overridden for hosts where they aren't reasonable.
For an rrd definition, an alert line looks like:
\begin{verbatim}%
alert varname relation oklevel [warnlevel [errorlevel]]\end{verbatim}
or
\begin{verbatim}%
alert varname nodata status\end{verbatim}
[The latter says that missing data for variable \texttt{varname} will cause its status
to be level \texttt{status}.]
For a host-specified alert level, the line looks like:
\begin{verbatim}%
alert rrdname varname relation oklevel [warnlevel [errorlevel]]\end{verbatim}
or
\begin{verbatim}%
alert rrdname varname nodata status\end{verbatim}
and the interpretation is the same, except that you're having to say
which rrd this alert refers to.
The available relations are:
\begin{verbatim}%
< (value is less than threshold)
> (value is greater than threshold)
= (value is equal to threshold)
|< (absolute value of value is less than threshold)
|> (absolute value of value is greater than threshold)
delta< (difference between last two values is less than threshold)
delta> (difference between last two values is greater than threshold)
>daystddev (value is outside threshold * the past day's standard-deviation)
>weekstddev (value is outside threshold * the past week's standard-deviation)
>monthstddev (value is outside threshold * the past month's standard-deviation)\end{verbatim}
\subsection{Example }%
\index{Example }
To make things more concrete for the first (normal) case, here's a real example,
from the \texttt{load} rrd supplied in \texttt{config-base}:
\begin{verbatim}%
alert load5 < 3 7 10\end{verbatim}
This means that if the \texttt{load5} variable is less than 3, the status is set to OK.
If it's less than 7, it's WARN, less than 10 it's ERROR and more than that, it's
CRITICAL.
Since the first match is taken, it's possible to leave out the upper levels if
you don't want them to occur. For example if you only wanted \texttt{load5} to ever
go to WARN level, never above, you could use:
\begin{verbatim}%
alert load5 < 3\end{verbatim}
and then the only possible status levels are OK and WARN.
The possible \texttt{relation}s are: $<$, =, $>$, |$<$, |$>$, delta$<$, delta$>$. The first
three should be obvious. The next two allow comparisons to the absolute value of
the variable's current value. The last two allow comparisons to the change in
value.
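The first-match rule can be sketched in Python as follows. This is an
illustration of the behaviour described above, not the
\texttt{alert-monitor} code; the names \texttt{STATUSES},
\texttt{RELATIONS} and \texttt{alert\_status} are hypothetical, and only
the five relations that need nothing but the current value are covered.

\begin{verbatim}
STATUSES = ["OK", "WARN", "ERROR", "CRITICAL"]

RELATIONS = {
    "<": lambda value, threshold: value < threshold,
    ">": lambda value, threshold: value > threshold,
    "=": lambda value, threshold: value == threshold,
    "|<": lambda value, threshold: abs(value) < threshold,
    "|>": lambda value, threshold: abs(value) > threshold,
}


def alert_status(value, relation, thresholds):
    """Return the status for the first threshold that the value satisfies."""
    test = RELATIONS[relation]
    for status, threshold in zip(STATUSES, thresholds):
        if test(value, threshold):
            return status
    # No threshold matched: one level beyond the last one given.
    return STATUSES[len(thresholds)]
\end{verbatim}

With the thresholds 3, 7 and 10 from the \texttt{load5} example, a value
of 5 gives WARN; with the single threshold 3, the same value also gives
WARN and nothing higher is possible.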
\subsection{Causing alerts}%
\index{Causing alerts}
Depending on the lines in the alerts file (see the alerts file section) , the status may also
trigger alerts. A matching line in the alerts config-file (see the alerts config-file section) will cause
\texttt{alert-monitor} to run the alerter (see the alerter section) for each of the specified
recipients. It will also be passed, in order:
\begin{itemize}
\item \textbf{recipient} - the recipient; for alert-email (see the alert-email section) it
will be an email address
\item \textbf{hostname} - the name of the host that the alert applies to
\item \textbf{ip} - the IP number for that host, in case it's not in DNS
\item \textbf{rrdname} - the name of the RRD
\item \textbf{wildpart} - the wild part of a wildcard RRD. E.g., for an
RRD of \texttt{port-ftp} (using the wildcard RRD \texttt{port-*}) the wildpart
would be \texttt{ftp}.
\item \textbf{variable} - the name of the variable
\item \textbf{status} - the current status, as decided by alert-monitor
\item \textbf{old\_status} - the previous status
\item \textbf{value} - the current value of the variable
\item \textbf{relation} - the relation used to compare the variable to the
threshold, mostly for creating informative messages
\item \textbf{threshold} - the threshold value that was exceeded
\item \textbf{start} - timestamp of when the alert started
\item \textbf{duration} - number of seconds that the alert has been active
\item \textbf{host-description} - the description field from the host config-file
\item \textbf{rrd-description} - the description tag on this rrd (desc="xxx")
\item \textbf{webmaster} - the email address of the remstats person
\item \textbf{template} - the name of the template file to generate the message from.
\end{itemize}%------------------------------------ alerter.pod ---
\section{alerter - construct and send alert text}%
\index{alerter - construct and send alert text}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
alerter version 1.8
usage: alerter [options] args
where options are:
-d ddd set debugging output to level 'ddd'
-h show this help
-f fff use 'fff' as configuration directory [/home/remstats/etc/config]
The args are documented in alert-monitor; a quick list:
towho host ip realrrd wildpart var status old_status value relation
threshold alertstart duration hostdesc rrddesc webmaster template
\end{verbatim}
\subsection{Description: }%
\index{Description: }
[\texttt{Alerter} may be rolled into the \texttt{alert-monitor} at some point in
the future. It was easier to test as a separate program, and the performance
hasn't been an issue for me.]
\texttt{Alerter} is passed its parameters (specified above) by the alert-monitor (see the alert-monitor section) .
Most of them are used to fill in information in the text of the alert. The
interesting ones are \texttt{towho} and \texttt{template}.
It also reads the alert-destination-map config-file (see the alert-destination-map config-file section)
to decide where the alert needs to go. This will give it a list of (method, address)
pairs.
For a given template-name, say \texttt{xxx}, and method, say \texttt{method}, it will
look for files in /home/remstats/etc/config/alert-templates, called:
\begin{verbatim}%
method-xxx
method-DEFAULT
xxx
DEFAULT\end{verbatim}
and take the first one it finds. Similarly, it will look for a header to add to the
top of the template called:
\begin{verbatim}%
method-HEADER
HEADER\end{verbatim}
and a footer in one of:
\begin{verbatim}%
method-FOOTER
FOOTER\end{verbatim}
The three pieces will be concatenated giving the template text. Then substitutions
will be done for the following \#\#MAGICCOOKIES\#\#:
\begin{verbatim}%
HOST IP REALRRD WILDPART FIXEDRRD VAR STATUS OLDSTATUS
VALUE RELATION THRESHOLD START DURATION HOSTDESC RRDDESC
NOW TEXTNOW ALERTHOST TOWHO WEBMASTER\end{verbatim}
This gives the alert text. From the method definition in the
alert-destination-map config-file (see the alert-destination-map config-file section)
\texttt{alerter} knows which program to run to send the alert text to the
appropriate address, and it does it.
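The template search order and the cookie substitution can be sketched in
Python like this. It is not the \texttt{alerter} code: the function names
\texttt{choose\_template} and \texttt{fill\_cookies} are hypothetical, the
set of existing file names is passed in rather than read from the
alert-templates directory, and leaving unknown cookies untouched is an
assumption.

\begin{verbatim}
import re


def choose_template(method, name, available):
    """Return the first candidate file name that exists, or None."""
    for candidate in (method + "-" + name, method + "-DEFAULT",
                      name, "DEFAULT"):
        if candidate in available:
            return candidate
    return None


def fill_cookies(text, values):
    """Replace each ##NAME## with values[NAME]; unknown names are kept."""
    return re.sub(r"##([A-Z0-9]+)##",
                  lambda m: str(values.get(m.group(1), m.group(0))),
                  text)
\end{verbatim}

The header and footer lookups follow the same pattern with a shorter
candidate list.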
\subsection{Alert-Sending Scripts}%
\index{Alert-Sending Scripts}
These are now easy to write, and in many cases you won't even have to
write one. There are two requirements for an alert-sending script:
\begin{enumerate}
\item It must take an address to send to on the command-line, and
\item It must accept the text on stdin.
\end{enumerate}
E.g., you could use sendmail with no wrapper.
%------------------------------------ alert-email.pod ---
\section{alert-email - an alert sending script}%
\index{alert-email - an alert sending script}
This is a simple script intended to be run by the alerter (see the alerter section) .
Like all alert-senders, it takes one argument on the command-line:
the "address" to send the alert to. The text of the alert is
fed to this script on stdin.
It sends the alert text to "address" via email, by invoking sendmail,
though there's no reason that it couldn't be re-written to do an SMTP
injection directly if someone wanted to. With no error-checking, it
could be re-written as:
\begin{verbatim}%
#!/bin/sh
/usr/lib/sendmail "$1"\end{verbatim}
%------------------------------------ alert-winpopup.pod ---
\section{alert-winpopup - an alert sending script}%
\index{alert-winpopup - an alert sending script}
This is a simple script intended to be run by the alerter (see the alerter section) .
Like all alert-senders, it takes one argument on the command-line:
the "address" to send the alert to. The text of the alert is
fed to this script on stdin.
It sends the alert to "address", in this case a Windows NetBIOS
machine name, via a Windows popup message.
%------------------------------------ ping-monitor.pod ---
\section{ping-monitor - determine reachability status}%
\index{ping-monitor - determine reachability status}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
ping-monitor version 1.5
usage: ../ping-monitor [options]
where options are:
-d nnn enable debugging output at level 'nnn'
-f fff use 'fff' for config-dir [/home/remstats/etc/config]
-h show this help
-s sss examine 'sss' samples [5]
\end{verbatim}
\subsection{Description: }%
\index{Description: }
The \texttt{ping-monitor} looks at the last 5 samples (by default)
of ping data and determines the status of the host. It will choose
one of the following four statuses:
\begin{itemize}
\item \textbf{UP} - the host is up now and has always (throughout
the sample period) responded to pings.
\item \textbf{UPUNSTABLE} - the host is up now, but on at least
one of the samples, it did not respond.
\item \textbf{DOWNUNSTABLE} - the host is not responding now, but
it has responded within the sample period.
\item \textbf{DOWN} - the host is down now and has not responded
within the sample period.
\end{itemize}
It also writes coloring information for the ping-index web-page.
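The four statuses amount to two questions: did the host answer the latest
sample, and did it answer all (or any) of the others? A Python sketch,
with the hypothetical function name \texttt{ping\_status} and a list of
booleans standing in for the stored ping data:

\begin{verbatim}
def ping_status(samples):
    """samples: oldest first; True where the host answered that sample."""
    up_now = samples[-1]
    if up_now:
        # Up now: stable only if every sample in the period was answered.
        return "UP" if all(samples) else "UPUNSTABLE"
    # Down now: unstable if any sample in the period was answered.
    return "DOWNUNSTABLE" if any(samples) else "DOWN"
\end{verbatim}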
%------------------------------------ topology-monitor.pod ---
\section{topology-monitor }%
\index{topology-monitor }
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
topology-monitor version 1.6
usage: topology-monitor [options] oldfile newfile
where options are:
-d enable debugging output
-f fff use 'fff' for config-dir [/home/remstats/etc/config]
-h show this help
\end{verbatim}
\subsection{Description: }%
\index{Description: }
The \texttt{topology-monitor} runs out of do-traceroutes (see the do-traceroutes section) to find changes in
network paths to the monitored hosts. After \texttt{do-traceroutes} has run
traceroute for each monitored host, the \texttt{topology-monitor} compares the
current network path to that host with the previous path. Currently,
all that is done with this information is to log when it changes.
%------------------------------------ run-remstats.pod ---
\chapter{run-remstats}
\index{run-remstats}
\section{run-remstats - run a complete cycle}%
\index{run-remstats - run a complete cycle}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
run-remstats version 1.9
usage: ../run-remstats [options] [-- options-to-be-passed]
where options are:
--debug=nnn enable debugging output at level 'nnn'
--config_dir=fff use 'fff' for config-dir [/home/remstats/etc/config]
--help show this help
\end{verbatim}
\subsection{Description: }%
\index{Description: }
\texttt{run-remstats} is the main script for a remstats collection machine.
As a simplified overview:
\begin{itemize}
\item check-config (see the check-config section) is run first.
\item In parallel, all the collectors (see the collectors section) are run, each feeding its own
updater (see the updater section) process. Some of them query remstats servers (see the servers section) , some get their
information in other ways. So you have a bunch of pipelines like:
\begin{verbatim}%
xxx-collector | updater xxx\end{verbatim}
\item When all the collectors have finished, the monitors (see the monitors section) get run in
parallel to figure out what's happening.
\item Afterwards, if the configuration directory has changed, run the pagemakers (see the pagemakers section) ,
to re-do the web-pages.
\item Finally, it prints all the stderr output of all the various programs,
separated by program.
\end{itemize}
For each of these programs, \texttt{run-remstats} will set a timer (see \texttt{watchdogtimer}
in the general config-file (see the general config-file section) ). If the timer expires and the
program is still running, \texttt{run-remstats} will kill that process. This avoids the
problem of a hanging collector hanging the whole remstats cycle.
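The watchdog idea can be sketched in Python with the standard
\texttt{subprocess} module. This is an illustration of the technique, not
how \texttt{run-remstats} (a perl script) implements it; the function name
\texttt{run\_with\_watchdog} and its return value are hypothetical.

\begin{verbatim}
import subprocess


def run_with_watchdog(argv, seconds):
    """Run argv; kill it if it is still running after 'seconds'."""
    proc = subprocess.Popen(argv, stdout=subprocess.DEVNULL,
                            stderr=subprocess.PIPE)
    try:
        _, err = proc.communicate(timeout=seconds)
        return proc.returncode, err
    except subprocess.TimeoutExpired:
        # The timer expired with the program still running: kill it,
        # then collect whatever it wrote to stderr.
        proc.kill()
        _, err = proc.communicate()
        return None, err
\end{verbatim}

Returning the captured stderr lets the caller print it at the end of the
cycle, grouped by program.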
It also manages a lock-file to make sure that two instances don't run
concurrently. The lock-file's name is based on the name of the \texttt{run-remstats}
script. (See \textbf{Running multiple copies of run-remstats} below.)
It keeps a status file in the configured temp directory (\texttt{/home/remstats/tmp}
by default) which is used by monitor (see the monitor section) to show where the \texttt{run-remstats}
process has gotten to.
When starting, it will also look for a file in the tmp directory called
\texttt{STOP-run-remstats} (default), and if it exists, will refuse to run at all.
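The start-up checks (stop-file and lock-file) might look roughly like the sketch below. It is written in Python for illustration only. The lock-file name \texttt{LOCK-run-remstats}, the rule that the stop-file name follows the script name, and the function name \texttt{may\_start} are all assumptions; the text above only says that the lock-file is named after the script and that the default stop-file is \texttt{STOP-run-remstats}.

```python
import os


def may_start(tmp_dir, script_name="run-remstats"):
    """Return True (and take the lock) if this instance is allowed to run."""
    # Refuse to run at all if the stop-file exists.
    stop_file = os.path.join(tmp_dir, "STOP-" + script_name)
    if os.path.exists(stop_file):
        return False
    # ASSUMPTION: the real lock-file name is not given in the text.
    lock_file = os.path.join(tmp_dir, "LOCK-" + script_name)
    try:
        # O_EXCL makes the creation fail if another instance holds the lock.
        descriptor = os.open(lock_file, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(descriptor, "w") as handle:
        handle.write(str(os.getpid()))  # record our process-id in the lock
    return True
```

A real implementation would also have to remove the lock-file at the end of the run and cope with stale locks left by a crashed instance; the sketch leaves both out.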
\subsection{Running multiple copies of run-remstats}%
\index{Running multiple copies of run-remstats}
If you symlink \texttt{run-remstats} to \texttt{run-remstats-XXX}, then the
default configuration directory for \texttt{run-remstats-XXX} will be
\texttt{/home/remstats/etc/config-XXX}. Since the lock-file is named for the script
which invokes it, you won't have collisions between
the two instances, as long as your configuration files don't
conflict. In theory, you can have multiple collector-only instances collecting
data which is formatted by a single pagemaker instance,
but this will require at least three config-dirs which must be
closely co-ordinated. If you want to do this for performance
reasons, note that I do plan to address it in future.
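The naming rule for symlinked copies can be written down as a small function. This is a Python sketch of the rule as described above, not the actual perl code; the function name \texttt{default\_config\_dir} is made up for the example.

```python
import os


def default_config_dir(script_path, base="/home/remstats/etc/config"):
    """Map run-remstats-XXX to <base>-XXX, and plain run-remstats to <base>."""
    name = os.path.basename(script_path)
    prefix = "run-remstats"
    if name.startswith(prefix + "-"):
        # Keep the "-XXX" suffix of the script name on the config-dir.
        return base + name[len(prefix):]
    return base
```

So a symlink called \texttt{run-remstats-lab} would default to \texttt{/home/remstats/etc/config-lab}, where \texttt{lab} is just an example suffix.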
\subsection{Configuration: }%
\index{Configuration: }
See the general (see the general section) config-file. The lines
to configure run-remstats are:
\begin{verbatim}%
pinger, collectors, monitors, pagemakers, watchdogtimer\end{verbatim}
%------------------------------------ check-config.pod ---
\section{check-config }%
\index{check-config }
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
check-config version 1.14
usage: check-config [options]
where options are:
-c write SNMP communities
-C CCC set configuration-debugging output to level 'CCC'
-d ddd enable debugging output at level 'ddd'
-D dump configuration (must have DEBUG enabled in fixup.config)
-e print environment variables to stdout
-f fff use config-dir 'fff' [/home/remstats/etc/config]
-h show this help
-l lll list 'lll' on stdout (comma-separated list of: host, ip, rrd)
-s sss shell type to use for -e [sh]
-t test-mode; don't make any changes
\end{verbatim}
\subsection{Description: }%
\index{Description: }
\texttt{check-config} uses the common \texttt{read\_config} routine to
read the configuration file, which confirms that it can be
read successfully by other programs. It also makes sure that
the data directories for all hosts exist, and creates new RRDs.
run-remstats (see the run-remstats section) also uses the \texttt{-e} option to make basic configuration
information available to the shell.
%------------------------------------ pagemakers.pod ---
\section{Pagemakers }%
\index{Pagemakers }
The remstats pagemakers make web-pages, and update other information used
in making web-pages. They only need to be run when the configuration file
has changed and run-remstats (see the run-remstats section) is smart enough to do that.
\begin{itemize}
\item graph-writer (see the graph-writer section) - makes web-pages with the graphs and links them together
\item snmpif-setspeed (see the snmpif-setspeed section) - sets maximums on snmpif-* rrds
\item datapage-interfaces (see the datapage-interfaces section) - makes datapages for every snmpif-* rrd
\item datapage-inventory (see the datapage-inventory section) - lists all monitored hosts, uptime, software and hardware
\item snmpif-description-updater (see the snmpif-description-updater section) - updates the descriptions for snmpif-* rrds
from the SNMP descriptions
\end{itemize}%------------------------------------ graph-writer.pod ---
\section{graph-writer }%
\index{graph-writer }
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
graph-writer version 1.15
usage: ../graph-writer [options] collector
where options are:
-d nnn enable debugging output at level 'nnn'
-D enable configuration debugging output
-f fff use 'fff' for config-dir [/home/remstats/etc/config]
-h show this help
\end{verbatim}
\subsection{Description: }%
\index{Description: }
This is the main remstats pagemaker (see the pagemaker section) . It makes the
web-pages with the graphs and other pages to link them together and
organize them. There are three kinds of page that it makes:
\begin{itemize}
\item \textbf{Indices} - The main three indices are the \textbf{Overall Index}, the
\textbf{Ping Index} and the \textbf{Quick Index}. Each of these shows all the hosts
being monitored, grouped by the group you assigned them to. The
\textbf{Overall Index} shows a section for each host, with a link to all of the
graphs for that host. The \textbf{Ping Index} shows, for each host, a small graph of the last
two hours of ping data, with the graph background
specially coloured for hosts which aren't reachable. The \textbf{Quick Index}
shows, for each host: a link, a status indicator and optionally a link to
alert.cgi (see the alert.cgi section) for alerts for that host.
There is also the \textbf{Custom Index} to show links to all the
customgraphs (see the customgraphs section) .
\item \textbf{Host Pages} - For each host, there is a \textbf{Host Page} which shows
some information about the host and all the day graphs for that host. The
graphs are all links to ...
\item \textbf{Graph Pages} - Each graph is also available in various timespans,
depending on the times (see the times section) that you specified in
the rrd (see the rrd section) definition which caused the generation of
that graph.
\end{itemize}
Graph-writer is the replacement for both \texttt{grapher} and \texttt{html-writer}.
Before version 0.10.0, \texttt{grapher} would make new graphs, as part of the
update run, and \texttt{html-writer} would re-write the html pages. Now,
\texttt{graph-writer} makes a CGI script for each web-page, using \texttt{rrdcgi} as
its interpreter. \texttt{Rrdcgi} simply spits out the page as it was written
with "magic cookies" replaced by $<$IMG SRC...$>$ tags and makes sure that
there is a recent version of the graph file. This is much better than generating
all the graphs every five minutes and having most of them never looked at.
%------------------------------------ snmpif-setspeed.pod ---
\section{snmpif-setspeed }%
\index{snmpif-setspeed }
This is a gross hack which modifies all the \texttt{snmpif-*} rrds to set the
maximum limits on all monitored interfaces. The input and output bps
variables have their maximum set to the maximum for that interface.
The various packet counters have their maximums set to the maximum possible
packets-per-second assuming minimum-length packets, for that interface
speed.
You may not want to run this if you have better knowledge of what real
maximums will be encountered for particular interfaces.
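As a rough worked example of the packet-counter limit, the arithmetic is just the interface speed divided by the size of the smallest packet. The Python sketch below assumes a minimum packet of 64 bytes (the Ethernet minimum frame, ignoring preamble and inter-frame gap); the value that \texttt{snmpif-setspeed} actually uses is not stated here, so treat the default as an assumption.

```python
def max_packets_per_second(interface_bps, min_packet_bytes=64):
    """Upper bound on packets per second for a given interface speed.

    ASSUMPTION: 64-byte minimum packets; the script's real value may differ.
    """
    return interface_bps / (8 * min_packet_bytes)
```

For a 100 Mbit/s interface this gives 100000000 / 512, i.e. a little over 195000 packets per second.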
%------------------------------------ datapage-alert-writer.pod ---
\section{datapage-alert-writer }%
\index{datapage-alert-writer }
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
datapage-alert-writer version
usage: ../datapage-alert-writer [options]
where options are:
-d nnn enable debugging output at level 'nnn'
-h show this help
\end{verbatim}
\subsection{Description: }%
\index{Description: }
This writes a datapage (to be interpreted by datapage.cgi (see the datapage.cgi section) ) which gets linked into the header of
each page as the "Alert Index". It gives a quick overview of alert statuses and values for
each host. It ought to be self-explanatory.
%------------------------------------ datapage-interfaces.pod ---
\section{datapage-interfaces }%
\index{datapage-interfaces }
This makes a datapage (see the datapage section) for each host with \texttt{snmpif-*}
rrds, showing all those interfaces.
%------------------------------------ datapage-inventory.pod ---
\section{datapage-inventory }%
\index{datapage-inventory }
This makes a single page showing all the monitored hosts. For each host,
it shows:
\begin{itemize}
\item the uptime (from the \texttt{uptime} program or SNMP uptime)
\item the hardware type (from the \texttt{uname} program), if available
\item the software version (from the \texttt{uname} program or
the SNMP system.sysDescr)
\end{itemize}%------------------------------------ datapage-status.pod ---
\section{datapage-status }%
\index{datapage-status }
\subsection{Usage }%
\index{Usage }
\begin{verbatim}%
datapage-status version 1.4
usage: datapage-status [options] file ...
where options are:
-d enable debugging output
-e show run-time errors in generated pages
-f fff use 'fff' for config-dir [/home/remstats/etc/config]
-h show this help
\end{verbatim}
\subsection{Description }%
\index{Description }
The \texttt{datapage-status} program creates a datapage (to be interpreted by
datapage.cgi (see the datapage.cgi section) ) showing the current values of all
variables in all RRDs for that host, in addition to the usual remstats
headers.
%------------------------------------ view-writer.pod ---
\section{view-writer }%
\index{view-writer }
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
\end{verbatim}
\subsection{Description: }%
\index{Description: }
The \texttt{view-writer} makes web-pages from the view definitions which are
contained in the views config-dir (see the views config-dir section) . All the documentation
relating to \texttt{view-writer} is there too, as it only implements what the views
request.
%------------------------------------ remstats-monitor.pod ---
\section{remstats-monitor - watch remstats processes}%
\index{remstats-monitor - watch remstats processes}
\subsection{Usage: }%
\index{Usage: }
\texttt{remstats-monitor [sleeptime]}
where sleeptime is the time (in seconds) to wait between polls
\subsection{Description: }%
\index{Description: }
This is primarily a development tool. It loops doing a \texttt{ps} command,
weeding out everything except the remstats processes and cleaning up the results,
to make it easier to read. It also shows the process-id from run-remstats (see the run-remstats section)
lock-file, and the status from its status file.
It's not written portably and will probably have to be tweaked by hand
if you want to run it. If you find it of interest,
please let me (see the me section) know.
%------------------------------------ cgis.pod ---
\section{CGI Scripts}%
\index{CGI Scripts}
These are intended to be invoked via the toolbars created by html-writer, to
perform the listed functions on the host in question.
\begin{itemize}
\item alert.cgi (see the alert.cgi section) - Shows the current alert status of selected rrd variables.
\item availability-report.cgi (see the availability-report.cgi section) - Shows availability of RRD variables.
\item dataimage.cgi (see the dataimage.cgi section) - Generates images based on live data.
\item datapage.cgi (see the datapage.cgi section) - Generates web-pages containing dynamic data.
\item graph.cgi (see the graph.cgi section) - Allows non-remstats web-pages to show remstats graphs.
\item log-event.cgi (see the log-event.cgi section) - log a manual event.
\item ping.cgi (see the ping.cgi section) - Ping the host.
\item showlog.cgi (see the showlog.cgi section) - Display selected portions of the remstats log files.
\item traceroute.cgi (see the traceroute.cgi section) - find network path to a host
\item whois.cgi (see the whois.cgi section) - look up information about hosts, IP\#s, ...
\end{itemize}%------------------------------------ alert-cgi.pod ---
\section{alert.cgi - Alert Reporting and Updating}%
\index{alert.cgi - Alert Reporting and Updating}
This CGI script will generate the Alert Report and also let you modify it.
You can turn alerts off for a specific line (by checking the \texttt{quench}
check-box). You can also attach comments to a specific alert, for example
to let other people know that you're already working on it, or when a
service will be available again.
%------------------------------------ availability-report-cgi.pod ---
\section{availability-report.cgi }%
\index{availability-report.cgi }
The \texttt{availability-report.cgi} calls availability-report (see the availability-report section) to produce
a report of "availability" according to the definitions in the
availability (see the availability section) config-file.
%------------------------------------ dataimage-cgi.pod ---
\section{dataimage.cgi - create images driven by live data}%
\index{dataimage.cgi - create images driven by live data}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
<IMG SRC="http://remstats.sourceworks.com:1954/dataimage.cgi?imagename">\end{verbatim}
\subsection{Description: }%
\index{Description: }
\texttt{Dataimage.cgi} reads an image definition from
\texttt{/home/remstats/datapage/imagename.image}. Some of the commands are in common
with datapage.cgi (see the datapage.cgi section) and are documented
there:
\begin{verbatim}%
oid, rrd, status, eval, debug, macro, macroend and *EOD*\end{verbatim}
These retrieve and manipulate data. There are also commands to create
images:
\begin{verbatim}%
image, colordef, color, linewidth, line, rectangle, circle,
fill, font, text, out, flow\end{verbatim}
\section{Image Commands}%
\index{Image Commands}
\subsection{image }%
\index{image }
The image command has two formats. The first looks like:
\begin{verbatim}%
image WIDTH HEIGHT\end{verbatim}
This creates a blank image of the size specified. Sometimes you'll want a
background for the image, and you can use the second form to specify
a file to read for the background:
\begin{verbatim}%
image BGFILE\end{verbatim}
This will create the new image the same size as the one in \texttt{BGFILE},
by reading \texttt{BGFILE} and using its contents as the background. N.B., the
image must be a PNG graphic.
The \texttt{image} command also defines a few colors (see \texttt{colordef} below):
\texttt{black}, \texttt{white} and \texttt{transparent}, sets the current color to
\texttt{black}, fills the image with \texttt{white} and sets the \texttt{linewidth} to 1.
\subsection{colordef }%
\index{colordef }
[It can also be spelled \texttt{colourdef}.]
This defines a new colour and names it. The command looks like:
\begin{verbatim}%
colordef COLORNAME RED GREEN BLUE\end{verbatim}
where \texttt{RED}, \texttt{GREEN} and \texttt{BLUE} specify the level of each of those colours
to be mixed to define the colour referred to in the script as \texttt{COLORNAME}.
\subsection{color }%
\index{color }
[It can also be spelled \texttt{colour}.]
This sets the current colour, to be used by those commands that don't specify
a colour. It is used as simply:
\begin{verbatim}%
color COLORNAME\end{verbatim}
\subsection{linewidth }%
\index{linewidth }
This sets the width of lines. It isn't honoured by all other commands,
unfortunately, but so far this hasn't been a problem for me. It looks like:
\begin{verbatim}%
linewidth WIDTH\end{verbatim}
\subsection{line }%
\index{line }
This just draws a line in the current \texttt{color} and \texttt{linewidth}:
\begin{verbatim}%
line X1 Y1 X2 Y2\end{verbatim}
\subsection{rectangle }%
\index{rectangle }
This is a way to draw a rectangle without using \texttt{line} four times:
\begin{verbatim}%
rectangle X1 Y1 X2 Y2 [filled]\end{verbatim}
The co-ordinates (\texttt{X1}, \texttt{Y1}) and (\texttt{X2}, \texttt{Y2}) define opposite corners
of the rectangle. If the keyword \texttt{filled} is added to the end, the
rectangle will be filled with the current colour as well.
\subsection{circle }%
\index{circle }
Here you get a circle:
\begin{verbatim}%
circle X Y RADIUS [filled]\end{verbatim}
The circle will be centered on (\texttt{X}, \texttt{Y}) with a radius of \texttt{RADIUS}.
If the keyword \texttt{filled} is added to the end, the
circle will be filled with the current colour as well.
\subsection{fill }%
\index{fill }
This command permits you to fill arbitrary regions:
\begin{verbatim}%
fill X Y [COLORNAME]\end{verbatim}
The \texttt{COLORNAME} is optional.
\subsection{text }%
\index{text }
The \texttt{text} command sets text into the image, for labelling things:
\begin{verbatim}%
text X Y TEXT\end{verbatim}
\subsection{font }%
\index{font }
This changes the font for the \texttt{text} command:
\begin{verbatim}%
font (giant|large|mediumbold|medium|small|tiny)\end{verbatim}
\subsection{out }%
\index{out }
This permits the script to output additional information to an auxiliary
file. I added this for doing image-maps automatically, which can be
automatically loaded by a datapage.cgi (see the datapage.cgi section) web-page.
The syntax is:
\begin{verbatim}%
out TEXT\end{verbatim}
\subsection{flow }%
\index{flow }
This draws a strange double-headed, bi-coloured arrow. Think of it as two
half arrows, split lengthwise, one in each direction. The colour and width
of each half arrow indicates the flow in that direction. I use it for
indicating network traffic flow, which usually isn't the same in both
directions. It looks like:
\begin{verbatim}%
flow X1 Y1 X2 Y2 INFLOW OUTFLOW\end{verbatim}
The co-ordinates (\texttt{X1}, \texttt{Y1}), (\texttt{X2}, \texttt{Y2}) indicate the ends of the
flow. \texttt{INFLOW} and \texttt{OUTFLOW} indicate the level in each direction,
relative to (\texttt{X1}, \texttt{Y1}).
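To show how the image commands fit together, here is a small Python sketch that builds the text of a made-up image definition using only the commands documented above. Several things in it are assumptions: that each command goes on its own line, that colour levels run from 0 to 255, and that \texttt{INFLOW} and \texttt{OUTFLOW} can be given as plain numbers. The function name and the label \texttt{uplink} are invented for the example.

```python
def example_image_definition(width=200, height=100, inflow=3, outflow=1):
    """Return the text of a hypothetical dataimage definition."""
    middle = height // 2
    commands = [
        f"image {width} {height}",          # blank image, current colour black
        # ASSUMPTION: colour levels are 0-255; the text does not say.
        "colordef red 255 0 0",
        f"rectangle 0 0 {width - 1} {height - 1}",  # border, current colour
        "color red",
        f"circle 20 {middle} 5 filled",     # marker at one end of the flow
        "color black",
        "font small",
        "text 30 10 uplink",                # hypothetical label
        f"flow 20 {middle} {width - 20} {middle} {inflow} {outflow}",
    ]
    return "\n".join(commands) + "\n"
```

In a real definition the flow levels would normally come from variables fetched with the data commands (\texttt{oid}, \texttt{rrd} and so on) rather than from constants.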
%------------------------------------ datapage-cgi.pod ---
\section{datapage.cgi - dynamic data in web-pages}%
\index{datapage.cgi - dynamic data in web-pages}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
<A HREF="http://remstats.sourceworks.com:1954/datapage.cgi?pagename">whatever</A>\end{verbatim}
\subsection{Data Collection}%
\index{Data Collection}
\texttt{Datapage.cgi} looks for the page definition in the file
\texttt{@@DATAPAGEDIR@@/pagename.page}. The page definition is in two parts,
separated by a line like:
\begin{verbatim}%
BEGIN-PAGE\end{verbatim}
The first part's purpose is to define variables to be included in the
second part, which is an HTML template, with magic cookies.
All lines in the first (definition) part are subject to variable interpolation.
Any occurrence of \texttt{\$\{variablename\}} will be replaced by the current
contents of the variable \texttt{variablename}. This will be done up to
five levels deep, permitting expansion of \texttt{\$\{\$\{h\}\_\$\{interface\}\}}, provided that
you've got values for the variables \texttt{h} and \texttt{interface}.
N.B. variable names must be lower case.
In addition, within a macro expansion, macro-arguments will also be
interpolated, before variable interpolation, for all occurrences of
\texttt{\$\{ARGNAME\}}, assuming that there is an argument for the
current macro called \texttt{ARGNAME}. N.B. macro argument names must
be UPPER case.
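The nested variable interpolation can be pictured as repeated passes which expand the innermost references first. The following Python sketch illustrates that idea; it is not the code from \texttt{datapage.cgi}. The set of characters allowed in a variable name, and leaving unknown variables untouched, are assumptions.

```python
import re

# ASSUMPTION: names consist of lower-case letters, digits and underscores.
_VARIABLE = re.compile(r"\$\{([a-z0-9_]+)\}")


def interpolate(line, variables, max_levels=5):
    """Expand ${name} references, innermost first, in up to max_levels passes."""
    for _ in range(max_levels):
        expanded = _VARIABLE.sub(
            # Unknown variables are left as they are (an assumption).
            lambda match: str(variables.get(match.group(1), match.group(0))),
            line,
        )
        if expanded == line:
            break  # nothing left to expand
        line = expanded
    return line
```

With \texttt{h} set to \texttt{gw} and \texttt{interface} set to \texttt{eth0}, the first pass turns the nested reference into a reference to the variable \texttt{gw\_eth0}, and the second pass replaces that with its value.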
\subsection{Common Commands}%
\index{Common Commands}
The commands permitted in the first part are:
\begin{verbatim}%
oid, rrd, status, eval, debug, macro, macroend,
alertstatus, alertvalue and *EOD*\end{verbatim}
These commands are in common with the dataimage.cgi (see the dataimage.cgi section)
script, but are only documented here.
\begin{itemize}
\item oid
This fetches an SNMP value into a datapage variable. The command looks like:
\begin{verbatim}%
oid VARNAME HOSTNAME OIDNAME\end{verbatim}
The \texttt{VARNAME} is the name of the datapage variable (let's just call
them variables from now on).
The \texttt{HOSTNAME} is the name of the host
to query. The SNMP community-string is usually supplied in
that host's config-file, but can be supplied in usual MRTG fashion by
giving \texttt{COMMUNITY@HOSTNAME} or even \texttt{COMMUNITY@HOSTNAME:PORTNUMBER}
instead of the \texttt{HOSTNAME}.
The \texttt{OIDNAME} must be defined in the
oids config-file (see the oids config-file section) , but can be suffixed by the usual
numbers. E.G. you can use ifName.4 to get the ifName for interface 4.
\item rrd
This fetches a value from an RRD database into a variable. It looks like:
\begin{verbatim}%
rrd VARNAME HOSTNAME RRDNAME DSNAME CF\end{verbatim}
The \texttt{RRDNAME} is the name of the rrd, as remstats knows it, not fully
qualified. I.E. it will be under the config-file defined \texttt{datadir},
and under the host's directory under that.
The \texttt{DSNAME} is the ds-name within that RRD file and the \texttt{CF}
is the usual RRD consolidation-function to be applied.
\item status
This is so-named because it fetches remstats status files, usually written
by the various collectors (see the collectors section) and monitors (see the monitors section) . It looks like:
\begin{verbatim}%
status VARNAME HOSTNAME STATUSNAME\end{verbatim}
The \texttt{STATUSNAME} is the name of the status file, as named in the
host's data directory. There is a standard mapping applied by the
function \texttt{to\_filename} from the \texttt{remstats.pl} file to
munge the filename so that it won't conflict with the filesystem. Either
look for the name in the data directory, use the function (see eval) or
look at the code. I \textbf{am} planning on changing the mapping when I
figure out the best way to do it.
\item eval
The eval command lets you modify the values fetched by previous \texttt{oid},
\texttt{rrd}, \texttt{status} and \texttt{eval} commands with arbitrary perl code.
It looks like:
\begin{verbatim}%
eval VARNAME PERLEXPRESSION\end{verbatim}
The \texttt{PERLEXPRESSION} is a perl expression and can be arbitrarily complex,
but gets messy quickly with the \texttt{datapage.cgi} and perl both doing variable
interpolation.
Note: \texttt{datapage.cgi} uses private.pl (see the private.pl section) , so you can put commonly
used functions there to make your datapage creation easier.
\item debug
The \texttt{debug} command takes a number which is the level to set debugging to.
It causes extra output which may be helpful in figuring out why your page
isn't working the way you expected.
\item alertstatus
This lets you fetch the alert level for a given (host, rrd, dsname, cf)
combination. The command looks like:
\begin{verbatim}%
alertstatus VARNAME HOSTNAME RRDNAME DSNAME [CF]\end{verbatim}
This will fetch the alert status and put it in the datapage variable \texttt{VARNAME}.
The status will be the same set of values shown on the alerts report (see the alerts report section)
for status.
The \texttt{CF} parameter is optional and is rrdtool's consolidation function.
It will be set to \texttt{AVERAGE} if it's not supplied.
\item alertvalue
This is the same as \texttt{alertstatus} except that it sets the variable to the
current value of the (host, rrd, variable, cf) combination.
\end{itemize}
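The MRTG-style host argument accepted by the \texttt{oid} command can be taken apart as in the Python sketch below. It is an illustration only: the function name \texttt{parse\_snmp\_host} is invented, and returning \texttt{None} for a missing community or port (leaving the caller to fall back to the host's config-file) is an assumption.

```python
def parse_snmp_host(spec):
    """Split [COMMUNITY@]HOSTNAME[:PORTNUMBER] into (community, host, port).

    Parts that are not present are returned as None.
    """
    community = None
    port = None
    if "@" in spec:
        community, spec = spec.split("@", 1)
    if ":" in spec:
        spec, port_text = spec.rsplit(":", 1)
        port = int(port_text)
    return community, spec, port
```

A plain \texttt{HOSTNAME} comes back with both optional parts empty, so the community-string would then be taken from that host's config-file as described above.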
\subsection{The HTML template}%
\index{The HTML template}
This is almost just HTML with a few magic cookies inserted. The difference is
that the beginning must include HTTP headers. If you don't want anything
fancy, just begin like:
\begin{verbatim}%
------ cut here ------
BEGIN-PAGE
content-type: text/html

------ cut here ------\end{verbatim}
Note: the empty line after \texttt{content-type:} is \textbf{not} optional. It's
necessary to end the HTTP headers.
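The layout of a page definition (definitions, then \texttt{BEGIN-PAGE}, then HTTP headers, an empty line and the HTML) can be checked with a few lines of code. This Python sketch is only an illustration of that layout; the function name \texttt{split\_datapage} is invented and it does not reproduce how \texttt{datapage.cgi} reads the file.

```python
def split_datapage(text):
    """Return (definition_lines, http_header_lines, html_body)."""
    # Prefix a newline so that a file starting with BEGIN-PAGE also matches.
    definition, separator, template = ("\n" + text).partition("\nBEGIN-PAGE\n")
    if not separator:
        raise ValueError("no BEGIN-PAGE line found")
    # The HTTP headers end at the first empty line of the template.
    headers, blank, body = template.partition("\n\n")
    if not blank:
        raise ValueError("HTTP headers must be followed by an empty line")
    return definition.strip("\n").splitlines(), headers.splitlines(), body
```

A definition that forgets the empty line after \texttt{content-type:} is rejected here, which mirrors the requirement stated above.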
The magic cookies are:
\begin{itemize}
\item $<$DATAPAGE::STATUS host statusfile$>$
inserts a specified status file
\item $<$DATAPAGE::VAR varname$>$
interpolates the value of a datapage variable
\item $<$DATAPAGE::HEADER title$>$
generates a standard remstats header
\item $<$DATAPAGE::STATUSHEADER hostname$>$
generates the status headers for the named host
\item $<$DATAPAGE::TOOLBAR hostname$>$
generates the toolbar for the named host
\item $<$DATAPAGE::FOOTER$>$
generates a standard remstats footer
\item $<$DATAPAGE::INCLUDE filename$>$
include the contents of a file from the datapage directory, for imagemaps ...
\item $<$DATAPAGE::PATHINCLUDE filename-with-path$>$
include contents of a file specified with a complete path
\item $<$DATAPAGE::MACRO macroname [argvalue] ...$>$
include boilerplate HTML with substitutions
\item $<$DATAPAGE::GRAPH host rrd graph time$>$
generate the specified remstats graph
\item $<$DATAPAGE::CUSTOMGRAPH graph time$>$
generate the specified remstats customgraph
\item $<$DATAPAGE::ERRORS$>$
inserts the text of errors encountered in generating the page. Without
this one, you won't see any errors. That way you can include the errors and
debugging output (see next item) while you're creating/debugging the
datapage and afterwards turn them off. The errors and debugging output
may include information you don't want to reveal to outsiders. Also,
collecting all the error output together avoids spoiling the formatting
of the page.
\item $<$DATAPAGE::DEBUG$>$
inserts debugging output. Without it, you won't see any debugging output.
\end{itemize}%------------------------------------ graph-cgi.pod ---
\section{graph.cgi - exporting remstats graphs}%
\index{graph.cgi - exporting remstats graphs}
The purpose of \texttt{graph.cgi} is to allow remstats graphs
to appear on external (not part of remstats) web-pages.
It's \textbf{not} efficient, with the graphs being generated whenever
the page is reloaded, but it is portable. All you do is to
create an $<$IMG SRC...$>$ tag with the appropriate values, like:
\begin{verbatim}%
<IMG SRC="http://remstats.sourceworks.com:1954/graph.cgi?host=aaa&rrd=bbb&graph=ccc&graphtime=ddd">\end{verbatim}
and replace \texttt{aaa} with the name of the host, as remstats knows it, \texttt{bbb}
with the name of the RRD, \texttt{ccc} with the name of the graph within that RRD,
and \texttt{ddd} with the name of the timespan, from the
times config-file (see the times config-file section) . If the RRD is a wildcard RRD, e.g.
\texttt{snmpif-*}, then you must use the specific instance, e.g. \texttt{snmpif-eth0}.
That's all there is to it.
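If you generate such pages from a program, the URL can be assembled like this. The Python sketch below uses the four query parameters shown above; the function name \texttt{graph\_url} and the example values (\texttt{gw}, \texttt{bytes}, \texttt{day}) are made up, and the base URL is simply the one used in this document.

```python
from urllib.parse import urlencode


def graph_url(host, rrd, graph, graphtime,
              base="http://remstats.sourceworks.com:1954/graph.cgi"):
    """Build a graph.cgi URL from the four parameters described above."""
    query = urlencode({
        "host": host,            # host name, as remstats knows it
        "rrd": rrd,              # specific instance for wildcard RRDs
        "graph": graph,          # name of the graph within that RRD
        "graphtime": graphtime,  # timespan name from the times config-file
    })
    return base + "?" + query


print(graph_url("gw", "snmpif-eth0", "bytes", "day"))
```

Using \texttt{urlencode} also takes care of escaping any characters in the names that are not allowed in a URL.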
%------------------------------------ log-event-cgi.pod ---
\section{log-event.cgi - log events from a web-page}%
\index{log-event.cgi - log events from a web-page}
This shows up on the showlog.cgi (see the showlog.cgi section) page as a link to permit
you to manually enter events into the log. Don't feel
obliged to enter all the fields. The data isn't checked
for meaning, just for syntax. In other words, a host-name
must look like a host-name, but it doesn't have to be a
real host-name.
This cgi-script ought to be protected. See the
web-server installation (see the web-server installation section) docs.
%------------------------------------ ping-cgi.pod ---
\section{ping.cgi }%
\index{ping.cgi }
The \texttt{ping.cgi} script allows you to ping a host. It's intended to be
called off a host's toolbar, but that's not required. Simply provide
the hostname or IP number and it'll ping it.
As an example, you could ping ftp.uu.net with a URL like:
\begin{verbatim}%
http://remstats.sourceworks.com:1954/ping.cgi?host=ftp.uu.net\end{verbatim}
%------------------------------------ showlog-cgi.pod ---
\section{showlog.cgi }%
\index{showlog.cgi }
Any alert is also logged to the remstats log files (one file per day). Other
information is also logged, for example, the topology-monitor (see the topology-monitor section) logs network
topology changes. The \texttt{showlog.cgi} script allows you to display selected
portions of the log files, by time-period, by host, ...
%------------------------------------ traceroute-cgi.pod ---
\section{traceroute.cgi }%
\index{traceroute.cgi }
This script uses traceroute (see the traceroute section) to find the path from the remstats host to
some other specified host. It's intended to be called off a host's
toolbar, but that's not required. For example, you could trace the path from
here (trevelyan.sourceworks.com) to ftp.uu.net with a URL like:
\begin{verbatim}%
http://remstats.sourceworks.com:1954/traceroute.cgi?host=ftp.uu.net\end{verbatim}
There are other options that you can specify:
\begin{itemize}
\item \texttt{no\_names} - just shows IP numbers instead of looking up the domain-names
\item \texttt{ASNs} - look up the Autonomous System Numbers (ASNs) for the IP number of each hop. It
can be useful for figuring out which networks you are traversing.
\item \texttt{owners} - look up the "owner" via SOA records
\item \texttt{fast} - continues on to the next hop as soon as the current one answers
\end{itemize}%------------------------------------ whois-cgi.pod ---
\section{whois.cgi }%
\index{whois.cgi }
This script talks to the ARIN whois database (by default) to look up network
names, IP numbers and AS numbers. It's usually linked into the results of
traceroute.cgi (see the traceroute.cgi section) so that you can look up what your
traceroute results actually mean.
Try \texttt{traceroute.cgi} if you want to see how it works.
%------------------------------------ do-traceroutes.pod ---
\chapter{do-traceroutes}
\index{do-traceroutes}
\section{do-traceroutes - find the path to each host}%
\index{do-traceroutes - find the path to each host}
This runs traceroute against each host being monitored. After
they've all finished, it runs the topology-monitor (see the topology-monitor section) .
I'm planning to make graphical representations of how you are
connected to the hosts you're monitoring, but that's not working yet.
%------------------------------------ traceroute.pod ---
\section{traceroute }%
\index{traceroute }
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
2.9.2tee0.0: Usage: traceroute [-adnruvAMOQ] [-w wait] [-S start_ttl] [-m max_ttl] [-p port#] [-q nqueries] [-g gateway] [-t tos] [-s src_addr] [-g router] host [data size]
-a: Abort after 10 consecutive drops
-d: Socket level debugging
-g: Use this gateway as an intermediate hop (uses LSRR)
-S: Set start TTL (default 1)
-m: Set maximum TTL (default 30)
-n: Report IP addresses only (not hostnames)
-p: Use an alternate UDP port
-q: Set the number of queries at each TTL (default 3)
-r: Set Dont Route option
-s: Set your source address
-t: Set the IP TOS field (default 0)
-u: Use microsecond timestamps
-v: Verbose
-w: Set timeout for replies (default 5 sec)
-A: Report AS# at each hop (from GRR)
-M: Do RFC1191 path MTU discovery
-O: Report owner at each hop (from DNS)
-P: Parallel probing
-Q: Report delay statistics at each hop (min/avg+-stddev/max) (ms)
-T: Terminator (line end terminator)
-U: Go to next hop on any success
\end{verbatim}
\subsection{Description: }%
\index{Description: }
Hmm. I think that describes its use pretty well. What does it do?
Oh. Well it sends UDP packets with the time-to-live set to 1, then 2
then 3 and so on. This causes the routers that these packets are sent
through to complain after the requisite number of hops. I.E. the first router
complains about the first packets, with TTL set to one, the second about
the packets with TTL set to two etc. Traceroute catches the complaints
and times how long it took. This not only shows you how your packets
are getting to the destination, but sometimes, where the congestion
is as well. There's a much better explanation in the source, so
if you want more,
UTSL (see \textbf{http://www.tuxedo.org/\~{}esr/jargon/html/entry/UTSL.html}).
This version of traceroute is used in traceroute.cgi (see the traceroute.cgi section) ,
which isn't required,
just handy on occasion, and in do-traceroutes (see the do-traceroutes section) , which you don't need unless
you're curious about your routing and how it's changing over time. The only
extra options that do-traceroutes uses are the \texttt{-A} option to look up the
ASN (Autonomous System Number) and the \texttt{-O} option to look up the DNS owner.
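The TTL trick is easy to see in a toy simulation. The Python sketch below does not send any packets; it just models a path as a list of names (routers first, destination last, all invented) and shows which node answers a probe for each TTL. The default maximum of 30 matches the \texttt{-m} default in the usage above.

```python
def probe(path, ttl):
    """Return (answering_node, arrived) for a probe sent with this TTL.

    path must be a non-empty list ending with the destination.
    """
    for position, node in enumerate(path, start=1):
        if position == len(path):
            return node, True    # the probe reached the destination
        ttl -= 1                 # each router on the way decrements the TTL
        if ttl == 0:
            return node, False   # TTL expired here, so this router complains
    raise ValueError("path must not be empty")


def simulated_traceroute(path, max_ttl=30):
    """List (ttl, node) pairs, raising the TTL until the destination answers."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        node, arrived = probe(path, ttl)
        hops.append((ttl, node))
        if arrived:
            break
    return hops
```

The real program also times each reply, which is where the hints about congestion come from; the simulation leaves timing out.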
%------------------------------------ misc.pod ---
\chapter{Miscellany}
\index{Miscellany}
\section{Miscellaneous Scripts}%
\index{Miscellaneous Scripts}
These are scripts that don't really fit in anywhere.
\begin{itemize}
\item availability-report (see the availability-report section) shows availability of RRD variables
\item genindex (see the genindex section) makes an index
\item genmenu (see the genmenu section) makes the vertical menu-bars used in these docs.
\item htmlpod (see the htmlpod section) makes pod files from html files (roughly).
\item podhtml (see the podhtml section) makes html files from pod files.
\item podlatex (see the podlatex section) makes LaTeX files from pod files.
\item podpdf (see the podpdf section) makes PDF files from pod files.
\item rrd-report (see the rrd-report section) produces reports from a raw rrd.
\end{itemize}
These are release-related scripts:
\begin{itemize}
\item convert-config-links (see the convert-config-links section) - copies links to files (just read it)
\end{itemize}%------------------------------------ availability-report.pod ---
\section{availability-report }%
\index{availability-report }
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
availability-report version 1.19
usage: availability-report [options]
where options are:
-c use colors in the output
-d ddd set debugging output to level 'ddd'
-f fff set config-dir to 'fff' [/home/remstats/etc/config]
-h show this help
-H HHH show only hosts HHH (comma-separated list) [all]
-G GGG show only groups GGG (comma-separated list) [all]
-R RRR show only rrds RRR (comma-separated list) [all]
-g show group summary
-t ttt availability for time-period ttt (start,finish)
\end{verbatim}
\subsection{Description: }%
\index{Description: }
This is mainly intended to be called from
availability-report.cgi (see the availability-report.cgi section) . It provides
a report on the "availability" of specified RRD variables: by default, all
that have definitions in the availability (see the availability section)
config-file. Exactly what it means for a variable to be "available"
is up to you. It's intended to give some measure of when a host or
service isn't usable, so, e.g., the default definition of availability
for the \texttt{ping} RRD variable \texttt{rcvd} (number of ping responses received)
is:
\begin{verbatim}%
ping rcvd MINIMUM > 0\end{verbatim}
In English, the \texttt{rcvd} variable is considered unavailable if:
\begin{itemize}
\item it is less than or equal to zero (i.e. it didn't respond to ping), or
\item there is no data available for that time period.
\end{itemize}
\textbf{N.B.:} The interaction between RRD archive consolidation and the xff value
(see rrdcreate) can either lengthen reported periods of unavailability or, conversely,
mask them. Choose the consolidation function carefully
to make sure you're getting the best data possible.
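To make the rule concrete, here is a minimal Python sketch of how such a
definition could be evaluated. It is not the code remstats uses: the
function name, the use of \texttt{None} for a period with no data, and
the idea of reporting a simple fraction are all assumptions made for the
illustration.
\begin{verbatim}
def availability(samples, threshold=0.0):
    """Return the fraction of time slots that count as available.

    `samples` holds one consolidated MINIMUM of `rcvd` per time slot;
    None stands in for a slot with no data (an assumption).
    """
    if not samples:
        return 0.0
    up = sum(
        1 for value in samples
        if value is not None and value > threshold
    )
    return up / len(samples)


if __name__ == "__main__":
    # Two good slots, one with no replies, one with no data at all.
    print(availability([10, 0, None, 10]))
\end{verbatim}
Both the zero and the missing slot count against the host, matching the
two conditions listed above.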
%------------------------------------ cleanup.pod ---
\section{cleanup - removes stale, old files}%
\index{cleanup - removes stale, old files}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
cleanup version 1.4
usage: ../cleanup [options]
where options are:
-d nnn enable debugging output at level 'nnn'
-f fff use 'fff' for config-dir [/home/remstats/etc/config]
-h show this help
\end{verbatim}
\subsection{Description: }%
\index{Description: }
It removes old collector data from /home/remstats/data/LAST, old logs from
/home/remstats/data/LOGS, old traceroute data from /home/remstats/data/TRACEROUTES
and old images from all the host subdirectories of /home/remstats/html.
Run it out of cron every now and then, say once a day, with a line
like:
\begin{verbatim}%
0 2 * * * /home/remstats/bin/cleanup\end{verbatim}
%------------------------------------ convert-config-links.pod ---
\section{convert-config-links }%
\index{convert-config-links }
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
usage: ../convert-config-links [-h]
\end{verbatim}
\subsection{Description: }%
\index{Description: }
The problem is that an upgrade installation of remstats will overwrite the config-base
directory, but previous installations of remstats created new configuration directories
as symlinks to config-base. Some of these files need to be changed and some are commonly
changed, specifically:
\begin{verbatim}%
alerts alert-destination-map general html links tools\end{verbatim}
Installing a new version of remstats will overwrite config-base, including these files.
\texttt{convert-config-links} is a conversion tool for upgrading from remstats versions
before 1.0A. It will convert the commonly changed config-files from symlinks to
copies of the appropriate files from config-base. In remstats versions after 1.0A
the new-config (see the new-config section) program will "Do the Right Thing" (TM) and make copies by itself,
so you'll only have to run this once.
If you're installing remstats for the first time, you can ignore this program.
%------------------------------------ genindex.pod ---
\section{genindex - make an index from output of podhtml}%
\index{genindex - make an index from output of podhtml}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
genindex version 1.4
usage: genindex [options] file ...
where options are:
-d enable debugging output
-f fff use 'fff' format for output (html, pod or text)[pod]
-h show this help
\end{verbatim}
\subsection{Description: }%
\index{Description: }
\texttt{genindex} reads the index output of podhtml (see the podhtml section) and builds a crude index.
It's not pretty; it's mainly meant to be searched with your browser's \texttt{find} command.
Show me something simple and better and I'll use it.
%------------------------------------ genmenu.pod ---
\section{genmenu - generate a collapsing menu}%
\index{genmenu - generate a collapsing menu}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
genmenu version 1.5
usage: ../genmenu [options] pagename menufile
where options are:
-d enable debugging output
-h show this help
\end{verbatim}
\subsection{Description: }%
\index{Description: }
\texttt{genmenu} reads the menu description file and generates a vertical
menu-bar, collapsed according to which pagename you gave it. This
requires all the documentation to be rebuilt whenever you change the
menu definition, but avoids having to use JavaScript.
I couldn't find a simple stand-alone program that did this, so here you are.
It doesn't require remstats.
\subsection{The Menu Definition File}%
\index{The Menu Definition File}
The file has a simple format. Blank lines and lines beginning with '\#'
are ignored. The other lines look like:
\begin{verbatim}%
[tabs]pagename [page title]\end{verbatim}
The number of \texttt{tabs} shows the level of sub-menus, making the definition
file easy to grasp at a glance. Note that the \texttt{tabs} are actual tab characters.
The \texttt{page title} (optional) is what shows up in the menu, while the
\texttt{pagename} is used to make the URL to link to. If the page name is
\texttt{xyz}, then a link to \texttt{xyz.html} is produced. If the \texttt{page title} is missing,
then the \texttt{pagename} is used instead.
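The format is simple enough to sketch a reader for it in a few lines of
Python. This is only an illustration of the rules above, not how
\texttt{genmenu} itself is written; the function name and the assumption
that whitespace separates the pagename from the title are mine.
\begin{verbatim}
def parse_menu(text):
    """Parse a menu definition into (level, pagename, url, title)."""
    entries = []
    for line in text.splitlines():
        # Blank lines and lines beginning with '#' are ignored.
        if not line.strip() or line.startswith("#"):
            continue
        body = line.lstrip("\t")
        # The number of leading tabs is the sub-menu depth.
        level = len(line) - len(body)
        parts = body.split(None, 1)
        pagename = parts[0]
        # A missing page title falls back to the pagename.
        title = parts[1].strip() if len(parts) > 1 else pagename
        entries.append((level, pagename, pagename + ".html", title))
    return entries


if __name__ == "__main__":
    sample = "# docs menu\nindex Home\n\tinstall Installation\n\tconfig\n"
    for entry in parse_menu(sample):
        print(entry)
\end{verbatim}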
%------------------------------------ htmlpod.pod ---
\section{htmlpod - convert HTML to POD}%
\index{htmlpod - convert HTML to POD}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
htmlpod htmlfile >podfile\end{verbatim}
\subsection{Description: }%
\index{Description: }
Quick and Dirty.
I only wrote it to do the first cut for converting my old html-based
documentation to pod format. It's by no means complete or even
correct. However, it did convert over 90\% of the HTML markup that
I was using.
Use it if you want, but don't complain about it without providing a
patch to fix your complaint.
%------------------------------------ podhtml.pod ---
\section{podhtml - translate a POD file to HTML}%
\index{podhtml - translate a POD file to HTML}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
podhtml version 1.5
usage: podhtml [options] podfile
where options are:
-d ddd enable debugging output
-h show this help
-s sss use 'sss' as the suffix for html files [.html]
-u uuu use 'uuu' as a URL prefix
\end{verbatim}
\subsection{Description: }%
\index{Description: }
See the docs for \texttt{pod2html}. The only changes that I made
intentionally from how \texttt{pod2html} does things are:
\begin{itemize}
\item if a line looks blank it's treated as blank. I prefer
to avoid surprises.
\item I added a new \texttt{=exec} which executes a command line and
inserts the output of stdout into the resulting HTML as a $<$PRE$>$
section. This was so that I could get the latest usage message
from programs inserted without having to run each program separately,
save its output in a file, and manually insert the file into
the POD file.
\item I also caused it to append to a file called \texttt{podhtml--rawindex}
for each =head1 and =head2, a URL for that page and section and the
contents of that =headN. This is used by genindex (see the genindex section) to make an index.
\end{itemize}
I wrote this version in frustration with the way pod2html does
links. Or doesn't. I could never tell without trying whether
it would generate a link or not.
%------------------------------------ podlatex.pod ---
\section{podlatex - translate a POD file to LaTeX}%
\index{podlatex - translate a POD file to LaTeX}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
podlatex version 1.4
usage: podlatex [options] podfile
where options are:
-d ddd enable debugging output
-h show this help
-s sss use 'sss' as the suffix for html files [.html]
-u uuu use 'uuu' as a URL prefix
\end{verbatim}
\subsection{Description: }%
\index{Description: }
See the docs for \texttt{pod2latex}. The only changes that I made
intentionally from how \texttt{pod2latex} does things are:
\begin{itemize}
\item if a line looks blank it's treated as blank. I prefer
to avoid surprises.
\item I added a new \texttt{=exec} which executes a command line and
inserts the output of stdout into the resulting LaTeX as a \texttt{verbatim}
section. This was so that I could get the latest usage message
from programs inserted without having to run each program separately,
save its output in a file, and manually insert the file into
the POD file.
\item I changed the text wrapping links.
\end{itemize}
I wrote this version in frustration with the way pod2latex does
links.
%------------------------------------ podpdf.pod ---
\section{podpdf - translate a POD file to pdf}%
\index{podpdf - translate a POD file to pdf}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
../podpdf
[ --help --verbose <1|2> --paper <usletter> --podfile <file> ] <file>
--help
displays this explanation of correct usage
--verbose <1|2>
regulates the volume of progress comments: argument must be 1 or 2
--podfile <file>
supplies the input file to process as an explicit parameter. The
input file may also be supplied from STDIN or from the command
line as the array element --paper.
Further information can be found in the POD section of Pod.pm. Enter:
perl -e "use Pod::Pdf; pod2pdf('<your_library_path>/Pod/Pdf.pm')"
to get the POD in PDF format :)
\end{verbatim}
\subsection{Description: }%
\index{Description: }
See the docs for \texttt{Pod::Pdf}. This is only a tiny wrapper for it.
%------------------------------------ rrd-report.pod ---
\section{rrd-report - display summaries of an RRD file}%
\index{rrd-report - display summaries of an RRD file}
\subsection{Usage: }%
\index{Usage: }
\begin{verbatim}%
rrd-report version 1.6
usage: rrd-report [options] rrd-file
where options are:
-b bbb begin at time 'bbb' (see Note 3)
-c ccc select data from consolidation-function 'ccc' [AVERAGE]
-d ddd enable debugging output at level 'ddd'
-D DDD show the dates as 'DDD' [both,pretty]
(none|both|start|finish),(raw|simple|pretty)
-e eee end at time 'eee' (see Note 3)
-f fff use report format 'fff' [simple]
(from 'simple', 'label', 'html')
-h show this help
-i iii report on intervals 'iii' (see Note 2) [1d]
-l list the DS names in this rrd, no report
-n nnn use 'nnn' as the format to print the numbers [%lf]
-s sss summary on interval 'sss' (see Note 2) [1w]
-v vvv show variables 'vvv' (var:cf comma-separated) [ALL]
Note: if report interval (-i) and summary interval (-s) are equal,
no summary reporting is done.
Note 2: intervals are numbers of seconds, minutes, hours, days, weeks, Months
or years. E.G. "4w" for 4 weeks, "1M" for one month.
Note 3: Begin and End times are a unix timestamp (seconds since Jan 1, 1970) or
plus-or-minus an interval, as in Note 2. E.G. "-1w" means "one week ago".
\end{verbatim}
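Notes 2 and 3 can be sketched in Python as follows. This is an
illustration, not the parser \texttt{rrd-report} uses. Only \texttt{w}
and \texttt{M} are confirmed by the examples in the notes; the other
unit letters, the 30-day month, the 365-day year and the function names
are assumptions.
\begin{verbatim}
import re

# Seconds per unit. Month and year lengths are assumptions.
UNIT_SECONDS = {
    "s": 1,
    "m": 60,
    "h": 3600,
    "d": 86400,
    "w": 7 * 86400,
    "M": 30 * 86400,
    "y": 365 * 86400,
}


def parse_interval(text):
    """Convert an interval such as '4w' or '1M' to seconds."""
    match = re.fullmatch(r"(\d+)([smhdwMy])", text)
    if match is None:
        raise ValueError("not an interval: " + repr(text))
    return int(match.group(1)) * UNIT_SECONDS[match.group(2)]


def parse_time(text, now):
    """Resolve a begin/end time: a unix timestamp, or +/- an interval
    measured from `now`."""
    if text.startswith("-"):
        return now - parse_interval(text[1:])
    if text.startswith("+"):
        return now + parse_interval(text[1:])
    return int(text)


if __name__ == "__main__":
    import time
    print(parse_time("-1w", int(time.time())))
\end{verbatim}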
\subsection{Examples: }%
\index{Examples: }
I hope that the above is enough to use it after seeing a few examples.
Here's the equivalent of the command that created the RRD for the example.
\begin{verbatim}
rrdtool create ping.rrd \
    DS:sent:GAUGE:600:0:10 \
    DS:rcvd:GAUGE:600:0:10 \
    DS:min:GAUGE:600:U:U \
    DS:avg:GAUGE:600:U:U \
    DS:max:GAUGE:600:U:U \
    RRA:AVERAGE:0.1:1:600 \
    RRA:AVERAGE:0.1:7:300 \
    RRA:AVERAGE:0.1:30:300 \
    RRA:AVERAGE:0.1:90:300 \
    RRA:AVERAGE:0.1:365:300 \
    RRA:MIN:0.1:1:600 \
    RRA:MIN:0.1:7:300 \
    RRA:MIN:0.1:30:300 \
    RRA:MIN:0.1:90:300 \
    RRA:MIN:0.1:365:300 \
    RRA:MAX:0.1:1:600 \
    RRA:MAX:0.1:7:300 \
    RRA:MAX:0.1:30:300 \
    RRA:MAX:0.1:90:300 \
    RRA:MAX:0.1:365:300
\end{verbatim}
See "man rrdcreate" for an explanation of the command itself. The
fields are:
\begin{itemize}
\item sent/rcvd - number of ping packets sent/received
\item min/avg/max - the round-trip-time (min, average and max) for the pings
\end{itemize}
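The RRA lines in the command above decide how much history each archive
keeps, which in turn limits what a report can show. A small Python sketch
of the arithmetic follows. It assumes the base step is 300 seconds, which
is rrdtool's default when no step is given; the function name is made up
for the example.
\begin{verbatim}
def rra_coverage(rra, step=300):
    """Return (seconds per row, total seconds kept) for an RRA spec.

    `step` is the base update interval in seconds. 300 is rrdtool's
    default and is an assumption about the RRD shown here.
    """
    _, _cf, _xff, steps, rows = rra.split(":")
    per_row = int(steps) * step
    return per_row, per_row * int(rows)


if __name__ == "__main__":
    for spec in ("RRA:AVERAGE:0.1:1:600", "RRA:AVERAGE:0.1:365:300"):
        per_row, total = rra_coverage(spec)
        print(spec, per_row, "s/row,", total / 86400, "days kept")
\end{verbatim}
Under that assumption the finest archive holds 600 rows of 5 minutes,
about 50 hours, while the coarsest holds 300 rows of 365 steps each,
a little over a year.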
Here's a default report from one of my ping RRDs:
\begin{verbatim}%
% rrd-report ping.rrd
[snip]
data 1999-10-25 17:20:49 1999-10-26 17:20:49 10.000000 10.000000 10.000000 9.864444 9.899444 9.934444 41.569556 42.210222 42.850889 45.626889 45.758833 45.890778 50.955111 51.253056 51.551000
data 1999-10-26 17:20:49 1999-10-27 17:20:49 10.000000 10.000000 10.000000 9.934444 9.987819 10.000000 39.536556 41.498103 46.124938 42.166222 45.386889 49.370000 50.955111 52.435532 54.635926
summary 1999-10-19 17:20:49 1999-10-26 17:20:49 10.000000 10.000000 10.000000 9.331111 9.932391 10.000000 38.317778 42.750146 48.318444 41.500556 46.736223 50.265778 49.122444 52.261605 59.867778
[snip]
data 1999-11-17 16:20:49 1999-11-18 16:20:49 10.000000 10.000000 10.000000 8.000000 9.934245 10.000000 36.400000 46.585421 50.000000 41.256667 49.592941 58.640000 49.036667 53.837716 117.240000
summary 1999-11-16 16:20:49 1999-11-18 16:20:49 10.000000 10.000000 10.000000 9.142857 9.929788 10.000000 38.285714 46.482773 50.000000 47.148095 49.503659 51.294286 50.000000 53.124615 65.880952
overall 1999-10-19 17:20:49 1999-11-18 16:20:49 9.978889 9.999918 10.000000 1.323333 9.876767 10.000000 6.194333 45.631272 76.394778 6.516556 48.842746 107.286556 6.971111 54.319770 179.554222\end{verbatim}
Each "data" line is a report for the interval covered by the two
timestamps, (by default one day). The values are the requested
(or in this case all) DS:CF combinations. The "summary" lines are
just reports over a longer interval (by default one week).
The "overall" line is for the whole selected time-period.
Hmm. There's much too much there. What I'd really like to see is
just the interesting stuff. I know how many pings I'm sending
during this period (10), so drop that and just show the average rcvd,
the minimum min, the average avg and the maximum max:
\begin{verbatim}%
% rrd-report -v rcvd:AVERAGE,min:MIN,avg:AVERAGE,max:MAX
data 1999-10-19 17:54:57 1999-10-20 17:54:57 9.820267 38.317778 43.948411 55.716667
data 1999-10-20 17:54:57 1999-10-21 17:54:57 9.966716 39.303333 46.180111 59.867778
data 1999-10-21 17:54:57 1999-10-22 17:54:57 9.907440 40.469000 48.496274 56.022222
data 1999-10-22 17:54:57 1999-10-23 17:54:57 9.977827 40.232333 47.571133 54.475062
[snip]
summary 1999-11-09 16:54:57 1999-11-16 16:54:57 9.950836 39.310056 52.578943 179.554222
data 1999-11-17 16:54:57 1999-11-18 16:54:57 9.934164 36.400000 49.606736 117.240000
summary 1999-11-16 16:54:57 1999-11-18 16:54:57 9.928672 38.285714 49.489729 65.880952
overall 1999-10-19 17:54:57 1999-11-18 16:54:57 9.876767 6.194333 48.842746 179.554222\end{verbatim}
Well, I can figure out when the period ended, so leave out the end-time, and
I don't like seeing all those meaningless (in this case) decimal places, so
how about:
\begin{verbatim}%
% rrd-report -D start,pretty -n %.1lf -v rcvd:AVERAGE,min:MIN,avg:AVERAGE,max:MAX
[snip]
data 1999-11-14 17:27:04 10.0 40.0 49.4 88.7
data 1999-11-15 17:27:04 9.7 21.7 48.0 63.4
data 1999-11-16 17:27:04 9.9 38.3 49.4 61.3
summary 1999-11-09 17:27:04 10.0 39.3 52.6 179.6
data 1999-11-17 17:27:04 9.9 36.4 49.6 117.2
summary 1999-11-16 17:27:04 9.9 38.3 49.5 65.9
overall 1999-10-19 18:27:04 9.9 6.2 48.8 179.6\end{verbatim}
OK. I'd like to see the last year with a one-week interval, with no summaries.
(Setting the report-interval to the same as the summary-interval
drops summaries. You still get an overall line.)
\begin{verbatim}%
% rrd-report -D start,pretty -n %.1lf -v rcvd:AVERAGE,min:MIN,avg:AVERAGE,max:MAX -i 1w -s 1w
data 1998-11-19 09:04:43 NODATA NODATA NODATA NODATA
[snip]
data 1999-02-25 09:04:43 9.9 45.0 55.0 64.2
data 1999-03-04 09:04:43 10.0 43.9 54.5 64.3
[snip]
data 1999-11-11 09:04:43 10.0 39.3 51.7 179.6
data 1999-11-18 09:04:43 9.9 37.4 49.7 169.7
overall 1998-11-19 09:04:43 9.6 0.0 45.3 103.7\end{verbatim}
And for those of us who like to see it on the web:
[You'll just have to look in the web version of this documentation
to see what it looks like.]
%------------------------------------ thanks.pod ---
\chapter{Thank-you's}
\index{Thank-you's}
\section{Thank-you's }%
\index{Thank-you's }
\begin{itemize}
\item Tobias Oetiker
- for MRTG and RRDtool
\item Larry Wall and the rest of the perl hackers
- for making perl into the "swiss-army-chainsaw" of programming languages
\item Vikas Aggarwal
- for multiping from NOCOL, so pinging doesn't take forever
\item Ehud Gavron and others
- for the NANOG (see \textbf{http://www.nanog.org/}) traceroute.
\item Will Maton
- numerous suggestions and encouragement
\item Adam Kennedy
- initial pre-release testing
\item Ken Filipps
- suggestion to brand remstats port-probes
\item Andrew Cochran
- for Telnet.pm open error
- for non interface-MIB interfaces suggestion, e.g. frame-relay
\item Marek Snowarski
- for installation and documentation feedback
\item Matt Duggan
- for suggestions (views and description magic-cookies)
\item Steve Francis
- for suggestions and installation feedback
\item Alexander Reelsen
- for suggesting that the df-* rrd shouldn't use K-bytes
- for the masqueraded connection count code
\item Jon Villarreal
- for pointing out the error in alert time selection
\end{itemize}
I've probably forgotten some others. Please remind me if I've forgotten
your contribution.
\printindex{default}
\end{document}
|