File: todo.pod

remstats 1.00a4-8woody1
=cut

TITLE=todo
DESCRIPTION=to-do list for remstats development
KEYWORDS=todo
DOCTOP=index
DOCPREV=bugs
DOCNEXT=faq
# Last is 134

=pod

=head1 To-Do List for Remstats


=for html <P><HR>

=for text
-------------------------------------------------------------------

=for html <P><HR>

=head1 High Priority

B<134 20010829 [LOW]> - make header_bar (in htmlstuff) do the link making, if available
and fix whatever uses it not to.

B<133 20010829 [LOW]> - add an option to make nt-discover update old hosts with a standard set of RRDs, 
even if the hosts are already known

B<132 20010824 [HIGH]> - BUG: get rid of the spikes in uptime from the unix-status-server

B<131 20010824 [MED]> - make status pages for each host, group and for all hosts using
the new alertstatus and possibly alertvalue.

B<130 20010823 [HIGH]> - add an <RRD::EXEC ...> tag to rrdcgi.  To be used in host index
pages (see 129).

B<128 20010629 [MED,HOLD]> - custom, configuration-supplied info per rrd which is simply available
wherever it makes sense, e.g. in alerts.

- first make sure someone has a use for it.

B<127 20010622 [MED]> - graph data together with historical data.  This
will probably mean either populating another rrd with historical averages, 
temporarily or permanently, or modifying rrdtool.  The former is certainly
simpler to do, given my knowledge of the internals of rrdtool.  However,
it needs to have another rrd for each period?  Need to keep the same data
over some longer period, a multiple of the period of interest, as well as
the averages, from period to period.
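
As a rough illustration of the "populate another rrd with historical
averages" option, the averages could be built by folding several past
periods onto one another, slot by slot.  This is only a sketch in Python,
assuming the samples have already been fetched as (timestamp, value) pairs;
C<period_averages> and its parameters are hypothetical names, not part of
remstats or rrdtool:

```python
from collections import defaultdict
from math import isnan


def period_averages(samples, period, step):
    """Average samples that fall in the same slot of each period.

    samples: iterable of (unix_timestamp, value) pairs
    period:  length of the period of interest in seconds (e.g. 86400)
    step:    slot width in seconds (e.g. 300)
    Returns {slot_offset_seconds: mean of the known values in that slot}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ts, value in samples:
        if value is None or isnan(value):
            continue  # unknown data points do not contribute
        slot = (ts % period) // step * step
        sums[slot] += value
        counts[slot] += 1
    return {slot: sums[slot] / counts[slot] for slot in sums}
```

The result holds one average per slot of the period, which is the data
that would be graphed alongside the current period.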

B<122 20010330 [HIGH]> - rrd prog-* which tells if a particular named process is running, using
the ps section of the unix-status-collector.

B<121 20010202 [MED,HOLD]> - how about a discovery program, to find and identify hosts and
then run the appropriate new-xxx-hosts scripts to add them?

DONE 20010608 - nt-discover to find and add NT boxen

B<115) 20001229 [HIGH]> - need docs on errors, specifically on what happens
when run-remstats kills a collector for taking too long, and on where to
find the output of the killed collector.

B<112) 20001212 [LOW,HOLD]> - web-based remstats configurator.  Needs
to consider security, at least from the point of view that you don't 
want to lose your configuration.  The most important part is hosts.
A lot of the rest doesn't have to be changed, or only once.

B<111) 20001212 [LOW,HOLD]> consider grafting on (at least links to) 
some kind of system configuration interface. For configuring the
monitored entities, not remstats.

B<110) 20001212 [LOW,HOLD]> consider problem-fixing interface.  It'd be nice to
try to fix things if there is a known way to do so.  A simple kludge
would be to add another method to the alert-destination-map which
deals with problems that it knows about, possibly invoking plugins for
specific alerts.

B<109) 20001212 [MED]> nt-log-collector, with modules for event-logs
and ntmail logs.

B<70) 20000407 [HIGH]> CGI scripts need to have a way to deal with
alternate config-files, and graph-writer needs to tell them if they
can't work it out themselves.  Otherwise, people need to be told to
do multiple installs of the CGI scripts, which might be the best way.

	make install-cgis CONFIGDIR=config-xxx

Not that painful, but wasteful, and it makes upgrades messier.

- I don't like the multiple-install method, but any other method needs 
a way of getting configuration information into the CGI scripts.  Any
method which passes info in via the URL or form fields is out: too unsafe.
The only other method I can think of is to read a configuration file in the
same directory as the CGI script.  This ought to be safe from modification,
or your web-site is waiting to be mutilated.  The other part to consider is
whether any part of the info in the CGI config-file is sensitive, i.e. whether
we have to protect it in some way.

- A configuration file in the same directory won't work either; you'd still
have to install the CGIs multiple times.  I'm starting to think that
multiple installations may be the only safe thing to do.

B<99) 20000619 [HIGH]> make unix-status-collector send the directories
that we want df for and make unix-status-server do "df /dir1 /dir2"
to get them, and pull them off one line at a time.  This is to deal with
things like disconnected NFS-mounted directories hanging df when we do
just a bare "df".
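
A minimal sketch of the server side of this, in Python rather than the
Perl of the real unix-status-server.  It runs df on the named directories
only and parses the result one line at a time.  The function names are
hypothetical, and the timeout is an extra assumption that the item above
does not ask for:

```python
import subprocess


def parse_df(output):
    """Parse `df -P` output into {mount_point: (total_kb, used_kb, avail_kb)}."""
    result = {}
    for line in output.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 6:
            continue  # not a data line
        total, used, avail = (int(f) for f in fields[1:4])
        result[" ".join(fields[5:])] = (total, used, avail)
    return result


def df_for(directories, timeout=10):
    """Run df on the named directories only, giving up after `timeout` seconds."""
    proc = subprocess.run(
        ["df", "-P", "-k", *directories],
        capture_output=True, text=True, timeout=timeout,
    )
    return parse_df(proc.stdout)
```

C<df_for(["/dir1", "/dir2"])> raises C<subprocess.TimeoutExpired> if df does
not finish in time.  The results are keyed by mount point, which may differ
from the directory that was asked for.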

B<86) 20000419 [HIGH]> trends analysis

B<87) 20000419 [HIGH]> alerts based on trends analysis and historical
data, like one-week average and standard-deviation, ... (for Steve)
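
One simple form such an alert could take is a check of the current value
against the mean and standard deviation of the history.  The sketch below
is Python and purely illustrative; C<deviates> and the default of three
standard deviations are assumptions, not an existing remstats alert type:

```python
from statistics import mean, stdev


def deviates(history, current, n_sigma=3.0):
    """Return True if `current` is more than n_sigma standard deviations
    away from the mean of `history` (e.g. one week of samples)."""
    if len(history) < 2:
        return False  # not enough data to estimate a deviation
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return current != mu  # flat history: any change is a deviation
    return abs(current - mu) > n_sigma * sigma
```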

B<106) 20000922 [MEDIUM]> make a file-collector.  Similar to the
log-collector, only for small, local files.  Slurp the file into 
memory, match patterns and pull out values.  The data line in an
rrd definition would be like:

	source file
	data VARNAME	GAUGE:600:0:U FUNCTION PATTERN(WITH)PARENS

In fact, this would share so much code with the log-collector that it
might be worth combining the two.  This allows collection from things
like Linux's /proc.
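
The slurp, match and extract steps might look roughly like this.  It is a
Python sketch, not the collector itself; the FUNCTION names (C<first>,
C<last>, C<sum>, C<count>) are guesses at what would be useful, since the
item does not list them:

```python
import re

# Hypothetical FUNCTION names; the to-do item does not say which ones exist.
FUNCTIONS = {
    "first": lambda values: values[0],
    "last": lambda values: values[-1],
    "sum": sum,
    "count": len,
}


def collect_from_text(text, specs):
    """Apply each (varname, function, pattern) spec to the slurped text.

    The first parenthesised group of `pattern` captures the numeric value.
    Returns {varname: value}, with None where the pattern never matched
    (except for "count", which is then 0).
    """
    results = {}
    for varname, function, pattern in specs:
        values = [float(m.group(1))
                  for m in re.finditer(pattern, text, re.MULTILINE)]
        if values or function == "count":
            results[varname] = FUNCTIONS[function](values)
        else:
            results[varname] = None
    return results


def collect_from_file(path, specs):
    """Slurp a small local file into memory and extract the values."""
    with open(path) as handle:
        return collect_from_text(handle.read(), specs)
```

For example, C<collect_from_file("/proc/meminfo", [("memfree", "first", r"^MemFree:\s+(\d+)")])>
would pull the free-memory figure on a Linux host.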

B<98) 20000619 [MEDIUM]> add group index files and store hosts under
group directories.  For easier application of access-controls. (for Florian)

B<2) ???????? [MEDIUM-INPROGRESS]> make rrd munger, like copyrrd was supposed to be.
Use dump/restore and process the XML form (rrddump-munger).
What functions do we need?  Make one script for each function.

=over 4

=item  add a DS (less important, as we can just make a new rrd)

=item  remove a DS (less important, as we can ignore it)

=item  add an archive

=item  extend an archive

=item  change CF of an archive

=item  remove an archive

=item  filter data within an archive 

=over 4

=item  change NaN to number/max/min

=item  change ># to NaN/max/min

=item  change <# to NaN/max/min

=back

=back

=for text
-------------------------------------------------------------------

=head1 Lower Priority

B<102) 20000912 [LOW]> add see-also to host config, which will
materialize links in the host header.  Config line like:

	seealso	host:xyzzy http://www.somewhere ftp://ftphost

the special "host:" pseudo-URL gets changed to a link to the
remstats page for that host.
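
A small Python sketch of how such a line could be turned into links.  The
C<host_page> URL template is an assumption; the real location of a host's
remstats page depends on how the pages are laid out:

```python
from html import escape


def seealso_links(line, host_page="../{host}/index.html"):
    """Turn a `seealso` config line into a list of HTML links.

    `host_page` is an assumed URL template for a host's remstats page.
    """
    fields = line.split()
    if not fields or fields[0] != "seealso":
        raise ValueError("not a seealso line")
    links = []
    for target in fields[1:]:
        if target.startswith("host:"):
            # the special pseudo-URL: link to that host's own page
            host = target[len("host:"):]
            url, label = host_page.format(host=host), host
        else:
            url = label = target
        links.append('<a href="%s">%s</a>'
                     % (escape(url, quote=True), escape(label)))
    return links
```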

B<103) 20000915 [MEDIUM]> make-path doesn't work with non-FQDN hosts.
Make it read the configuration, so it can look up the IP number in
the host config and use that if it's defined.  Otherwise, default to
gethostbyaddr.

B<107) 20000922 [MEDIUM]> extra status header lines for hosts, from
specified STATUS files created by the various collectors.  Add
lines to host definition like:

	extrastatus "STATUS DESCRIPTION" STATUS-FILE-NAME

B<60) 20000328 [MEDIUM]> replace route-collector with something which
scales.  SNMPwalking bgp4PathAttrBest doesn't scale to large Internet
routers with 400 peers, taking over an hour to complete. (see also 61)

- look at a script to follow the output of zebra.  That's a lot of
overhead though.  Easy if zebra is solid.

- How difficult can it be to make a native BGP listener?  I'm not clear on
the protocol, but it doesn't look too bad.

B<45) 20000121 [MEDIUM]> make snmp-collector send only one packet per host

- test and make sure that we do get back whatever succeeded.  I vaguely
remember that it didn't work.  [Later: at least under UCD snmp under linux,
if an item isn't implemented in the MIB, you get back NOTHING.  Specifically,
look for the non-unicast packet counters as well as something else; you get 
nothing back.  This isn't good.]

- have to re-write snmp-collector completely, which isn't that bad an idea.
This means a two-pass structure.  On pass one, we construct the complete query
and then send it.  On pass two, we examine all the results and format them.

B<9) ???????? [MEDIUM-TESTING]> make alerts take connectivity dependence into account

- add "via" line to host section to deal with hubs and switches [DONE]

- I think it's done.  See what happens next outage.

B<42) 20000114 [MEDIUM]> snmp-collector mod to allow summary data to be collected
from a walk and then filtered down to a single data-point.  E.g. specify an rrd "oid"
like:

	walk    count ifOperStatus = 1

would produce a count of the number of interfaces on that device that
were active (i.e. had a live device plugged into them).  Or a similar one
would let you count BGP routes, or arp addresses, ...  

- Unfortunately, from experience with the snmp-route-collector, this is
going to be slow for anything with a large number of items.
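
The filtering half of this is simple once the walk has been done.  A
Python sketch, assuming the walk results are already in a dictionary; only
the C<=> comparison comes from the example above, the other operators are
assumptions:

```python
import operator

# Only "=" appears in the to-do item; the rest are assumed extensions.
OPERATORS = {
    "=": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    ">": operator.gt,
}


def walk_count(walked, op, value):
    """Count the walked values that satisfy `op value`.

    walked: {oid_instance: value} as returned by some SNMP walk (not shown)
    """
    compare = OPERATORS[op]
    return sum(1 for v in walked.values() if compare(v, value))
```

With C<ifOperStatus> values of 1, 2 and 1 for three interfaces,
C<walk_count(walked, "=", 1)> gives 2.  The slow part remains the walk
itself, which this does nothing to help.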

B<43) 20000114 [MEDIUM]> parallelizing the collectors, at least on a 
group basis, preferably host or group.

- collectors must accept C<-G> and C<-H> flags to request processing of
the specified group or host, respectively.  Run-remstats needs to fork 
extra processes according to a config-file line, "parallel group" or 
"parallel host".

- 20010831 TEE - implemented -H flags for all collectors except for the
remoteping-collector, which I'm not using anyway right now.
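
A sketch of the run-remstats side, in Python.  It uses a thread pool that
launches one collector process per host or group, rather than forking
directly as described above, and C<max_workers> and C<timeout> are assumed
knobs, not existing config-file settings:

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def collector_commands(collector, hosts=(), groups=()):
    """Build one command line per host (-H) or per group (-G)."""
    commands = [[collector, "-H", host] for host in hosts]
    commands += [[collector, "-G", group] for group in groups]
    return commands


def run_parallel(commands, max_workers=4, timeout=300):
    """Run the commands concurrently and return their exit codes in order."""
    def run(argv):
        return subprocess.run(argv, timeout=timeout).returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run, commands))
```

A "parallel host" line would then amount to
C<run_parallel(collector_commands(name, hosts=all_hosts))> for each
collector.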

B<51) 20000216 [LOW]> need a way to specify URL for port-http. The root page 
doesn't always exist.

B<37) 19991216 [LOW]> traceroute sometimes shows incorrect routing, which 
confuses the topology-monitor, causing false positives

B<50) 20000215 [LOW]> make inventory script.  Runs uname 
(for hardware and software), C<ifconfig -a>, C<netstat -nr>, C<hostname>
and any others I can think of to collect configuration info.  
Then figures out the versions of important software, e.g. run C<perl -v>,
C<gcc -v ...>.  Make a subdir to put it in and make a tool definition to get it
onto the host pages.

- looks like the beginning of a discovery script.

B<62) 20000329 [LOW]> make different markers for different levels
of alert on quick-index.

B<69) 20000406 [LOW]> is there any use for write_environment in 
check-config?

B<97) 20000616 [LOW]> make port-collector or check-config complain about
having a script with ok/warn/error/critical patterns but no send string.
The port-collector will ignore patterns unless there is a send string.
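
The check itself is small.  A Python sketch, assuming the scripts have
already been parsed into a dictionary; that layout and the function name
are hypothetical, not the real check-config data structures:

```python
PATTERN_KEYS = ("ok", "warn", "error", "critical")


def scripts_missing_send(scripts):
    """Return names of scripts that define status patterns but no send string.

    scripts: {name: {"send": ..., "ok": ..., "warn": ..., ...}} (assumed layout)
    """
    return sorted(
        name for name, script in scripts.items()
        if any(script.get(key) for key in PATTERN_KEYS)
        and not script.get("send")
    )
```

check-config would print a complaint for each name returned.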

=for html <P><HR>

=for text
-------------------------------------------------------------------

=head1 On Hold

Usually waiting for next major release, or trapped by something else.
(in priority order)

B<40) 20000104 [MEDIUM-HOLD]> consider some form of access-control for servers

- hash-based "password"

- ssl tunneling ought to work for everything except SNMP

- what does this buy?  With the various servers run under tcp_wrappers
an attacker must either gain access to the remstats collector 
machine or spoof a tcp session from them.  If you've been "owned" 
you've got bigger problems.  If the attacker spoofs a session with 
a remstats server, tcp_wrappers will insist that it must come from 
one of the allowed hosts, so that's where the stolen output will go.  
This is only useful to the attacker if they have access to the 
remstats collector machine or if they can sniff the traffic between 
the collector and the server.  The only data loss possible is with
the log-server which keeps state.  (Ignoring DOS attacks which are
always a problem.)

- unless someone needs this, it's on hold

B<6) ???????? [LOW-NEEDS:2-HOLD]> increase CA3 resolution

- need rrd munger (2)

B<10) ???????? [LOW-INPROGRESS-HOLD]> make graph of connectivity 

B<13) ???????? [LOW-INPROGRESS-HOLD]> snmp trap listener to update status files

- needs filter to be useful [DONE]

- I haven't seen any useful traps so this is on hold.

B<14) ???????? [LOW-NEEDS:2-HOLD]> make rrd structural changes in config file 
get applied to the rrds.

- some taken care of with snmpif-setspeed, but need a more general solution

- look at new XML output of rrddump

B<39) ???????? [LOW-HOLD]> make RRD dumper, to put data out in a form that can 
be loaded into a database

- I don't need it, per se, but it might be easier than writing the 
availability report generator.

B<52) 20000215 [LOW]> make a makegraph.cgi, or whatever, that will let you 
make a somewhat custom graph on the fly.  makegraph.cgi by itself will list 
all the hosts and let you choose one.  makegraph.cgi?host=xxx will list 
all the RRDs for this host and let you choose ?one?.  
makegraph.cgi?host=xxx&rrd=yyy will list the various DSs for this RRD and 
let you choose the ones you want.  Then you get to define any CDEFs needed 
and then LINEn/AREA/STACK for each DEF or CDEF desired.  And size, title, 
legends...

- On hold since L<graph.cgi|graph-cgi> will let you get at any existing graph you want.
If I find a use or need for this, I'll re-activate it.

B<92) 20000518 [HIGH]> collect traffic info from cflowd (artsportms).
Make it flexible enough that it can let you choose which ports you
want (one per rrd?).  Make a loader to load historical data.

- [DONE 20000524] artsportms-loader done

- I no longer have access to devices with this feature

=for html <P><HR>

=for text
-------------------------------------------------------------------

I've also kept the stuff that used to be here, but has already been
L<done|DONE>.