1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413
|
This is the beta test version 0.9.3 of the INN feeder program `innfeed.'
It is written in ANSI C and tries to be POSIX.1 compliant. The features of
it are:
1. Handles the STREAM extenstion to NNTP.
2. Will open multiple connections to the remote host to parallel
feed.
3. Will handle multiple remote hosts.
4. Will tear down idle connections.
5. Runs as a channel/funnel feed from INN, or by reading a funnel
file.
6. Will stop issuing CHECK commands and go straight to TAKETHIS if
the remote responds affermatively to enough CHECKs.
7. It will go back to issuing CHECKs if enough TAKETHIS commands fail.
Read the rest of this file and then read and apply the instructions in the
the INSTALL file.
Changes for 0.10.1
1. Config file inclusion now works via the syntax:
$INCLUDE pathname
Config files can be included up to a nesting depth of 10. Line
numbers and file names are not properly reported yet on errors
when includes are used, though.
2. Signal handling is tidied up a bit. If your compiler doesn't
support the ``volatile'' keyword, then see the comment in
sysconfig.h.
3. If you have a stdio library that hash limit on open files lower
then the process limit for open plain files (all flavours of
SunOS), then a new config file variable ``stdio-fdmax'' can be
used to give that upper bound. When set, all new network
connections will be limited to file descriptors over this value,
leaving the lower file descriptors free for stdio. See
innfeed.conf(5) for more details. Remember that the config file
value overrides any compiled in value.
Changes for 0.10
1. A major change has been made to the config file. The new format
is quite extensible and will let new data items be added in the
future without changing the basic format. There's a new option
``-C'' (for ``check'') that will make innfeed read the config
file and report on any errors and then exit. This will let you
verify things before locking them into a newsfeeds file entry. A
program has been added ``convertconfig.pl'' that will read your
old config off the command line (or stdin), and will write a new
version to stdout.
The new config file structure permits non-peer-specific items to
be declared (like the location of the status file, or whether to
wrap the generated status file in HTML). This is part of the
included sample:
use-mmap: true
news-spool: /var/news/spool/articles
backlog-directory: /var/news/spool/innfeed
pid-file: innfeed.pid
status-file: innfeed.status
gen-html: false
log-file: innfeed.log
backlog-factor: 1.10
connection-stats: false
max-reconnect-time: 3600
so only option you'll probably need now is the ``-c'' option to
locate the config file, and as this is also compiled in, you may
not even need that.
See the innfeed.conf(5) man page for more details on config file
structure.
2. The backlog file handling is changed slightly:
- The .output file is always kept open (until rotation time).
- The .output file is allowed to grow for at least 30
seconds (or the value defined by the key
backlog-rotate-period in the config file). This prevents
thrashing of backlog files.
- The hand-prepared file is checked for only every 600
seconds maximum (or the value defined by the key
backlog-new), not every time the files are rotated.
- The stating of the three backlog files is reduced
dramatically.
3. Signal handling is changed so that they are more synchronous
with other activity. This should stop the frequent core-dumps
that occured when running in funnel file mode and sending
SIGTERM or SIGALRM.
4. A bug related to zero-length articles was fixed. They will now be
logged.
5. More information is in the innfeed.status file, including the
reasons return by the remote when it is throttled.
6. SIGEMT is now a trigger for closing and reopening all the
backlog files. If you have scripts that need to fool with the
backlogs, then have the scripts move the backlogs out of the way
and then send the SIGEMT.
Changes for 0.9.3
1. If your system supports mmap() *and* if you have your articles
stored on disk in NNTP-ready format (rare), then you can have
innfeed mmap article data to save on memory (thanks to Dave
Lawrence). There is an important issue with this:
if you try to have innfeed handle too many articles (by
running many connections and/or high max-check values in
innfeed.conf) at once, then your system (not your process)
may run out of free vnodes (global file descriptors), as a
vnode is used as long as the file is mmaped. So be careful.
If your articles are not in NNTP format then this will be
noticed and the article will be pulled into memory for fixing up
(and then immediately munmap'd). You can disable use of MMAP if
you've built it in by using the '-M' flag. I tried mixing
mmap'ing and articles not in NNTP format and it was a real
performance loss. I'll be trying it differently later.
2. If innfeed is asked to send an article to a host it knows
nothing about, or which it cannot acquire the required lock for
(which causes the "ME locked cannot setup peer ..." and "ME
unconfigured peer" syslog messages), then innfeed will deposit
the article information into a file matching the pattern
innfeed-dropped.* in the backlog directory (TAPE_DIRECTORY in
config.h). This file will not be processed in any manner -- it's
up to you to decide what to do with it (wait for innfeed to exit
before doing anything with it, or send innfeed a SIGHUP to get
it to reread its config file, which will roll this file).
4. The output backlog files will now be kept below a certain byte
limit. This happens via the ``-e'' option. If, after writing to
an output file, the new length is bigger than the given limit
(multiplied by a fudge factor defined in config.h -- default of
1.10) then the file will be shrunk down to this size (or slightly
smaller to find the end of line boundary). The front of the file
will be removed to do this. This means lost articles for the
entries removed.
3. A SIGHUP will make the config be reloaded.
4. The .checkpoint files have been dropped in favour of scribbling
the offset into the input file itself.
5. When the process exits normally a final syslog entry covering
all of the peers over the life of the process is written. It
looks like:
Jan 12 15:51:53 data innfeed.tester[24189]: ME global
seconds 2472 offered 43820 accepted 10506
refused 31168 rejected 1773 missing 39
6. SIGALARM now rolls the input file, rather than the log
file. This is useful in funnel file mode when you move the input
file and tell innd to flush it, then send innfeed the signal.
7. The location of the pid file, config file and status file, can
now be relative, in which case they're relative to the backlog
directory.
8. stdin stdout and stderr are initialized properly when innfeed is
started by a process that has closed them.
9. Various values in config.h have changed (paths to look more like
values used in inn 1.5 and others to support point #7 above
more easily.)
10. procbatch.pl can now 'require' innshellvars.pl that comes with
1.5. The default is not to. You nead to do a one line tweak if
you want it to. The defaults in procbatch.pl match the new
defaults of innfeed.
11. Core files that get generated on purpose will be done so in
CORE_DIRECTORY (as defined in config.h), if that is defined to a
pathname. If CORE_DIRECTORY is defined to be NULL (the default
now), then the core will be generated in the backlog directory (as
possibly modified by the '-b' option).
Changes for 0.9.2
1. Now includes David Lawrence's patches to handle funnel files.
2. EAGAIN errors on read and writes are caught and dealt with (of
interest to Solaris `victims').
3. It is now much faster at servicing the file descriptor attached
to innd. This means it is faster at recognising it has been
flushed and at dropping connections. This means fewer
conflicts with new innfeeds starting before the old one has
finished up. It is still a good net-citizen and it finishes the
commands already started, so the fast response is only as fast
as your slowest peer, but it no longer tries to send
everything it had queued internally, and locks get released much
quicker.
4. Includes Michael Hucka's patch to make the innfeed.status output
neater.
5. Includes Andy Vasilyev's HTML-in-innfeed.status patch (but you
have to enable it in config.h).
6. Added a '-a' (top of news spool) and a '-p' (pid file path)
option.
Changes for 0.9
1. Format of innfeed.conf file changed slightly (for per-peer
streaming specs).
2. Including Greg Patten's innlog.pl (taken from
ftp://loose.apana.org.au/pub/local/innlog/innlog.pl)
3. Added Christophe Wolfhugel's patch to permit a per-peer
restriction on using streaming.
4. More robust handling of peers that return bad responses (no long
just calling abort).
Changes for 0.8.5
1. Massive syslog messages cleanup courtesy of kre.
2. The innlog.awk-patch hash been dropped from the distribution
until the new syslog messages are dealt with.
3. State machine more robust in the face of unexpected responses
from remote. Connection gets torn down and bad response's
logged.
4. The fixed timers (article reception timeout, read timeout,
and flush timeout) are all adjusted by up to +/-10% so that
things aren't quite so synchronised.
5. The innfeed.status file has been expanded and reformatted to
include more information.
Changes for 0.8.4
1. A change in the handing off of articles to connections in order to
encourage connections that were opened due to activity spikes,
to close down sooner.
2. The backlog files are no longer concatenated together at process
startup, but the .input is simply used if it exists, and if not
then the hand-dropped file is used first and the .output file
second.
3. The innfeed.status is no longer updated by a innfeed that is in
its death-throws.
4. Specifically catch the 480 response code from NNRPD when we try
to give it an IHAVE.
5. The connection reestablishment time gets properly increased when
the connection fails to go through (up to and including the
reading of the banner message).
6. Bug fix that occasionally had articles sit in a connection and
never get processed.
7. Bug fix in the counter of number of sleeping connections.
8. Bug fix in config file parsing.
9. Procbatch.pl included.
Changes for version 0.8.1
1. various bug fixes.
2. core files generated by ASSERT are (possibly) put in a seperate
directory to ease debugging are
Changes for version 0.8
1. The implicit state machine in the Connection objects has been
made explicit.
2. Various bug fixes.
Changes for version 0.7.1
1. Pulled the source to inet_addr.c from the bind distribution.
(Solaris and some others don't have it).
Changes for version 0.7
1. The backlog file mechanism has been completely reworked. There are
now only two backlog files: one for output and on for input. The
output file becomes the input file when the input file is
exhausted.
2. Much less strenuous use of writev. Solaris and other sv4r
machines have an amazingly low value for the maximum number of
iovecs that can be passed into writev.
3. Proper connection cleanup (QUIT issued) at shutdown.
4. A lock is taken out in the backlog directory for each peer. To feed
the same peer from two different instances of innfeed (with a
batch file for example), then you must use another directory.
5. Creating a file in the backlog directory with the same name as the
peer, the that file will be used next time backlog files are
processed. Its format must be:
pathname msgid
where pathname is absolute, or relative to the top of the news
spool.
6. More command line options.
7. Dynamic peer creation. If the proper command line option is
used (-y) and innfeed is to told to feed a peer that it doesn't
have in its config file, then it will create a new binding to
the new peer. The ip name must be the same as the peername,
i.e. if innd tells innfeed about a peer fooBarBat, then
gethostbyname("fooBarBat") better work.
8. Connections will be periodically torn down (1 hour is the
default), even if they're active, so that non-innd peers don't
have problems with their history files being kept open for too
long.
9. The input backlog files are checkpointed every 30 seconds
so that a crash while processing a large backlog doesn't require
starting over from the beginning.
Changes for version 0.6
1. Logging of spooling of backlog only happens once per
stats-logging period.
Bugs/Problems/Notes etc:
1. There is no graceful handling of file descriptor exhaustion.
2. If the following situation occurs:
- articles on disk are NOT in NNTP-ready format.
- innfeed was built with HAVE_MMAP defined.
- memory usage is higher than expected
try running innfeed with the '-M' flag (or recompiling with
HAVE_MMAP undefined). Solaris, and possibly other SVR4 machines,
waste a lot of swap space.
3. On the stats logging the 'offered' may not equal the sum of the
other fields. This is because the stats at that moment were
generated while waiting for a response to a command to come
back. Innfeed considers an article ``offered'' when it sends the
command, not when it gets a response back. Perhaps this should
change.
4. If all the Connections for a peer are idle and a new backlog file
is dropped in by hand, then it will not be picked up until the
next time it gets an article from innd for that peer. This will
be fixed in a later version, but for now, if the peer is likely
to be idle for a long time, then flush the process.
5. Adding a backlog file by hand does not cause extra Connections to
be automatically created, only the existing Connections will use
the file. If the extra load requires new Connections to be built
when innd delivers new articles for tranmission, then they too
will use the file, but this a side effect and not a direct
consequence. This means if you want to run in '-x' mode, then
make sure your config file entry states the correct number of
initial connections, as they're all the Connections that will be
created.
6. If '-x' is used and the config file has an entry for a peer that
has no batch file to process, then innfeed will not exit after
all batch files have been finished--it will just site there idle.
7. If the remote is running inn and only has you in the nnrp.access
file, then innfeed will end up talking to nnrpd. Innfeed will
try every 30 seconds to reconnect to a server that will accept
IHAVE commands. i.e. there is no exponential back of retry
attempt. This is because the connection is considered good once
the MODE STREAM command has been accepted or rejected (and nnrpd
rejects it).
Futures:
1. Innfeed will eventually take exploder commands.
2. The config file will be revamped to allow for more global
options etc and run-time configuration. Too much is compile-time
dependant at the moment.
3. The connection retry time will get more sophisticated to catch
problems like the nnrpd issue mentioned above.
4. Include the number of takesthis/check/ihave commands issued in
the log entries.
5. Heaps more stuff requested that's buried in my mail folders.
Any praise, complaints, requests, porting issues etc. should go to me:
brister@vix.com.
Many thanks to the following people for extra help (above and beyond the
call of duty) with pateches, beta testing and/or suggestions:
Christophe Wolfhugel <wolf@pasteur.fr>
Robert Elz <kre@munnari.oz.au>
Russell Vincent <vincent@ucthpx.uct.ac.za>
Paul Vixie <paul@vix.com>
Stephen Stuart <stuart@pa.dec.com>
John T. Stapleton <stapes@mro.dec.com>
Alan Barrett <apb@iafrica.com>
Lee McLoughlin <lmjm@doc.ic.ac.uk>
Dan Ellis <ellis@mail.microserve.net>
Katsuhiro Kondou <kondou@uxd.fc.nec.co.jp>
Marc G. Fournier <scrappy@ki.net>
Steven Bauer <sbauer@msmailgw.sdsmt.edu>
Richard Perini <rpp@ci.com.au>
Per Hedeland <per@erix.ericsson.se>
Clayton O'Neill <coneill@premier.net>
Dave Pascoe <dave@mathworks.com>
Michael Handler <handler@netaxs.com>
Petr Lampa <lampa@fee.vutbr.cz>
David Lawrence <tale@uu.net>
Don Lewis <Don.Lewis@tsc.tdk.com>
Landon Curt Noll <noll@sgi.com>
If I've forgotten anybody, please let me know.
Thanks also to the ISC for sponsoring this work.
|