File: README.READER


			   DIABLO READER SUPPORT

    Diablo now has a reader module, dreaderd, which operates independently
    from the feeder.  It implements the full NNTP command set and is
    designed to maintain overview and an article cache based on a single 
    full or header-only feed supplied by a remote entity.  If a full feed
    is supplied, the reader module throws the body away since the feed is only
    used to generate overview records.  The reader module does NOT maintain
    a history file, so you should only give it a single feed.  If a header-only
    feed is used, it must include a Bytes: header in every article and must
    include the entire article body for control messages.
    The Diablo feeder is capable of outputting header-only feeds.  It is also
    possible to run the Diablo feeder on the same box, allowing the box to
    take multiple feeds and having the feeder deal with history issues and
    then push a single feed to the reader.

    You can supply a single, normal feed to the reader rather than a
    header-only feed.  Thus, the reader can also be fed from other news 
    systems.  

    THE READER CAN NOW DETECT DUPLICATE ARTICLES IN XREF SLAVE MODE!  That
    is, when you run dreaderd with the '-x xrefhost' option dreaderd will be
    able to detect duplicate articles insofar as storing them to its overview
    database goes.  Duplicate control messages will still be executed
    multiple times, however.  In a multi-level news system this allows
    the leaf to take more than one feed.  We do not recommend taking more
    than three feeds.  This should be used solely for purposes of
    redundancy when several machines are feeding you the *same* set of 
    XRef'd articles.
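
    The duplicate handling described above can be pictured with a minimal
    Python sketch.  It is not dreaderd's implementation; it only assumes
    that the (group, article number) pairs in the Xref: header identify an
    overview record, and that records whose Xref: host differs from the
    configured xrefhost are ignored.  The class name, method names and
    host names are hypothetical.

```python
def parse_xref(xref_value):
    """Split an Xref: header value into (host, [(group, artno), ...])."""
    host, *locations = xref_value.split()
    pairs = []
    for loc in locations:
        group, _, number = loc.rpartition(":")
        pairs.append((group, int(number)))
    return host, pairs


class OverviewIndex:
    """Toy stand-in for the overview database, keyed by (group, artno)."""

    def __init__(self, xrefhost):
        self.xrefhost = xrefhost
        self.seen = set()

    def store(self, xref_value):
        """Return the pairs that were newly stored; duplicates are skipped."""
        host, pairs = parse_xref(xref_value)
        if host != self.xrefhost:
            # Assumption: numbering from a different master is not trusted.
            return []
        fresh = [p for p in pairs if p not in self.seen]
        self.seen.update(fresh)
        return fresh


index = OverviewIndex("master.example.com")
first = index.store("master.example.com comp.lang.c:101 comp.misc:57")
again = index.store("master.example.com comp.lang.c:101 comp.misc:57")
print(first)  # both pairs are stored on first arrival
print(again)  # the second copy stores nothing
```

    Note that the sketch only covers overview storage.  As stated above,
    duplicate control messages are still executed once per copy.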

    The reader uses the headers from the feed to generate and maintain its
    own overview database in the /news/spool/group/ directory, which should
    generally be a >20G partition for a full feed.  Article bodies are
    fetched from one or more remote spools and can optionally be cached
    locally in /news/spool/cache/ so re-retrievals do not continue to load
    the spool.  An alternate methodology is to turn OFF the reader's cache
    and run the Diablo feeder on the same platform, using it to filter
    duplicates and to act as a first level cache.  In this case the reader
    points its first level remote spool to the feeder running on the local
    box.  Another alternative is to simply not have the reader cache articles
    at all, instead always retrieving them from a remote host.

    Please refer to the samples/ directory for a typical feeder+reader
    setup.

    The reader does not use a history file, but it does use an active file,
    dactive.kp (which is a KP database).  Only articles matching groups listed 
    in the active file are stored to the overview directory hierarchy.

    The overview is stored in a two-tiered directory hierarchy, with the first
    tier labeled "00/" through "ff/" in hex.  dreaderd will create these
    subdirectories as required.  Each subdirectory contains an "over.*" file
    for each group and zero or more "data.*" files for each group.  The files
    are named by the history hash function on the group name.  The "over.*"
    files are fixed binary-format files containing article reference pointers
    to the "data.*" files for that group.  dexpireover's job is to maintain
    the binary format files and to delete out-of-date "data.*" files.  Each
    "data.*" file contains the headers for a number of articles relating to a
    particular group. This number of articles is dynamically adjusted by
    dexpireover, depending on how many articles are in the group. The more
    articles a group has, the larger this fixed index will be. If a group
    reaches its 'maxarts' number (the size of the index), older articles
    will be overwritten until dexpireover adjusts the 'maxarts' value. The
    starting, minimum and maximum values for 'maxarts' can be adjusted
    with options in dexpire.ctl (see the sample for details). dexpireover
    is only able to delete whole data.* files.
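
    The effect of a fixed-size index can be illustrated with a small Python
    sketch.  The on-disk layout of the "over.*" files is not described
    here, so the slot rule used below (article number modulo maxarts) is
    an assumption chosen only to show why older articles get overwritten
    once a group holds more than 'maxarts' articles.  GroupIndex and its
    methods are hypothetical names.

```python
class GroupIndex:
    """Fixed-size per-group index with 'maxarts' slots."""

    def __init__(self, maxarts):
        self.maxarts = maxarts
        self.slots = [None] * maxarts

    def add(self, artno, header_ref):
        # Assumed slot rule: a new article reuses the slot of the article
        # numbered 'maxarts' below it, overwriting that older entry.
        self.slots[artno % self.maxarts] = (artno, header_ref)

    def lookup(self, artno):
        entry = self.slots[artno % self.maxarts]
        if entry is not None and entry[0] == artno:
            return entry[1]
        return None  # never stored, or overwritten by a newer article


idx = GroupIndex(maxarts=4)
for n in range(1, 7):
    idx.add(n, f"hdr-{n}")

print(idx.lookup(1))  # overwritten by article 5, so nothing is found
print(idx.lookup(5))
```

    With maxarts set to 4 and six articles stored, articles 1 and 2 are no
    longer reachable, which is the situation dexpireover avoids by growing
    'maxarts' for busy groups.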

				FEED/READER TOPOLOGIES

    (A) SPOOL+READER BOX (spool maintained by Diablo feeder)

	* Feeder (dbin/diablo) and Reader (dbin/dreaderd) run on same box.

	* article caching turned off on reader side

	* large spool configured (with 'reader' mode expiration) for the feeder

    (B) FEED+READER BOX (tiny spool, optional article caching)

	* Feeder (dbin/diablo) and Reader (dbin/dreaderd) run on same box.

	* Feeder is used to allow the box to take multiple feeds, but only a
	  small (8Gish or less) spool is maintained.  The feeder feeds the
	  reader on the same box with a single internal feed.

	* article caching may or may not be turned on in the reader.  Note
	  that the reader *can* use the local feeder's spool as a first or
	  second level cache if it wishes, but due to the small size of the
	  spool this may not be particularly effective.

	* Large spool configured on some remote box.

    (C) READER-ONLY BOX (no spool, optional article caching)

	* Only the reader is run on the box

	* Box may take only a single feed if not in XRef-slave mode.  The 
	  feed may not contain duplicate articles.

	* Box may take limited duplicate feeds if in XRef-slave mode 
	  (-x xrefhost) option to dreaderd.  The feed may contain duplicate
	  articles but beware that duplicate control messages are executed
	  multiple times.  It is not suggested that readers in this 
	  configuration take more than 3 feeds.

	* Article caching may or may not be turned on in the reader.

	* Large spool configured on some remote box.

    (D) READER-ONLY BOX USED AS AN L2 CACHE (as part of a larger topology)

	* Only the reader is run on the box

	* The box takes NO FEEDS and is configured with a large 
	  /news/spool/cache

	* The box accepts 'feed' connections into the reader side of the box
	  in order to take 'article <message-id>' requests and then passes
	  those requests onto the remote spool, caching the result.

	* A box in this configuration is neither a feeder nor a reader.  It
	  is simply operating as an article cache.  It cannot take user 
	  connections and it cannot generate outgoing feeds.  It can only
	  take connections from other readers that intend to fetch articles
	  by message-id from it.

	* You may specify the new 'readerxover trackonly' option in
	  diablo.config to avoid writing out overview data files and only
	  write out overview control files so expireover can run.

    (E) FEEDER-ONLY BOX TAKES AND REDISTRIBUTES A HEADER-ONLY FEED
	(as part of a larger topology)

	* Some remote feeder/full-spool box sends a header-only feed to
	  this feeder-only box.

	* This feeder-only box then masters article numbers and redistributes
	  the header-only feed to reader boxes.

	* This feeder-only box, due to not receiving full body feeds, can make
	  do with much less disk space, memory, and so forth yet still contain
	  the most critical information related to your news system:  the 
	  mastering of article numbers for all your reader boxes.

	* This feeder-only box cannot act as a backing spool and cannot push
	  out normal outgoing feeds.  It can only push out header-only feeds
	  to Diablo reader boxes and (also usually) master article numbers.

	* A reader is typically not run on the same box.  The box synchronizes
	  its active file (sans the article numberings which it masters) from
	  one or more other reader boxes to obtain newgroup/rmgroup updates.
	  dsyncgroups is used to accomplish this.

				 ACTIVE FILE TOPOLOGIES

    This is important:  It is possible for both the feeder and the reader sides
    of Diablo to use an active file (dactive.kp), but only the reader side
    of diablo processes control messages.  The feeder side of Diablo only
    uses the active file when it is configured to do article number assignments
    and master Xref: headers.  The reader side is capable of operating in
    slave or master mode in regards to Xref: headers, and the reader side will
    also process control messages and keep other parts of the active file
    up to date.

    (A) FEEDER + READER ON SAME BOX, FEEDER MASTERS ARTICLE NUMBERS

	* Diablo feeder is set up to master article numbers ('active' option
	  turned on in diablo.config).

	* Diablo feeder feeds a reader running on the same box.  The reader
	  is configured to slave off the Xref: headers supplied by the feed.

	* In this case, both the reader and feeder are operating on the same
	  dactive.kp file (i.e. the same active file), with the reader handling
	  complex control messages and the feeder simply using it to assign
	  article numbers.

	* This configuration allows your feeder to master article numbering for
	  other reader boxes as well as the local reader.

    (B) FEEDER on one box, READER on another

	* The feeder may master article numbers, but dsyncgroups must be used
	  to synchronize the dactive.kp file from a remote source since the
	  feeder cannot process control messages (i.e. newgroup, rmgroup,
	  etc...).

	* If the feeder masters article numbers, it can keep multiple readers
	  in synch with each other fairly easily.

	* The reader may master article numbers, or it can be configured to
	  slave from the feeder.

	* If not in XRef slave mode, the reader must take a single feed from
	  the feeder or you must run a mini-feeder on the same machine as 
	  the reader to handle duplicates.  It is suggested that the feeder 
	  give each reader a single feed in order to ensure monotonically 
	  increasing article numbers so 'temporary holes' aren't created 
	  when articles arrive out of order.

	* If in XRef slave mode (-x xrefhost option to dreaderd) you can 
	  take duplicate feeds from the feeder, but the XRef host (the one
	  assigning the article numbers) must be the same for all feeds and 
	  the duplicate articles must be consistent.  No more than three
	  feeds are recommended in this case, and note that duplicate control
	  messages will be executed multiple times.

	* The readers may still use the feeder's spool to fetch articles,
	  and the readers may or may not enable their own article caching
	  capability.

    In all cases, the biggest confusion always occurs in regards to who 
    masters the group/article-number assignments.  The reader can assign its
    own article numbers or it can slave the article number assignments from
    the remote feed by utilizing the Xref: header supplied by the remote
    feed.
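
    The difference between mastering and slaving article numbers can be
    sketched in Python as follows.  This is an illustration only, not
    Diablo code: the active file is modelled as a plain dictionary mapping
    each group to its highest article number, and the function names are
    hypothetical.

```python
def master_assign(active, newsgroups):
    """Master mode: hand out the next number from the local active table."""
    assigned = []
    for group in newsgroups:
        if group not in active:
            continue  # groups missing from the active file are not stored
        active[group] += 1
        assigned.append((group, active[group]))
    return assigned


def slave_assign(active, xref_value):
    """Slave mode: reuse the numbers found in the feed's Xref: header."""
    _host, *locations = xref_value.split()
    assigned = []
    for loc in locations:
        group, _, number = loc.rpartition(":")
        if group not in active:
            continue
        artno = int(number)
        # Track the highest number seen; never invent a number locally.
        active[group] = max(active[group], artno)
        assigned.append((group, artno))
    return assigned


feeder_active = {"comp.lang.c": 100}
reader_active = {"comp.lang.c": 0}

mastered = master_assign(feeder_active, ["comp.lang.c", "alt.unknown"])
slaved = slave_assign(reader_active, "feeder.example.com comp.lang.c:101")
print(mastered, slaved)
```

    In slave mode every reader fed from the same master ends up with the
    same numbers, which is why a mastering feeder can keep several readers
    in sync.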


			    STEP 1 - FILE SETUP

    Before beginning, you must create additional directories under /news.  All
    directories should be owned by the news user.  The /news/spool/cache
    directory is optional if you do not have the reader's native article
    caching turned on.  You typically want /news, /news/spool/cache, 
    /news/spool/group, and /news/spool/news to each have their own partition.
    /news/log and /news/postq can reside on /news.

	/news/spool/cache	reader's article cache, not required if
				'readercache off' is specified in 
				diablo.config.  You usually use the cache
				when you are not running a feeder on the same
				box or you do not wish to allocate disk space
				to a local spool.  If you do run a cache, you are
				responsible for setting up cron jobs to
				keep its disk usage within your bounds.

				The cache operates in a two-level directory
				scheme indexed by message-id hash.  There are
				512 top level directories.

				The size of the cache partition is up to you.
				Many topologies don't really need a reader
				cache but in those that do, I suggest an 8G
				cache or something similar.  Just be
				sure your cron job keeps some free space
				available.

	/news/spool/group	reader stores overview files in this directory.
				a 9G partition is recommended if you take 
				full groups and/or a full feed.

	/news/spool/news	feeder stores article files in this directory.
				If you are operating a full spool, it's the
				diablo feeder that typically maintains it.

				The minimum recommended size is 8G.  You can
				use less only if you are using the spool to
				run a feeder on the same box as the reader
				in the 'just to filter out duplicate articles'
				style of topology.

	/news/spool/postq	reader queues posts for future delivery here
				(work still in progress)

	/news/log		reader stores & maintains dreaderd.status file
				here. human readable file.  NEVER EDIT THE
				FILE!
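
    The two-level, hash-indexed layout described for /news/spool/cache can
    be pictured with the Python sketch below.  Only the count of 512 top
    level directories comes from the text above.  The hash function (MD5
    is used as a stand-in), the number of second level directories, and
    the directory and file naming are all assumptions made for
    illustration and do not reflect dreaderd's real on-disk names.

```python
import hashlib

TOP_LEVEL_DIRS = 512      # stated in the description of the cache
SECOND_LEVEL_DIRS = 256   # hypothetical; not specified in this document


def cache_path(message_id, root="/news/spool/cache"):
    """Map a message-id to a hypothetical two-level cache path."""
    digest = hashlib.md5(message_id.encode("utf-8")).digest()
    value = int.from_bytes(digest[:8], "big")
    top = value % TOP_LEVEL_DIRS
    second = (value // TOP_LEVEL_DIRS) % SECOND_LEVEL_DIRS
    return f"{root}/{top:03x}/{second:02x}/{digest.hex()}"


path = cache_path("<example@news.example.com>")
print(path)
```

    Spreading files across many directories by hash keeps any single
    directory small, which matters when the cache holds a large number of
    article files.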

    The following additional files must be configured in order for dreaderd
    to operate.   Samples are available in the samples directory.  All files
    should be owned by the 'news' user.

	/news/diablo.config

	    This file is required.  

	    dreaderd does not automatically detect changes made to this file
	    and must be restarted for changes to take effect.

	    The number of forks, threads, and cache configuration is stored 
	    in diablo.config for the reader.  See samples/diablo.config.
	    When running a feeder on the same box, the 'active' and 
	    'activedrop' configuration items are also important if you
	    want the feed to master the group/article-number assignments.

	/news/dactive.kp

	    This is the 'active' file.  It's actually a database.  The 
	    'dsyncgroups' utility may be used to initialize this file from
	    some remote INN server.  This file is NOT compatible with INN's
	    active file.

	    You can use the dsyncgroups program to create an initial
	    dactive.kp file.  The initial dsyncgroups run will take longer
	    than normal to run because resizing .kp files past a page boundary
	    is expensive.  This is not a problem under normal operation.

	    dreaderd automatically detects changes made to this file, but
	    you may only make adjustments with KP database tools such
	    as dkp.  You may edit the file manually only if dreaderd (and
	    diablo if diablo is mastering the article number assignments) is
	    completely shut down.  Idle isn't good enough.  DO NOT EDIT
	    THIS FILE MANUALLY IF YOU CAN HELP IT!  Use the 'dkp' program.

	/news/dcontrol.ctl

	    This is the control-message configuration file and operates in
	    the same manner as INN's control.ctl file, except that you can
	    specify multiple actions by comma-delimiting them.

	    NOTE!!!! You need to install 'pgp' for pgp verification to work,
	    and you need to set up the pgp key rings properly.

	    dreaderd automatically detects changes made to this file.

	/news/dexpire.ctl

	    This file is used by dexpireover to expire overview information.
	    Overview info is usually expired independently from the remote
	    spools so you may have to tune it a bit.

	/news/dreader.access

	    This file controls access for incoming feeds and readers.  It is
	    NOT compatible with INN's nnrp.access file.

	    dreaderd automatically detects changes made to this file.

	/news/dserver.hosts

	    This file configures reader<->spool connections for both retrieving
	    articles and for post'ing articles.  You should have at least one
	    's' type and one 'o' type server specified.

	    dreaderd will not be able to fetch article bodies if you do not
	    set up this file properly.

	    Each dreaderd process fork opens up all connections specified in
	    dserver.hosts, so if you have two machines in dserver.hosts and
	    set the number of dreaderd forks to 4, 8 connections will be
	    opened.

	    dreaderd automatically detects changes made to this file.

	/news/moderators

	    This is the moderators file returned by the 'list moderators' 
	    NNTP command.  It is identical to the INN one.

	    dreaderd automatically detects changes made to this file.

			    STEP 2 - HEADER FEED SETUP

    Every reader system except those used only as L2 spool caches requires a
    header-only feed to operate.  If you cannot supply a header-only feed,
    a normal feed will do but the reader box only stores article headers 
    from incoming feeds.  For header-only feeds, Control messages must still
    be sent in their entirety (i.e. checkgroups, and for pgp verification
    purposes), and the header-only feed must synthesize a Bytes: header so
    the reader knows how big the article would normally be (rough estimate).
    The new Diablo server 'headfeed' option for outgoing feeds will do this.
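
    What a header-only feed has to deliver can be sketched in Python as
    shown below.  This is not the 'headfeed' implementation.  It simply
    assumes CRLF line endings, takes the Bytes: value to be the length of
    the original article, and treats any article carrying a Control:
    header as a control message.  The function name is hypothetical.

```python
def to_header_only(article: bytes) -> bytes:
    """Return a header-only copy of an article with a Bytes: header added.

    Control messages keep their full body, as the reader needs it.
    """
    head, _sep, body = article.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")
    is_control = any(l.lower().startswith(b"control:") for l in lines)
    # Drop any existing Bytes: header so only one is present afterwards.
    lines = [l for l in lines if not l.lower().startswith(b"bytes:")]
    lines.append(b"Bytes: %d" % len(article))
    new_head = b"\r\n".join(lines) + b"\r\n\r\n"
    return new_head + body if is_control else new_head


normal = b"Message-ID: <1@example.com>\r\nNewsgroups: comp.misc\r\n\r\nhello\r\n"
control = b"Control: newgroup comp.test\r\nMessage-ID: <2@example.com>\r\n\r\nbody\r\n"

print(to_header_only(normal).decode())
print(to_header_only(control).decode())
```

    The reader can then use the Bytes: value as a rough size estimate even
    though the body itself was never sent.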

    NOTE!!! The reader does not maintain its own history.  You cannot supply
    multiple feeds to the reader, but you can run multiple feeds into a diablo
    feeder on the same box and then feed the reader from the feeder. 

    NOTE!!! This does not mean you can't run a permanent spool on the reader
    machine!  It simply means that to do so you must run both the diablo server
    side and the reader side on the same machine.  If you do this, you should
    turn dreaderd's article caching off (see diablo.config).  If you are using
    an off-machine spool I suggest leaving dreaderd's article caching on.

    Care must also be taken in handling articles posted via the reader.  The
    Diablo reader posts articles directly to some upstream host and expects
    the posted article to trickle back down to the reader to be properly 
    indexed.  However, the article contains the reader's FQDN in its Path:.
    Therefore, the dnewsfeeds entry controlling the feed going to the reader
    *cannot* alias the reader's FQDN and instead uses 'alias dummy'.  The 
    samples/dnewsfeeds file contains an example of such a feed.
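
    The reason for 'alias dummy' can be shown with a short Python sketch
    of the usual Path:-based loop avoidance, where a feed is not offered
    an article if one of the feed's aliases already appears in the
    article's Path: header.  The sketch assumes that convention and is
    not taken from the Diablo source; the function and host names are
    hypothetical.

```python
def path_hosts(path_header):
    """Split a Path: header value into lower-cased host entries."""
    return [h.strip().lower() for h in path_header.split("!") if h.strip()]


def would_offer(path_header, feed_aliases):
    """True if no alias of the outgoing feed appears in the Path: header."""
    seen = set(path_hosts(path_header))
    return not any(alias.lower() in seen for alias in feed_aliases)


# An article posted through the reader carries the reader's FQDN in Path:.
posted = "upstream.example.net!reader.example.com!not-for-mail"

print(would_offer(posted, ["reader.example.com"]))  # alias matches Path:
print(would_offer(posted, ["dummy"]))               # alias never matches
```

    With the reader's FQDN as the alias, the posted article would be
    filtered out and never indexed by the reader.  A dummy alias lets it
    trickle back down.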

			    HEADER FEED INTO FEEDER BOX

    Diablo has introduced a 'mode headfeed' command that allows the *feeder*
    side to take a header-only feed and pass it on to a reader.  Safeguards
    are in place to prevent dnewslink from attempting to pass header-only
    feeds to destinations that can't handle them.

    This particular setup is often used when you wish to maintain a global
    article spool in one place and use a completely separate path to maintain
    a large number of synchronized readers without wasting a lot of network
    bandwidth.  You can pass a header-only feed from the global spool (or
    other sources) to the header-only feeder box which then masters the
    article numbering and passes header-only feeds to all the readers.  The
    readers then access the master spool via a different path.  

    The advantage of this mechanism is that the feeder box doing the 
    article mastering is only storing headers in its spool and thus does
    not need a very large spool or even a very fast spool.  Passing
    header-only feeds out of a feeder machine which is taking fully-bodied
    incoming feeds is less efficient because the article bodies are stored
    in the spool and must be 'skipped over' so to speak.

				STEP 3 - MAINTENANCE

    You should set up a cron job to periodically delete cached article files
    in /news/spool/cache, usually with a find -mtime +7 or something similar.
    It depends on the amount of disk space you set aside for the cache.  If
    you are running the diablo feeder on the same box, I suggest turning
    off dreaderd's caching entirely and having it retrieve articles from the
    locally running diablo as a first level cache.
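
    For sites that prefer a script over a bare find command, the cleanup
    can be sketched in Python as shown below.  It is similar in spirit to
    the find -mtime approach: files under the cache root whose
    modification time is older than a cutoff are removed.  The function
    name, the default of 7 days and the 'now' parameter (useful for
    testing) are assumptions; tune the age to the disk space you set
    aside.

```python
import os
import time


def purge_cache(root, max_age_days=7, now=None):
    """Delete files under root older than max_age_days; return the count."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                if os.stat(full).st_mtime < cutoff:
                    os.remove(full)
                    removed += 1
            except FileNotFoundError:
                # The file vanished between listing and removal; skip it.
                pass
    return removed
```

    A cron job run as the news user could call
    purge_cache("/news/spool/cache") once a day.  Directories are left in
    place so the reader does not have to recreate them.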

    You should set up a cron job to rotate log files in /news/log generated
    from the dcontrol.ctl configuration.

    You need to 'expire' stale overview information.  The 'dexpireover' 
    program will accomplish this.  This program can both delete stale
    data and update the dactive.kp database if you wish.   Read the manual
    page to dexpireover carefully.  It should normally be run once or twice
    a day (remember, it just expires overview information, it has nothing
    to do with the actual article spool).  The most common problem you
    will run into here is the situation where articles in the spool(s) have
    expired but still exist in the overview database... I am working on 
    dexpireover mechanisms to make this less of a problem.

    The 'dsyncgroups' program can be used to pull in active and newsgroups
    file related information from other servers and is especially useful
    in keeping your dactive.kp database up to date if you do not wish to
    mess around with dcontrol.ctl -- or more specifically, you really only need
    one machine doing active file maintenance and can slave group creation and
    deletion for other machines off that master.  dsyncgroups is capable of
    maintaining the article *range* in your dactive.kp file from remote active
    files even if the active files are not synchronized with each other.

				    MONITORING

    The file /news/log/dreaderd.status holds realtime connection information
    on clients and can be used to monitor the reader system.

    'ps' will also contain an indication of current status.

    When first starting dreaderd, look at the news syslog output for a minute
    or so to ensure that you are not overburdening the spools.  In particular,
    remote diablo spools will scream bloody murder at you if you make too
    many simultaneous connections so watch for connect-immediate-disconnects
    with that sort of error.  You may have to adjust the remote spools and/or
    decrease the number of reader forks you allow.
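
    Since each dreaderd fork opens every connection listed in
    dserver.hosts, the load on the spools is easy to estimate before
    starting the reader.  The Python sketch below does that arithmetic.
    The function names, the host names and the per-spool limit are
    hypothetical; the real limit is whatever the remote spool is
    configured to accept.

```python
from collections import Counter


def connection_load(forks, server_hosts):
    """Return (total connections, connections per host).

    server_hosts holds one entry per line in dserver.hosts; a host that
    is listed twice is counted twice per fork.
    """
    per_host = Counter()
    for host in server_hosts:
        per_host[host] += forks
    return sum(per_host.values()), dict(per_host)


def hosts_over_limit(per_host, limit):
    """List the hosts that would see more than 'limit' connections."""
    return sorted(h for h, n in per_host.items() if n > limit)


total, per_host = connection_load(
    4, ["spool1.example.com", "spool2.example.com"]
)
print(total)                          # two hosts times four forks
print(hosts_over_limit(per_host, 3))  # both hosts exceed a limit of 3
```

    If a host shows up in the over-limit list, either raise the limit on
    that spool or lower the number of reader forks.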

				    PGP KEYS

    In order for dreaderd to properly process PGP keys, the following must be
    properly set up:

	* The pgp program must be installed in /usr/local/bin, modes 755.

	* The PGPKEYS file must be installed in the ~news user's pgp ring
	  using the pgp program.

	* rc.news must 'setenv HOME ~news' prior to running dreaderd.

	* Diablo's 'pgpverify' program is not the pgp program, it's a wrapper
	  for the pgp program that dreaderd uses.

    To set up PGP:
	
	ftp ftp.isc.org
	    (anonymous login)
	    cd pub/pgpcontrol
	    binary
	    dir
	    get PGPKEYS
	    get README
	    quit

	(read the README file)

	(get, compile, and install pgp as /usr/local/bin/pgp)
	    pgp is available as a port for FreeBSD.  Otherwise you
	    can obtain it from utopia.hacktic.nl in /pub/replay/pub/pgp/unix/. 
	    Get version 263is.tar.gz.

	Install the PGPKEYS in ~news's key ring.

	    su - news
	    pgp -kg
	    pgp -ka PGPKEYS	(this one is a bit involved)

	Make sure everything is working right.  While you are still the ~news
	user, do:

	    /news/dbin/pgpverify < samples/pgp-sample

	It should print out: 'news.announce.newgroups'.  Remember that you
	must 'setenv HOME ~news' in rc.news prior to running dreaderd.