
		    README FILE FOR DIABLO RELEASE 1.12

		**** NOTICE NOTICE NOTICE DHISTORY FILE RELOAD REQUIRED ****
		**** IF UPGRADING FROM 1.07 OR LOWER.  NO RELOAD IS     ****
		**** REQUIRED IF UPGRADING FROM 1.08 or higher		****

				    ---

    ** UPGRADING TO 1.xx FROM 1.07 or earlier requires the dhistory file to be
       reloaded.  Upgrading from 1.08 or later to 1.xx does not require the
       dhistory file to be reloaded.

    READ THE INSTALL INSTRUCTIONS *CAREFULLY* AND ALSO READ THE SECTION AT THE
    END OF THE **INSTALL** FILE ENTITLED: 'UPGRADING TO 1.xx' if you are 
    upgrading from 1.07 or earlier to 1.08 or later.

				    ---

    DIABLO is a news transit server.  It is designed to replace INN on a
    newsfeeds machine.  It is NOT currently designed to replace INN on a
    newsreader machine.  That is, Diablo only understands ihave,
    mode stream, and related commands.  It does not understand mode
    reader NNTP commands, and its spool file format is not compatible with
    INN.  Diablo stores files in a spool, expires them, and maintains a 
    dhistory file, but has no concept of an active file.  Articles are 
    named by message-id, so there is no link() problem.

    Typically, anyone taking a full feed these days must dedicate a
    machine to it that is separate from the newsreader machine that their
    users use to read news.  DIABLO is designed to replace the dedicated
    newsfeeds machine.

    DIABLO solves most of the problems INN has dealing with multiple
    incoming and outgoing feeds and will typically increase the performance
    of your newsfeeds machine by 5x.  Diablo has been successfully deployed
    at BEST Internet, whose newsfeeds machine is configured as follows:

	* Diablo
	* 128MB ram
	* pentium pro 200 running FreeBSD
	* 10BaseT into a FDDI-backed etherswitch
	* three 4G ultra-wide barracuda disks (two striped to make the spool)
	* 9+ fully transited full feeds
	* 60+ outgoing feeds, including two to local newsreader machines

    Switching from INN1.4unoff4 to Diablo yielded a 5x to 10x performance 
    increase in everything except disk I/O.  Disk I/O performance increased
    about 4x, mainly due to the forking nature of the Diablo server.  At
    peak, the news machine runs with a typical cpu load of 0.20, a
    typical network aggregate of 300 to 700 KBytes/sec (that's BYTES/sec),
    and a typical (estimated) I/O saturation of 20%, assuming the system
    is seek-limited, which it pretty much is.  At peak, FreeBSD typically
    uses around half the available memory for its buffer cache.

			      OS REQUIREMENTS

    You must be running a UNIX-compatible operating system that 
    supports shared read-only mmap()'s and flock() (or POSIX fcntl
    locks).  Diablo will not compile otherwise.

    As a matter of principle, I am requiring a minimum of ANSI-C
    level compilation (e.g. prototypes must be supported), flock()/fcntl() 
    on local filesystems, and shared read-only mmap()'s.
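    As a rough way to see whether a given filesystem offers the two
    facilities named above, the following Python sketch takes a POSIX
    fcntl lock on a scratch file and then maps it shared and read-only.
    It is not part of Diablo and not how Diablo itself tests for these
    features; the function name check_mmap_and_locks is made up for
    this illustration.

```python
import fcntl
import mmap
import tempfile

def check_mmap_and_locks(directory=None):
    """Return True if a scratch file in `directory` can be fcntl-locked
    and mapped MAP_SHARED + PROT_READ.  Illustrative helper only."""
    with tempfile.NamedTemporaryFile(dir=directory) as tmp:
        tmp.write(b"diablo")
        tmp.flush()
        fd = tmp.fileno()

        # POSIX fcntl record lock: take it without blocking, then release it.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        fcntl.lockf(fd, fcntl.LOCK_UN)

        # Shared, read-only mapping of the whole file (length 0 = entire file).
        with mmap.mmap(fd, 0, flags=mmap.MAP_SHARED, prot=mmap.PROT_READ) as view:
            return view[:6] == b"diablo"

if __name__ == "__main__":
    print("mmap + fcntl locks usable:", check_mmap_and_locks())
```

    Passing the directory that would hold /news or the spool checks
    that particular filesystem rather than the default temporary
    directory.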

    Systems known to work:  Linux, FreeBSD (2.2.2 or greater suggested), 
    SunOS, and Solaris.  AIX probably works.

    Systems with problems:  Alphas (porting issues) and BSDI (mmap()
    issues).  BSDI releases prior to 3.0 are known not to work, and I am
    also getting problem reports with BSDI 3.0, so for now Diablo does
    not work with BSDI.

			    WHO SHOULD RUN DIABLO

    If you are running separate newsfeed and newsreader machines, then
    diablo is for you.  Even more so if you are dealing with a lot of
    feeds.

    Note that Diablo is not able to supply slave feeds as INN can, so you
    cannot use diablo to slave multiple INN boxes.

			    WHERE TO GET DIABLO

    http://www.backplane.com/diablo/

			    REPORTING BUGS

    send the bug to:		diablo-bugs@backplane.com
    send non-bug stuff to:	diablo@backplane.com
    email to the author:	dillon@backplane.com (Matthew Dillon)


			       DISCUSSION

    I was thinking news.software.nntp for now.  I do not personally
    like mailing lists as they are too difficult to read when the
    posting rate goes up, even when digested.

		   MACHINE CONFIGURATION AND LOADING CONSIDERATIONS

    Load point:		network

	A full feed runs about 45KBytes/sec.  25 full outgoing feeds, or 70
	mixed feeds will fill up a 10BaseT ethernet.  The number of incoming 
	feeds is usually irrelevant, but each full incoming feed should 
	be assumed to generate around 10 history file hits/sec.
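	The arithmetic behind the 25-feed figure can be sketched as
	follows.  The 45 KBytes/sec rate comes from the paragraph above;
	treating 1 KByte as 1000 bytes and 10BaseT as a nominal 10 Mbit/s
	are assumptions of this sketch, and the function names are
	illustrative.

```python
# Rough outgoing-bandwidth estimate from the README's per-feed figure.
FULL_FEED_KBYTES_PER_SEC = 45   # from the text: one full feed
ETHERNET_10BASET_MBITS = 10     # nominal 10BaseT line rate (assumption)

def outgoing_mbits(full_feeds, kbytes_per_feed=FULL_FEED_KBYTES_PER_SEC):
    """Aggregate outgoing rate in Mbit/s, assuming 1 KByte = 1000 bytes."""
    return full_feeds * kbytes_per_feed * 1000 * 8 / 1_000_000

def wire_utilization(full_feeds, link_mbits=ETHERNET_10BASET_MBITS):
    """Fraction of the nominal link rate used by the outgoing feeds."""
    return outgoing_mbits(full_feeds) / link_mbits

if __name__ == "__main__":
    for feeds in (5, 10, 25):
        print(f"{feeds:3d} full feeds: {outgoing_mbits(feeds):5.2f} Mbit/s, "
              f"{wire_utilization(feeds):.0%} of 10BaseT")
```

	At 25 full feeds this works out to 9 Mbit/s, which is about as
	much payload as a 10 Mbit/s wire can carry once protocol overhead
	is counted.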

	You MUST switch your ethernet, whether it be 10 or 100BaseT, and *NOT*
	hub it.  This is absolutely necessary.  Due to the streaming nature
	of the connections and largish packets, you can physically max out
	the wire over a switched connection without encountering collision
	problems.

    Load point:		cpu

	A pentium-pro 200 on a box running FreeBSD will run out of suds at
	around the 150th feed (mixed feeds).  A pentium 90 can support around 60
	feeds.

    Load point:		memory & I/O

	                      typical processes             # of striped 4G disks
	# of (mixed) feeds    diablo    dnewslink   memory    /news     spool

	1-3                   6         4           92 MB     1         2
	4-30                  60        60          128 MB    1         2
	31-60                 90 (1)    90          192 MB    1 (2)     2
	61-100                140       140         256 MB    2         3

	note (1):  When a news box is in catchup mode after being down for
		   a while, the number of incoming diablo server processes 
		   will usually bloat due to remote feeders running more 
		   connections in parallel.  In this case, as many as 120
		   diablo processes may end up running.

		   If you have a lot of incoming feeds, take note that diablo
		   can typically take a full feed from each with only a single
		   connection per feed.  A lot of bloat can be gotten rid of
		   by asking your feeds to make only one or two simultaneous
		   connections to you, rather than the 5 or 6 some feeds like
		   to make.

	note (2):  As you approach the top-end of 60 feeds for this 
		   configuration, I/O on the dhistory file may start to
		   saturate a non-striped disk when catching up.
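	For planning purposes, the table above can be treated as a simple
	lookup keyed on the number of mixed feeds.  The sketch below only
	transcribes the table rows; the function name suggest_config and
	the dictionary keys are made up for illustration, and notes (1)
	and (2) still apply to the 31-60 row.

```python
# Rows transcribed from the sizing table:
# (min feeds, max feeds, diablo procs, dnewslink procs,
#  memory in MB, striped 4G disks for /news, striped 4G disks for spool)
SIZING_TABLE = [
    (1,   3,   6,   4,  92, 1, 2),
    (4,  30,  60,  60, 128, 1, 2),
    (31, 60,  90,  90, 192, 1, 2),
    (61, 100, 140, 140, 256, 2, 3),
]

def suggest_config(feeds):
    """Return the table row covering `feeds` mixed feeds as a dict."""
    for low, high, diablo, dnewslink, memory_mb, news, spool in SIZING_TABLE:
        if low <= feeds <= high:
            return {
                "diablo_procs": diablo,
                "dnewslink_procs": dnewslink,
                "memory_mb": memory_mb,
                "news_disks": news,
                "spool_disks": spool,
            }
    raise ValueError("feed count is outside the range covered by the table")

if __name__ == "__main__":
    print(suggest_config(45))
```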

	Disk I/O, in general, is going to be seek-limited.  Diablo can handle
	up to around 60 feeds with a single fast 4G /news disk and an 8G
	spool made up of two 4G drives striped together.  After that point,
	you may need a striped /news disk (two 2G disks striped together).

	If you require longer term spool storage, I recommend you stick with
	4G disks until you have four or more spindles, then go with fast 9G
	disks (not hulkers... e.g. use something like the new Seagate 9G
	disks).  For example, four 4G disks striped together to make one
	16G spool, or four 9G disks striped together to make one 36G spool.

	The ultimate bottleneck will almost certainly be history file lookups
	and appends, where one has many, many processes trying to access the
	same inode.  I already see the FreeBSD kernel begin to hit its limits
	at 120 incoming connections.  Memory and buffer cache tuning will help 
	this situation to a degree, and striping /news will also help.  If you
	actually approach this limit, you may want to consider increasing
	the history file hash table size from 4 million entries (16 MByte 
	memory map) to 8 million entries (32 MByte memory map), which will 
	shift more of the burden to the VM system and away from the I/O
	syscalls.
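	The two memory-map sizes quoted above imply 4 bytes per hash
	table entry.  The sketch below makes that arithmetic explicit.
	The 4-byte entry size is inferred from the figures in the text
	rather than taken from the Diablo source, and "million" is read
	here as 2^20, which is also an assumption.

```python
MILLION = 1 << 20        # assumption: "million" means 2**20 entries
BYTES_PER_ENTRY = 4      # inferred from 4 million entries = 16 MByte

def hash_table_mbytes(entries, bytes_per_entry=BYTES_PER_ENTRY):
    """Size of the memory-mapped hash table in MBytes (2**20 bytes)."""
    return entries * bytes_per_entry / (1 << 20)

if __name__ == "__main__":
    for millions in (4, 8):
        size = hash_table_mbytes(millions * MILLION)
        print(f"{millions} million entries: {size:.0f} MByte memory map")
```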

	Diablo does not support NFS.  It uses fcntl() locks very heavily and
	NFS is simply too slow.  However, diablo will generally work well on
	hardware SCSI-based RAID systems.  It might even work well on modern
	RAID 5 systems, but I would be extremely careful with anything beyond
	RAID 1.

			USE OF REALTIME FEEDS AND FEED DELAYS

	If you have several outgoing feeds, you should consider using the
	realtime and delay ( d# ) options in dnntpspool.ctl.  All of your
	local and internal feeds should be realtime.  Cheap external paths
	to the internet can also be realtime.  To reduce the cost of running
	outgoing feeds over your internet transit, you may wish to weight
	the feeds according to cost.  For example, our MAE-WEST connection
	is a lot cheaper than our MCI T3, so I run outgoing feeds with 
	MAE-WEST destinations in realtime and run outgoing feeds which go
	via MCI in batch mode with a 10 second delay.  This way the articles
	may actually propagate to the more expensive destinations via other
	means prior to my actually attempting to send them direct.

	Likewise, if you have T1 and frame customers, it is usually cheaper
	to supply them with a newsfeed yourself rather than force them to
	go to someone over the internet.  This way they are not eating your
	transit bandwidth on newsfeeds.  A realtime feed to those people is
	best.

			    CATCHING UP AFTER BEING DOWN

	The key item to monitor when catching up on incoming feeds after
	being down for a while is the incoming article rate.  Diablo will
	generate a log line for every 1024 articles received that looks like
	this:

    Jun 24 11:03:59 news1 diablo[18153]: DIABLO uptime=7:46 arts=241.000K tested=0 bytes=1.842G fed=12.613M

	You can calculate the article rate by looking at the delta activity
	from two log lines that are around an hour apart from each other.
	If the article rate is above 9 articles/sec, diablo is catching up
	reasonably well.  As of today, a full feed is around 5 articles/sec.
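	The rate calculation can be sketched in Python as below.  The
	first log line is the sample shown above.  The second line is
	HYPOTHETICAL, made up purely to have a sample one hour later.
	The sketch also assumes that the K/M/G suffixes are decimal
	multipliers and uses the syslog timestamps for elapsed time;
	neither is confirmed by this README.

```python
import re
from datetime import datetime

# Assumption: K/M/G in the log line are decimal multipliers.
SUFFIX = {"": 1, "K": 1_000, "M": 1_000_000, "G": 1_000_000_000}
LOG_RE = re.compile(r"(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s.*?\barts=([\d.]+)([KMG]?)")

def parse_line(line, year=2000):
    """Return (timestamp, article count).  Syslog omits the year, so an
    arbitrary one is supplied; only differences between lines matter."""
    match = LOG_RE.search(line)
    if match is None:
        raise ValueError("not a DIABLO status line")
    stamp = datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")
    return stamp, float(match.group(2)) * SUFFIX[match.group(3)]

def article_rate(earlier, later):
    """Articles per second between two status lines."""
    t0, arts0 = parse_line(earlier)
    t1, arts1 = parse_line(later)
    return (arts1 - arts0) / (t1 - t0).total_seconds()

# Sample line from the README.
FIRST = ("Jun 24 11:03:59 news1 diablo[18153]: DIABLO uptime=7:46 "
         "arts=241.000K tested=0 bytes=1.842G fed=12.613M")
# HYPOTHETICAL line one hour later, invented for this example only.
LATER = "Jun 24 12:03:59 news1 diablo[18153]: DIABLO arts=277.000K"

if __name__ == "__main__":
    rate = article_rate(FIRST, LATER)
    print(f"{rate:.1f} articles/sec, catching up: {rate > 9}")
```

	With these two samples the delta is 36,000 articles over 3,600
	seconds, or 10 articles/sec, which is above the 9 articles/sec
	threshold mentioned above.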

	With a moderate number of incoming feeds, diablo can do around 30
	articles/sec.  If you have a huge number of incoming feeds that are
	all in catchup, in-kernel filesystem locking will begin to interfere
	with the history file lookups and updates.  Diablo will be able to
	maintain a reasonable history file write transaction rate, but the
	lookup rate will suffer.

	As a result, diablo at first catches up on articles without
	appreciably reducing the backlog at remote sites, because its check
	responses are slow.  Once it passes a certain threshold, however,
	and the load on the history file turns to mostly-read rather than
	read/write, the transaction rate will increase dramatically and
	diablo will generally be able to clean up the backlogs very quickly
	after that.