.\" Copyright (c) 2005 by Rhyolite Software
.\"
.\" Permission to use, copy, modify, and distribute this software for any
.\" purpose with or without fee is hereby granted, provided that the above
.\" copyright notice and this permission notice appear in all copies.
.\"
.\" THE SOFTWARE IS PROVIDED "AS IS" AND RHYOLITE SOFTWARE DISCLAIMS ALL
.\" WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES
.\" OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL RHYOLITE SOFTWARE
.\" BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES
.\" OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
.\" WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
.\" ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS
.\" SOFTWARE.
.\"
.\" Rhyolite Software DCC 1.2.74-1.68 $Revision$
.\"
.Dd 2005/03/16 13:58:09
.ds volume-ds-DCC Distributed Checksum Clearinghouse
.Dt DCC 8 DCC
.Sh NAME
.Nm DCC
.Nd Distributed Checksum Clearinghouse
.Sh DESCRIPTION
The Distributed Checksum Clearinghouse or
.Nm
is a cooperative, distributed
system intended to detect "bulk" mail or mail sent to many people.
It allows individuals receiving a single mail message to determine
that many
other people have received essentially identical copies of the message
and so reject or discard the message.
.Pp
Freely redistributable source for the server, client, and utilities
is available at Rhyolite Software, http://www.rhyolite.com/dcc/
.Ss How the DCC Is Used
The DCC can be viewed as a tool for end users to enforce their
right to "opt-in" to streams of bulk mail
by refusing bulk mail except from sources in a "whitelist."
Whitelists are the responsibility of DCC clients,
since only they know which bulk mail they solicited.
.Pp
The only false positives (mail marked as "bulk" by a DCC server that
is not) occur when one of the recipients of a message reports it
to a DCC server as having been received many times
or when the "fuzzy" checksums of differing messages are the same.
The fuzzy checksums ignore aspects of messages in order to compute
identical checksums for substantially identical messages.
The fuzzy checksums are designed to ignore only
differences that do not affect meanings.
.Pp
It is not reasonable to worry about third parties reporting your incoming
or outgoing mail to a DCC server as bulk unless you give them copies.
If you trust yourself and your correspondents to not report your
mutual mail as bulk, then false positives are not a concern.
.Pp
A DCC server computes a lower bound on the total number of addresses
to which a message has been sent by counting checksums reported by DCC clients.
Each client must decide which
bulk messages are unsolicited and what degree of "bulkiness" is objectionable.
Client DCC software marks, rejects, or discards mail that is bulk
according to local thresholds on target addresses from DCC servers
and unsolicited according to local whitelists.
DCC servers are usually configured to receive reports from as many targets
as possible, including sources that cannot be trusted to not exaggerate the
number of copies of a message they see.
An end user of a DCC
client angry about receiving a message could report it with
10,000,000 separate DCC packets or with a single report claiming as
many targets.
An unprincipled user could subscribe a "spam trap" to mailing lists
such as those of the IETF or CERT.
Such abuses of the system are not problems,
because much legitimate mail is "bulk."
You cannot reject bulk mail unless you have a whitelist of sources
of legitimate bulk mail.
.Pp
The DCC can also be used by an Internet service provider to detect bulk
mail coming from its own customers.
In such circumstances, the DCC client might be configured to only log
bulk mail from unexpected (not white-listed) sources.
See the
.Fl N
option for
.Xr dccm 8
or
.Xr dccifd 8 .
.Ss What the DCC Is
A DCC server accumulates counts of cryptographically secure checksums of
messages but not the messages themselves.
It exchanges reports of frequently seen checksums with other servers.
DCC clients send reports of checksums related to incoming mail to
a nearby DCC server running
.Xr dccd 8 .
Each report from a client includes the number of recipients for the message.
A DCC server accumulates the reports and responds to clients with
the current total number of recipients for each checksum.
The client adds an SMTP header to incoming mail containing the total
counts.
It then discards or rejects mail that is not "white-listed" and has
counts that exceed local thresholds.
.Pp
A special value of the number of addressees is "MANY" and means
it is certain that this message was bulk and might be unsolicited,
perhaps because it came from a locally blacklisted source or was
addressed to an invalid address or "spam trap."
The special value "MANY" is merely the largest value
that fits in the fixed sized field containing the count of addressees.
That "infinity" accumulated total can be reached with millions of
independent reports as well as with one or two.
.Pp
DCC servers share or
.Em flood
reports of checksums that are seen frequently.
Each server has its own threshold for determining "frequently,"
because a message sent to 50 addressees in a domain with 60 mailboxes
is more likely to be unsolicited bulk advertising than a message sent
to 100 addressees in a domain with 600,000 mailboxes.
.Pp
To keep a server's database of checksums from growing without bound,
checksums are forgotten when they become old.
Checksums with large totals are kept longer.
See
.Xr dbclean 8 .
.Pp
DCC clients pick the nearest working DCC server using a small shared
or memory mapped file,
.Pa /var/dcc/map .
It contains server names, port numbers, passwords, recent performance
measures, and so forth.
This file allows clients to use quick retransmission timeouts
and to waste little time on servers that have temporarily
stopped working or become unreachable.
The utility program
.Xr cdcc 8
is used to maintain this file as well as to check the health of servers.
.Ss X-DCC Headers
The DCC includes several programs used by clients.
.Xr Dccm 8
uses the sendmail "milter" interface to query a DCC server,
add header lines to incoming mail,
and reject mail whose total checksum counts are high.
Dccm is intended to be run with SMTP servers using sendmail.
.Pp
.Xr Dccproc 8
adds header lines to mail presented by file name or
.Pa stdin ,
but relies on other programs
such as procmail to deal with mail with large counts.
.Xr Dccsight 8
is similar but deals with previously computed checksums.
.Pp
.Xr Dccifd 8
is similar to dccproc but is not run separately for each mail message
and so is far more efficient.
It receives mail messages via a socket somewhat like dccm,
but with a simpler protocol that can be used by Perl scripts
or other programs.
.Pp
DCC SMTP header lines are of the form:
.Bd -literal -offset 2n
X-DCC-brand-Metrics: chost server-ID; bulk cknm1=count cknm2=count ...
.Ed
where
.Bl -hang -offset 3n -compact
.It Em brand
is the "brand name" of the DCC server, such as "RHYOLITE".
.It Em chost
is the name or IP address of the DCC client that added the
header line to the SMTP message.
.It Em server-ID
is the numeric ID of the DCC server that the DCC client contacted.
.It Em bulk
is present if one or more checksum counts exceeded the DCC client's
thresholds to make the message "bulky."
.It Em cknm1 , Ns Em cknm2 , Ns ...
are types of checksums, and one of
.Bl -hang -offset 2n -width "Message-IDx" -compact
.It Em IP
address of SMTP client
.It Em env_From
SMTP envelope value
.It Em From
SMTP header line
.It Em Message-ID
SMTP header line
.It Em Received
last Received: header line in the SMTP message
.It Em substitute
SMTP header line chosen by the DCC client, prefixed with the name of
the header
.It Em Body
SMTP body ignoring white-space
.It Em Fuz1
filtered or "fuzzy" body checksum
.It Em Fuz2
another filtered or "fuzzy" body checksum
.El
Counts for
.Em IP , env_From , From ,
.Em Message-Id , Received ,
and
.Em substitute
checksums are omitted by the DCC client if the server
says it has no information.
Counts for
.Em Body , Fuz1 ,
and
.Em Fuz2
are omitted if the message body is empty or
contains too little of the right kind of information
for the checksum to be computed.
.It Em count
is the total number of recipients of messages with that
checksum reported directly or indirectly to the DCC server.
The special count "MANY" means that DCC clients have claimed that
the message is directed at millions of recipients.
"MANY" implies the message is definitely bulk, but not necessarily unsolicited.
The special counts "OK" and "OK2" mean the checksum has been
marked "good" or "half-good" by DCC servers.
.El
.Pp
An example header line is:
.Bd -literal -offset 2n
X-DCC-RHYOLITE-Metrics: calcite.rhyolite.com 101; Body=16 Fuz1=16 Fuz2=16
.Ed
.Pp
DCC clients commonly accept any mail regardless of other
checksum counts with at least one "OK" or at least two "OK2" counts
among IP, env_from, and From checksum counts.
It is common
to reject other mail with large (including "MANY") counts among
Received, Body, Fuz1, and Fuz2 counts.
It is generally not wise to reject
mail based on the other counts.
For example, "MAILER-DAEMON" appears to send vast quantities of mail.
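.Pp
As a rough illustration of those conventions, consider the following
three header lines.
The client host name, the server-ID, and all of the counts are
invented for this example and do not come from a real server.
.Bd -literal -offset 2n
X-DCC-RHYOLITE-Metrics: mx.example.com 101; env_From=OK Body=MANY Fuz1=MANY
X-DCC-RHYOLITE-Metrics: mx.example.com 101; bulk Body=MANY Fuz1=MANY Fuz2=MANY
X-DCC-RHYOLITE-Metrics: mx.example.com 101; From=3127 Body=1 Fuz1=1 Fuz2=1
.Ed
.Pp
Under the common policy described above,
a client would probably accept the first message because of its "OK"
env_From count despite the large body counts,
would treat the second as a candidate for rejection,
and would not reject the third on the strength of its From count alone.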
.Ss Mailing Lists
Legitimate mailing list traffic differs from spam only in being solicited
by recipients.
Each client should have a private whitelist.
.Pp
DCC whitelists can also mark mail as unsolicited bulk using
blacklist entries for commonly forged marks such as "From: user@public.com".
.Pp
Systems that send many essentially identical copies of solicited
mail, such as "auto-responders," should be in the DCC servers' whitelists
because their messages are often substantially identical and so "bulk."
.Ss White and Blacklists
DCC server and client whitelist files share a common format.
Server files are always named
.Pa whitelist
and one is required to be in the DCC home directory
with the other server files.
Client whitelist files are
commonly named
.Pa whiteclnt
in the DCC home directory or a subdirectory specified with the
.Fl U
option for
.Xr dccm 8 .
They specify mail that should not be reported to a DCC server or that is
unsolicited bulk.
.Pp
A DCC whitelist file contains blank lines, comments starting
with "#",
and lines of the forms:
.Bl -inset -offset indent -compact
.It Ar include Ar pathname
.It Ar option Ar setting
.It Ar count Em ip Ar hostname
.It Ar count Em env_From Ar 821-path
.It Ar count Em env_To Ar dest-mailbox
.It Ar count Em From Ar 822-mailbox
.It Ar count Em substitute Ar header string
.It Ar count Em Message-ID Ar <string>
.It Ar count Em Received Ar string
.It Ar count Ar hex_type Ar hex_cksum
.El
where
.Bl -hang -offset indent -width 2n -compact
.It Ar include
can occur only in the main whitelist file.
.It Ar pathname
should be absolute or relative to the DCC home directory.
.It Ar option setting
can only be in a DCC client whitelist or whiteclnt file and affect only
.Xr dccifd 8
and
.Xr dccm 8 .
Settings in per-user whiteclnt files override settings
in the global file.
.Ar Setting
can be
.Bl -tag -offset 4n -width log-all -compact
.It Ar log-all
to log all mail messages.
.It Ar log-normal
to log only messages that meet the logging thresholds.
.It Ar dcc-on
.It Ar dcc-off
Control DCC filtering.
See the discussion of
.Fl W
for
.Xr dccm 8
and
.Xr dccifd 8 .
.It Ar greylist-off
.It Ar greylist-on
to control greylisting.
Greylisting for other recipients in the same SMTP transaction
can still cause greylist temporary rejections.
.Ar greylist-off
in the main whiteclnt file.
.It Ar greylist-log-on
.It Ar greylist-log-off
to control logging of greylisted mail messages.
.It Ar DNSBL-on
.It Ar DNSBL-off
honor or ignore results of DNS blacklist checks configured with
.Fl B
for
.Xr dccm 8
and
.Xr dccifd 8 .
.It The default in the main whiteclnt file is equivalent to
.Ar option log-normal
.br
.Ar option dcc-on
.br
.Ar option greylist-on
.br
.Ar option greylist-log-on
.br
.Ar option DNSBL-off
.El
.It Ar count
is either null, in which case it is assumed to be the same as on the
previous line, or one of
.Bl -tag -offset 4n -width many -compact
.It Ar MANY
indicating millions of targets have received messages with that checksum.
.It Ar OK
if the message is OK.
.It Ar OK2
if it is "half OK."
Two
.Ar OK2
checksums associated with a message are generally
equivalent to an
.Ar OK .
.El
.It Ar hostname
is an
.Bl -tag -offset 4n -width hostname -compact
.It address
IPv4 or IPv6.
.It block
of 2 to 1024 IPv4 or IPv6 addresses in the
standard form xxx.yyy.zzz.www/mm with
mm limited for server whitelists to 16 for IPv4 or 112 for IPv6.
.It name
that will be converted to one or more IP addresses.
.El
.It Ar dest-mailbox
is an RFC\ 821 address or a local user name.
.It Ar 821-path
is an RFC\ 821 address.
.It Ar 822-mailbox
is an RFC\ 822 address with optional name.
.It Ar header
is the name of an SMTP header such as "Sender" or
the name of one of two SMTP envelope values, "HELO" or
"Mail_Host" for the sendmail resolved host name from
the message's
.Ar 821-path .
.It Ar hex_type
is the string
.Em hex
followed by a blank and one of the preceding checksum types or
.Em body , Fuz1 ,
or
.Em Fuz2 .
.It Ar hex_cksum
is a string of four hexadecimal numbers obtained from
a DCC log file.
.El
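.Pp
The following sketch shows how such lines might be combined in a
client whitelist file.
Every path name, mailbox, and checksum value in it is invented
for illustration.
.Bd -literal -offset 2n
# hypothetical whiteclnt file; all names and values are invented
include   whitecommon
option    log-all

# solicited bulk mail from a mailing list
OK    env_From    owner-announce@lists.example.org
OK2   From        announce@lists.example.org
OK2   substitute  Sender owner-announce@lists.example.org

# a commonly forged sender, treated as bulk
MANY  From        user@public.com

# a checksum copied from a DCC log file (digits invented here)
OK    hex Fuz1    1a2b3c4d 5e6f7081 92a3b4c5 d6e7f809
.Ed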
.Pp
A DCC server never shares or
.Em floods
reports containing checksums
marked in its whitelist with OK or OK2 to other servers.
A DCC client does not report or ask its server about messages
with a checksum marked OK or OK2 in the client whitelist.
This is intended to allow a DCC client to keep private mail
so private that even its checksums are not disclosed.
.Pp
Checksums of the IP address of the SMTP client sending a mail message
are practically unforgeable, because it is impractical for
an SMTP client to "spoof" its address or pretend to use some other IP address.
That would make the IP address of the sender useful for white-listing,
except that the IP address of the SMTP client
is often not available to users of
.Xr dccproc 8 .
In addition, legitimate mail relays make whitelist entries for IP
addresses of little use.
For example,
the IP address from which a message arrived might be that of a
local relay instead of the home address of a white-listed mailing list.
.Pp
Envelope and header
.Ar From
values can be forged,
so whitelist entries for their checksums are not completely reliable.
.Pp
Checksums of
.Ar env_To
values are never sent to DCC servers.
They are valid in only
.Pa whiteclnt
files
and used only by
.Xr dccm 8 ,
.Xr dccifd 8 ,
and other DCC clients with access to the envelope
.Em Rcpt To
value.
They are another mechanism used by DCC clients to protect the
privacy of some mail.
.Ss Greylists
The DCC server,
.Xr dccd 8 ,
can be used to maintain a greylist database for some DCC clients
including
.Xr dccm 8
and
.Xr dccifd 8 .
Greylisting involves temporarily refusing mail from unfamiliar
SMTP clients and is unrelated to Distributed Checksum Clearinghouses.
.br
See http://projects.puremagic.com/greylisting/
.Ss Privacy
Because sending mail is a less private act than receiving it,
and because sending bulk mail is usually not private at all
and cannot be very private,
the DCC tries first to protect the privacy of mail recipients,
and second the privacy of senders of mail that is not bulk.
.Pp
DCC clients necessarily disclose some information about mail they have
received.
The DCC database contains checksums of mail bodies,
header lines, and source addresses.
While it contains significantly less information than is
available by "snooping" on Internet links,
it is important that the DCC database be treated as containing
sensitive information and to not put the most private information
in the DCC database.
Given the contents of a message, one might determine
whether that message has been received
by a system that subscribes to the DCC.
Guesses about the sender and addressee of a message can also be
validated if the checksums of the message have been sent to a DCC server.
.Pp
Because the DCC is distributed,
organizations can operate their own DCC servers, and configure
them to share or "flood" only the checksums of bulk mail that is not
in local whitelists.
.Pp
DCC clients should not report the checksums of messages known to be
private to a DCC server.
For example, checksums of messages local to
a system or that are otherwise known a priori to not be unsolicited bulk
should not be sent to a remote DCC server.
This can be accomplished by adding entries for the sender to the
client's local whitelist file.
Client whitelist files can also include entries for email recipients
whose mail should not be reported to a DCC server.
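.Pp
For instance, client whitelist entries along the following lines would
keep checksums of mail from a local network and of mail addressed to one
mailbox from being reported.
The address block and mailbox are placeholders, not recommendations.
.Bd -literal -offset 2n
# hypothetical whiteclnt entries; address block and mailbox are invented
OK    ip        192.0.2.0/28
OK    env_To    legal@example.com
.Ed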
.Pp
Additional privacy protections are provided by the thresholds
at which DCC servers exchange or
.Em flood
reports.
These thresholds are
primarily intended to reduce the traffic among DCC servers using the
observation that the vast majority of messages are sent to a handful of
addressees and so are useless to other DCC servers.
A DCC server's peer reporting thresholds also ensure that
checksums shared
with peer DCC servers are "bulk" and so intrinsically not private.
.Ss Security
Whenever considering security,
one must first consider the risks.
The worst DCC security problems are
unauthorized commands to a DCC service,
denial of the DCC service,
and corruption of DCC data.
The worst that can be done with remote commands to a DCC server is
to turn it off or otherwise cause it to stop responding.
The DCC is designed to fail gracefully,
so that a denial of service attack
would at worst allow delivery of mail that would otherwise be rejected.
Corruption of DCC data might at worst cause mail that is already
somewhat "bulk" by virtue of being received by two or more people
to appear to have higher recipient numbers.
Since all DCC users
.Em must
"white-list" all sources of legitimate bulk mail,
this is also not a concern.
Such security risks should be addressed,
but only with defenses that don't cost more than the possible damage from
an attack.
.Pp
The DCC must contend with senders of unsolicited bulk mail who
resort to unlawful actions
to express their displeasure at having their advertising blocked.
Because the DCC protocol is based
on UDP, an unhappy advertiser could try to
flood a clearinghouse server with
packets supposedly from subscribers or non-subscribers.
DCC servers defend against that attack by rate-limiting requests
from non-subscribers.
.Pp
Also because of the use of UDP, clients must be protected
against forged answers to their queries.
Otherwise an unsolicited bulk mail advertiser could send
a stream of "not spam" answers to an SMTP
client while simultaneously sending mail that would otherwise be
rejected.
This is not a problem for authenticated clients of the
DCC because they share a secret with the DCC.
Unauthenticated DCC
clients do not share any secrets with the DCC, except for unique and
unpredictable bits in each query or report sent to the DCC.
Therefore, DCC servers cryptographically sign answers to
unauthenticated clients with bits from the corresponding queries.
This protects against attackers that do not
have access to the stream of packets from the DCC client.
.Pp
The passwords or shared secrets used in the DCC client and server programs
are "cleartext" for several reasons.
In any shared secret authentication system,
at least one party must know the secret or keep the secret in cleartext.
You could encrypt the secrets in a file, but because they are used
by programs, you would need a cleartext copy of the key to decrypt
the file somewhere in the system, making such a scheme more expensive
but no more secure than a file of cleartext passwords.
Asymmetric systems such as that used in UNIX allow one party to not
know the secrets, but they must be and are
designed to be computationally expensive when used in applications
like the DCC that involve thousands or more authentication checks per second.
Moreover, because of "dictionary attacks,"
asymmetric systems are now little more secure than
keeping passwords in cleartext.
An adversary can compare the hash values of combinations of common words
with /etc/passwd hash values to look for bad passwords.
Worse, by the nature of a client/server protocol like that used in
the DCC or a UNIX shell login,
clients must have the cleartext password.
Since it is among the more numerous and much less secure clients
that adversaries would seek files of DCC passwords,
it would be a waste to complicate the DCC server with an asymmetric
system like that used by UNIX.
.Pp
The DCC protocol is vulnerable to dictionary attacks to recover passwords.
An adversary could capture some DCC packets, and then check to see
if any of the 100,000 to 1,000,000 passwords in so-called
"cracker dictionaries"
applied to a packet generated the same signature.
This is a concern only if DCC passwords are poorly chosen, such
as any combination of words in an English dictionary.
There are ways to prevent this vulnerability regardless of
how badly passwords are chosen, but they are computationally expensive
and require additional network round trips.
Since DCC passwords are created and typed into files once
and do not need to be remembered by people,
it is cheaper and quite easy to simply choose good passwords
that are not in dictionaries.
.Ss Reliability
It is better to fail to filter unsolicited bulk mail than to fail
to deliver legitimate mail, so DCC clients fail in the direction of
assuming that mail is legitimate or even white-listed.
.Pp
A DCC client sends a report or other request and waits for an answer.
If no answer arrives within a reasonable time,
the client retransmits.
There are many things that
might result in the client not receiving an answer,
but the most important is packet loss.
If the client's request does not reach the server,
it is easy and harmless for the client to retransmit.
If the client's request reached the server but the server's response was lost,
a retransmission to the same server would be misunderstood as
a new report of another copy of the same message unless it is detected
as a retransmission by the server.
The DCC protocol includes transaction identifiers for this purpose.
If the client retransmitted to a second server,
the retransmission would be misunderstood by the second server as
a new report of the same message.
.Pp
Each request from a client includes a timestamp to aid the client in
measuring the round trip time to the server and to let the client pick
the closest server.
Clients monitor the speed of all of the servers they know including
those they are not currently using,
and use the quickest.
.Ss Client and Server-IDs
Servers and clients use numbers or IDs to identify themselves.
ID 1 is reserved for anonymous, unauthenticated clients.
All other IDs are associated with a pair of passwords in the
.Pa ids
file, the
current and next or previous and current passwords.
Clients include their client-IDs in their messages.
When they are not using the anonymous ID,
they digitally sign their messages to servers with the first password
associated with their client-ID.
Servers treat messages with signatures that match neither of the passwords
for the client-ID in their own
.Pa ids
file as if the client had used the anonymous ID.
.Pp
Each server has a unique
.Em server-ID
less than 32768.
Servers use their IDs to identify checksums that they
.Em flood
to other servers.
Each server expects local clients sending administrative
commands to use the server's ID and sign administrative commands
with the associated password.
.Pp
Server-IDs must be unique among all systems that share reports
by "flooding."
All servers must be told of the IDs of all other servers whose
reports can be received in the local
.Pa /var/dcc/flod
file described in
.Xr dccd 8 .
However, server-IDs can be mapped during flooding between
independent DCC organizations.
.Pp
.Em Passwd-IDs
are server-IDs that should not be assigned to servers but used to specify
passwords used in the inter-server flooding protocol.
They are used in publicly readable configuration files
to specify passwords in private files.
.Pp
The client identified by a
.Em client-ID
might be a single computer with a
single IP address, a single but multi-homed computer, or many computers.
Client-IDs are not used to identify checksum reports, but
the organization operating the client.
A client-ID need only be unique among clients using a single server.
A single client can use different client-IDs for different servers,
each client-ID authenticated with a separate password.
.Pp
An obscure but important part of all of this is that the
inter-server flooding algorithm
depends on server-IDs and timestamps attached to reports of checksums.
The inter-server flooding mechanism
requires cooperating DCC servers to maintain reasonable clocks
ticking in UTC.
Clients include timestamps in their requests, but as long as their
timestamps are unlikely to be repeated, they need not be very accurate.
.Ss Installation Considerations
DCC clients on a computer share information about which servers
are currently working and their speeds in a shared memory segment.
This segment also contains server host names, IP addresses, and
the passwords needed to authenticate known clients to servers.
That generally requires that
.Xr dccm 8 ,
.Xr dccproc 8 ,
.Xr dccifd 8 ,
and
.Xr cdcc 8
execute with a UID that
can write to the DCC home directory and its files.
The sendmail interface, dccm,
is a daemon that can be started by an "rc" or other script already
running with the correct UID.
The other two, dccproc and cdcc, need to be set-UID because they are
used by end users.
They relinquish set-UID privileges when not needed.
.Pp
Files that contain cleartext passwords including the shared file used by clients
must be readable only by "owner."
.Pp
The data files required by a DCC can be in a single "home" directory,
often
.Pa /var/dcc .
Distinct DCC servers can run on a single computer, provided they use
distinct UDP port numbers and home directories.
It is possible and convenient for the DCC clients using a server
on the same computer to use the same home directory as the server.
.Pp
The DCC source distribution includes sample control files.
They should be modified appropriately and then copied to the DCC
home directory.
Files that contain cleartext passwords must not be publicly readable.
.Pp
The DCC source includes "feature" m4 files to configure
sendmail to use
.Xr dccm 8
to check a DCC server about incoming mail.
.Pp
See also the INSTALL.txt or INSTALL.html file.
.Ss Client Installation
Installing a DCC client starts with obtaining or compiling program binaries
for the client server data control tool,
.Xr cdcc 8 .
Installing the sendmail DCC interface,
.Xr dccm 8 ,
or
.Xr dccproc 8 ,
the general or
.Xr procmail 1
interface
is the main part of the client installation.
Connecting the DCC to sendmail with dccm is most powerful,
but requires administrative control of the system running sendmail.
.Pp
As noted above, cdcc and dccproc should be
set-UID to a suitable UID.
Root or 0 is thought to be safe for both, because they are
careful to release privileges except when they need them to
read or write files in the DCC home directory.
A DCC home directory should be created, often in
.Pa /var/dcc .
It must be owned and writable by the UID to which cdcc is set.
.Pp
After the DCC client programs have been obtained,
contact the operator(s) of the chosen DCC server(s)
to obtain
each server's
hostname,
port number,
and a
.Em client-ID
and corresponding password.
No client-IDs or passwords are needed to use
DCC servers that allow anonymous clients.
Use the
.Em load
or
.Em add
commands
of cdcc to create a
.Pa map
file in the DCC home directory.
It is usually necessary to create a client whitelist file of
the format described above.
To accommodate users sharing a computer but not ideas about what
is solicited bulk mail,
the client whitelist file can be any valid path name
and need not be in the DCC home directory.
.Pp
If dccm is chosen,
arrange to start it with suitable arguments
before sendmail is started.
See the
.Pa homedir/dcc_conf
file and the
.Pa misc/rcDCC
script in the DCC source.
The procmail DCC interface,
.Xr dccproc 8 ,
can be run manually or by a
.Xr procmailrc 5
rule.
.Ss Server Installation
The DCC server,
.Xr dccd 8 ,
also requires that the DCC home directory exist.
It does not use the client shared or memory mapped file of server
addresses,
but it requires other files.
One is the
.Pa ids
file of client-IDs, server-IDs, and corresponding passwords.
Another is a
.Pa flod
file of peers that send and receive floods of reports of checksums
with large counts.
Both files are described
in
.Xr dccd 8 .
.Pp
The server daemon should be started when the system is rebooted,
probably before sendmail.
See the
.Pa misc/rcDCC
and
.Pa misc/start-dccd
files in the DCC source.
.Pp
The database should be cleaned regularly with
.Xr dbclean 8
such as by running the crontab job that is in the misc directory.
.Sh SEE ALSO
.Xr cdcc 8 ,
.Xr dbclean 8 ,
.Xr dcc 8 ,
.Xr dccd 8 ,
.Xr dccifd 8 ,
.Xr dccm 8 ,
.Xr dccproc 8 ,
.Xr dblist 8 ,
.Xr dccsight 8 ,
.Xr sendmail 8 .
.Sh HISTORY
The Distributed Checksum Clearinghouse is based on an idea of Paul Vixie
with code designed and written at Rhyolite Software starting in 2000.
This describes version 1.2.74.
.\"  LocalWords:  UID dbclean dccm dcc dccproc cdcc dccd dblist procmail CERT
.\"  LocalWords:  cleartext dccsight Whitelists whitelist greylisted greylist
.\"  LocalWords:  whiteclnt DNS dccifd DNSBL