Distributed Checksum Clearinghouse (DCC) Frequently Answered Questions
Current versions of this list can be found among the DCC web pages
and their mirror.
* What is the Distributed Checksum Clearinghouse or DCC?
* Do the fuzzy checksums ignore "personalizations"?
* How much bandwidth, disk space, and computing does the DCC
require?
* Do I need to run a DCC server?
* What happens to my mail if the DCC crashes?
* How do I mark spam without rejecting it?
* Why doesn't the man command find the man pages?
* Must sendmail be used with the DCC?
* How can the DCC be used with qmail?
* Can the DCC be used with smtpd?
* Can the DCC be used with Exim?
* How can the DCC be used with mail user agents?
* Can the DCC be used with SpamAssassin or other spam filters?
* Must I have the root password to use the DCC?
* Why don't the public DCC servers work? Do I need a client-ID?
* Which ports do I need to open in my firewall?
* Why does the dccd database grow without bound?
* The dccd database is corrupt. What should I do?
* Why did building the DCC fail with a complaint about "Resource
temporarily unavailable"?
* Why do my DCC clients including cdcc and dccproc complain
about "Resource temporarily unavailable"?
* Why does dccifd or dccm complain about "thread_create()
failed: 11, try again"?
* Why doesn't my DCC client pick my local DCC server?
* If I have a server-ID, do I need a DCC client-ID, or vice
versa?
* Why does my DCC server complain about "rejected server-IDs"
among flooded checksum reports?
* Why does my server refuse to accept more than 20 operations
* How do I keep strangers from using my DCC server?
* How can I determine why dccm reported a message as spam or
with a recipient count of "MANY"?
* How can I see what checksums my server has heard from its
clients?
* Why is mail from my favorite mailing list marked with an X-DCC
header line that says it is spam?
* Why are some checksums missing from my X-DCC header lines?
* Can I use wild cards or regular expressions in DCC white
lists?
* How do I white-list mail from a legitimate bulk mailer using
its name or SMTP headers such as Mailing-List or the Habeas SWE
* Do I need both server and client white lists?
* How do I maintain client white lists?
* When the white list file used by dccm or dccproc is changed,
what must be done to tell the software about the change?
* Why do legitimate mail messages have X-DCC header lines that
say they are "bulk"?
* Are IP address blocks in white lists used by dccproc?
* Why is dccproc ignoring env_from white list entries?
* Why is the DCC server ignoring env_from white list entries?
* What if I make a mistake with dccproc -t many and report
legitimate mail as spam?
* Can the sendmail "spamfriend" mechanism tell dccm to not check
mail sent to some addresses?
* How can I avoid polluting the databases of DCC servers with
checksums of my mail that is not spam?
* How many flooding peers does my DCC server need?
* Do I need to tell the operators of other DCC servers the
password for controlling my server to turn on flooding?
* How can I figure out why flooding is not working?
* Why didn't the RTT reported by the cdcc info operation change
when my network topology changed?
* When my clients are configured to use SOCKS, they do not
realize immediately when a server is down.
What is the Distributed Checksum Clearinghouse or DCC?
See the main DCC man page as well as the DCC web pages
and their mirror.
Do the fuzzy checksums ignore "personalizations"?
Yes, they ignore many so-called "personalizations".
How much bandwidth, disk space, and computing does the DCC require?
The UDP packets used by a DCC client to obtain the checksum
totals from a DCC server for a mail message generally use less
bandwidth than the DNS queries required to receive the same
message. A DCC client needs very little disk space.
Bulk messages are usually logged by DCC clients. On systems
receiving a lot of mail, the mechanisms for automatically
creating new log directories every minute, day, or hour can
keep any single log directory from becoming too large. See the
dccm and dccproc man pages.
As of January, 2004, about 100 MBytes/day are exchanged between
each pair of DCC servers. Each server has 3 or 4 peers. The
resulting database is about 500 MBytes. However, while
dbclean is deleting old checksums, there are three copies
of the database. The DCC clients and server do not need many
CPU cycles, but the daily executions of dbclean on a system
with a DCC server require a computer with at least 768 MBytes
of memory and work better with more.
DCC servers used by clients handling 100,000 or more messages
per day need to be larger. Each additional 100,000 messages/day
need about 100 MBytes of disk space and system memory, given
the default expiration used by dbclean -e.
As of early 2004, a DCC server should have at least 768 MBytes of RAM.
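The sizing figures above can be combined into a rough estimate. The following is a minimal sketch that only does arithmetic on the numbers quoted in this answer. The function name, the linear scaling, and the choice to apply the extra 100 MBytes to both the database and the memory figure are assumptions made for illustration, not part of the DCC.

```python
def estimate_server_needs(messages_per_day):
    """Rough DCC server sizing from the early-2004 figures quoted above."""
    base_db_mbytes = 500      # typical database size
    base_ram_mbytes = 768     # minimum memory for the daily dbclean run
    per_100k_mbytes = 100     # extra disk and memory per additional 100,000 messages/day

    # Assumption: only traffic beyond the first 100,000 messages/day adds cost.
    extra_blocks = max(0, messages_per_day - 100_000) / 100_000
    extra = extra_blocks * per_100k_mbytes

    db = base_db_mbytes + extra
    return {
        "database_mbytes": db,
        "peak_disk_mbytes": 3 * db,   # three copies exist while dbclean runs
        "ram_mbytes": base_ram_mbytes + extra,
    }


print(estimate_server_needs(300_000))
```

For 300,000 messages/day this gives a database of about 700 MBytes, about 2100 MBytes of disk while dbclean is running, and about 968 MBytes of memory.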
Do I need to run a DCC server?
A mail system that processes fewer than 100,000 mail messages
per day uses less of its own bandwidth and the bandwidth of
other DCC servers by using the public DCC servers. Each mail
message needs a DCC transaction that requires about 100 bytes,
and so 100,000 mail messages/day imply about 10 MBytes/day of
DCC client-server traffic. Each DCC server needs to exchange
"floods" or streams of checksums with 4 other servers. Each
flood is currently about 100 MBytes/day for a current total of
about 400 MBytes/day.
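The comparison above is simple arithmetic, sketched here so that other mail volumes can be plugged in. The constants are the approximate figures from this answer; the function names are illustrative only.

```python
BYTES_PER_TRANSACTION = 100     # approximate size of one client-server exchange
FLOOD_MBYTES_PER_PEER = 100     # approximate daily flood per peer (early 2004)


def client_traffic_mbytes(messages_per_day):
    """Daily client-server traffic when using someone else's DCC server."""
    return messages_per_day * BYTES_PER_TRANSACTION / 1_000_000


def flood_traffic_mbytes(peers=4):
    """Daily flooding traffic for a server with the given number of peers."""
    return peers * FLOOD_MBYTES_PER_PEER


print(client_traffic_mbytes(100_000))   # 10.0 MBytes/day as a client
print(flood_traffic_mbytes())           # 400 MBytes/day as a server
```

At 100,000 messages/day, running a server costs roughly 40 times the bandwidth of simply querying the public servers.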
When normally installed by the included Makefiles, DCC clients
are configured to use the public DCC servers without any
additional configuration, except to open firewalls to port
Mail systems that process more than 100,000 mail messages per
day need local DCC servers connected to the global network of
DCC servers. The public DCC servers include denial of service
defenses which ignore requests in excess of about 240,000 per
day per client.
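The two thresholds in this answer can be expressed as a small decision helper. This is a sketch under stated assumptions: one DCC request per message, and a volume of exactly 100,000 messages/day treated as still suitable for the public servers, which the text does not specify. The names are hypothetical.

```python
LOCAL_SERVER_THRESHOLD = 100_000   # messages/day above which a local server is advised
PUBLIC_DAILY_LIMIT = 240_000       # approximate requests/day before public servers ignore a client


def recommend(messages_per_day):
    """Return a (choice, over_limit) pair for a given daily mail volume."""
    # Assumption: each message causes exactly one request to a DCC server.
    over_limit = messages_per_day > PUBLIC_DAILY_LIMIT
    if messages_per_day > LOCAL_SERVER_THRESHOLD:
        choice = "local server"
    else:
        choice = "public servers"
    return choice, over_limit


for volume in (50_000, 150_000, 300_000):
    print(volume, recommend(volume))
```

The second element of the result flags volumes at which the public servers' denial of service defenses would start ignoring requests.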
What happens to my mail if the DCC crashes?
When in doubt or trouble, the DCC clients including dccproc
and dccm deliver mail. They wait only a little while for a
DCC server to answer before giving up. They then avoid asking a
server for a while to avoid slowing down mail.
If the DCC sendmail interface or milter program, dccm, crashes,
the default parameters in misc/dcc.m4 for the sendmail.cf
Xdcc line tell sendmail to wait only about 30 seconds before
giving up and delivering the mail.
The DCC client code keeps track of the speeds of the servers it
knows about, and uses the fastest or closest. Every hour or so
it re-resolves A records and checks the speeds of the servers
it is not using. When the current server stops working or gets
significantly slower, the client code switches to a better
How do I mark spam without rejecting it?
Unless given thresholds at which to reject mail, dccm and
dccproc do not reject mail. When dccm is given a threshold
by setting DCCM_REJECT_AT in dcc_conf in the DCC home
directory, DCCM_ARGS can also be set to "-a IGNORE" so that
spam is marked but not rejected.
Why doesn't the man command find the man pages?
The nroff source, formatted nroff output, and HTML versions of
the man pages are in the top-level source directory. Formatted
or nroff source is installed by default somewhere in
/usr/local/man depending on the target system. It may be
necessary to add /usr/local/man to the MANPATH environment
variable. Even with that, SunOS 5.7 sometimes has trouble
finding them unless man -F is used.
Must sendmail be used with the DCC?
While the sendmail milter interface, dccm, and the DCC
program interface, dccifd, are the most efficient ways to
report and check DCC checksums, dccproc is also commonly
How can the DCC be used with qmail?
There are comments about using dccproc with qmail in the
DCC mailing list archives including Chris Shenton's
message. See also Chris Shenton's DCC, qmail, and gnus
Can the DCC be used with smtpd?
Yes, dccproc can be used with Obtuse's smtpd. Dave Lugo has
contributed a shell script to the smtpd-sd project which
can be used to do DCC checking prior to the end of the SMTP
Can the DCC be used with Exim?
There are comments about using dccproc with Exim in the
DCC mailing list archives including these messages:
Can the DCC be used with SpamAssassin or other spam filters?
The DCC can be used with SpamAssassin as well as other spam
and virus filters. Note that it is more efficient to arrange to
use a DCC client daemon such as dccm to mark passing mail
and check X-DCC header lines in the filter than to start and
run dccproc on each message.
Some commercial virus and spam filters include DCC clients that
query public DCC servers or DCC servers operated by the filter
vendor and that "flood" or exchange bulk mail checksums with
How can the DCC be used with mail user agents?
Dccproc can be used with any mail user agent that can check
mail headers. For example, WD Baseley sent a note to the
DCC mailing list on how to configure Eudora to act on
X-DCC header lines.
Bharat Mediratta has developed DeepSix for people using mail
user agents on UNIX boxes connected to remote servers such as
corporate Exchange servers. See his project on Sourceforge
as well as his announcement in the DCC mailing list.
Must I have the root password to use the DCC?
No, the procmail or sendmail .forward DCC user programs can be
installed in an individual ~/bin directory. Then cdcc can
create a private map file used with dccproc -h dir or
dccproc -m dir/map.
Also see the DCC installation instructions.
Why don't the public DCC servers work? Do I need a client-ID?
The public DCC servers accept requests from clients using the
anonymous client-ID. Incorrectly configured firewalls often
cause problems. Traceroute can be used to send UDP packets to
test for interfering firewalls. See the answer to the
firewall question below.
Which ports do I need to open in my firewall?
DCC traffic is like DNS traffic. You should treat port 6277
like port 53. Allow outgoing packets to distant UDP port 6277
and incoming packets from distant UDP port 6277.
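The rule for a DCC client can be modeled as a predicate on the distant port, in the same way a rule for DNS would match port 53. The sketch below is a toy model of that rule, not a firewall configuration; the UdpPacket type and the function name are invented for illustration.

```python
from dataclasses import dataclass

DCC_PORT = 6277


@dataclass
class UdpPacket:
    direction: str     # "out" for packets leaving the client, "in" for replies
    local_port: int    # ephemeral port chosen by the DCC client
    remote_port: int   # port on the distant DCC server


def client_firewall_allows(pkt):
    """Allow DCC client traffic: match on the distant port, in either direction."""
    return pkt.direction in ("out", "in") and pkt.remote_port == DCC_PORT


print(client_firewall_allows(UdpPacket("out", 40123, 6277)))   # query to a server
print(client_firewall_allows(UdpPacket("in", 40123, 6277)))    # answer from a server
print(client_firewall_allows(UdpPacket("in", 6277, 40123)))    # unrelated packet
```

The point of the model is that the local port is arbitrary; only the distant port identifies DCC client traffic.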
If dccproc fails or the command cdcc info says no DCC
servers are answering, you may need to adjust your firewall.
If you run a DCC server, open incoming connections to local TCP
port 6277 from your flooding peers, and outgoing connections to
your flooding peers from your TCP port 6277. Also open UDP port
6277 to IP addresses 188.8.131.52 and 184.108.40.206 for the
DCC server status web page.
See also the discussion of Cisco ACLs at
Why does the dccd database grow without bound?
Dbclean should be run about once a day with a script like
misc/cron-dccd. An entry like misc/crontab can be put
into the crontab file for the user that runs dccd, such as
/var/spool/cron/crontabs/root for Solaris.
The dccd database is corrupt. What should I do?
Dbclean -R will usually repair a broken DCC server
database. However, if your server is "flooding" or exchanging
checksums with other servers, it is often quicker to stop the
DCC server, delete the dcc_db and dcc_db.hash files, run
dbclean -N to create empty database files, and then restart
dccd with the libexec/start-dccd script. When dccd
starts, it will notice that the database has been purged and
ask its flooding peers to rewind and retransmit all of their
Why did building the DCC fail with a complaint about "Resource
temporarily unavailable"?
The most common cause of this problem is the same as that of the next
question, or bugs in the target platform's fcntl() locking on
NFS file systems. If the DCC home directory will not be NFS
mounted, it is probably sufficient to run make a second time.
Why do my DCC clients including cdcc and dccproc complain
about "Resource temporarily unavailable"?
The most common cause of such messages is holding a lock on the
white list file with an editor. However, perhaps your operating
system has bugs in its implementation of fcntl file locking,
particularly for the DCC client map file when it is on an
NFS file system. If so, try configuring, compiling, and
installing with the --with-bad-locks setting mentioned in
the installation instructions.
Why does dccifd or dccm complain about "thread_create() failed: 11,
try again"?
The most common cause of "thread_create() failed: 11, try
again" error messages from dccm and dccifd is a limit that is
too small on the maximum number of processes allowed to the UID
running the dccm or dccifd process. The "maxproc" limit should
be a dozen or so larger than the sum of the queue sizes of dccm
or dccifd (or both if both are running).
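The rule of thumb above amounts to a one-line calculation. In this sketch the queue sizes are hypothetical example values and "a dozen or so" is taken as exactly 12; neither is a DCC default.

```python
def min_maxproc(queue_sizes, slack=12):
    """Smallest sensible "maxproc" limit: the sum of the queue sizes plus a dozen or so."""
    return sum(queue_sizes) + slack


# Hypothetical queue sizes for dccm and dccifd running under the same UID.
print(min_maxproc([200, 200]))   # 412
# Only dccm running.
print(min_maxproc([200]))        # 212
```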
Why doesn't my DCC client pick my local DCC server?
The DCC clients including dccm and dccproc pick the
nearest and fastest server in the list kept in the
/var/dcc/map file. DCC servers not in that list will not
be used. That list can be viewed with the cdcc info or
cdcc RTT operations. Add to the list with cdcc add or
A nearby server that seems slower than a more distant server
will not be chosen. Note that the anonymous user delay set with
dccd -u is intended to make a server appear slow to
"freeloaders." The "RTT +/-" value that can be used with the
cdcc add and cdcc load operations can be used to
force DCC clients to prefer or avoid servers except when
If I have a server-ID, do I need a DCC client-ID, or vice versa?
DCC server and client-IDs serve distinct purposes. Servers
require server-IDs to identify each other in the floods of
checksums they exchange and to recognize authorized users of
powerful cdcc operations such as stop. DCC servers require
client-IDs to identify paying clients that should be given
quicker service than anonymous clients, to refuse reports from
anonymous clients, or to refuse even to answer queries from
Why does my DCC server complain about "rejected server-IDs" among
flooded checksum reports?
Redundant paths among DCC servers exchanging or flooding
reports of checksums would cause duplicate entries in each
server's database without a mechanism that depends on every DCC
server having a unique server-ID. Parts of that mechanism
detect two servers claiming a single server-ID and server-IDs
that are not listed in the local /var/dcc/ids file.
Reports supposedly from unknown servers are rejected or ignored
by the DCC server.
The ID of every server in the network must be in the file,
usually without its real password. The sample ids file in
the DCC source is a good start for a new DCC server in the
network to which dcc.dcc-servers.net belongs. A current copy of
that file is also in the online copies of the source including
that at Rhyolite Software.
At least one server in every network of DCC servers should use
an ids file without any extra entries to detect rogue server-ID
Why does my server refuse to accept more than 20 operations per
A common cause of such problems is one of the DCC server's
defenses against denial of service attacks. A DCC server cannot
know anything about anonymous clients, or clients using
client-ID 1 or without a client-ID and matching password from
the /var/dcc/ids file. As far as your server can know, an
anonymous client sending many operations is run by an unhappy
sender of unsolicited bulk mail trying to flood your server
with a denial of service attack. It is easy to tell your client
its ID with the cdcc add or load operations.
The default limits can be changed by adding a dccd -R
argument to DCCD_ARGS in the dcc_conf file in the DCC home
directory.
How do I keep strangers from using my DCC server?
See the dccd -Q and dccd -u options.
How can I determine why dccm reported a message as spam or with a
recipient count of "MANY"?
Dccm is usually configured to log mail with recipient counts
greater than the -t ,log-thold, as well as mail with some
conflicts among white list entries. Each log file contains
a single message, its checksums, its disposition, and other
information as described in the dccm man page.
See also the dblist -C command.
How can I see what checksums my server has heard from its clients?
The dblist -Hv command displays the contents of the
database. Look for records with your server-ID with
Why is mail from my favorite mailing list marked with an X-DCC header
line that says it is spam?
Sources of solicited bulk mail including mailing lists to which
you have subscribed should usually be in your DCC client
white list so that they receive no X-DCC header lines.
Why are some checksums missing from my X-DCC header lines?
If the DCC client was not able to compute a checksum for a
message, it will not ask the server about that checksum and the
checksum will not appear in the X-DCC header. For example, if
dccproc is not told and cannot figure out the IP address
of the source of the message, that checksum will be missing.
The Fuz1 and Fuz2 checksums cannot be computed for messages
that are too small, and so will be missing for them. A checksum
will also be missing if the DCC server is configured to not
How do I maintain client white lists?
The overall procedure includes monitoring bulk mail in the log
directories specified with dccproc -l, dccm -l, and
dccm -U, and adding entries to white list files.
The global dccm white list file specified with
dccm -w and the white lists specified with dccproc -w
are easily maintained with ordinary text editors. Note that
some text editors including versions of vi lock their files.
Dccm and dccproc are unable to read white list files while they
are locked.
White lists specified with dccm -U are easily maintained
with ordinary text editors by the system administrator.
However, it is often better to let individual users deal with
their own white lists. The DCC source includes sample CGI
scripts to let individual end-users monitor their private logs
of bulk mail and their individual white lists. See the
README file in that directory.
Can I use wild cards or regular expressions in DCC white lists?
No, regular expressions cannot be used, because DCC client and
server white lists are converted to lists of checksums. The
same basic idea is used for DCC client white lists as for the
DCC protocol. A DCC client computes the checksums for a
message, and then looks for those checksums in the local white
list. Depending on the values associated with those checksums,
the DCC client asks a DCC server about them.
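The following sketch illustrates why a pattern cannot match once a white list has been reduced to checksums. It is not the DCC's checksum algorithm or file format; MD5 from Python's hashlib is used only as a stand-in, and the function names and addresses are invented for the example.

```python
import hashlib


def checksum(kind, value):
    """Stand-in checksum of a (type, value) pair; the real DCC checksums differ."""
    return hashlib.md5(f"{kind}:{value}".encode()).hexdigest()


# A white list stored as checksums, the way the text describes.
whitelist = {checksum("env_from", "list@example.com"): "ok"}


def lookup(kind, value):
    """Return the white list value for an exact match, or None."""
    return whitelist.get(checksum(kind, value))


print(lookup("env_from", "list@example.com"))    # ok
print(lookup("env_from", "other@example.com"))   # None
print(lookup("env_from", "*@example.com"))       # None: a pattern is just another string
```

Because only checksums are compared, a white list entry matches a value exactly or not at all.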
There would also be portability difficulties in including
regular expressions in DCC clients. In other words, consider
the complications of bundling procmail with the DCC code.
To use regular expressions with the DCC, consider procmail.
Procmail is included with many UNIX-like systems. See also the
DCC clients can be configured to white- or blacklist using so-
called "substitute" headers. See dccproc -S or
It is also possible to use sendmail access_db file entries to
white- or blacklist based on portions of SMTP envelope and
client IP addresses. For example, an access_db file line of
"From:example.com OK" can be used to tell dccm to white-list all
mail from SMTP clients in the example.com domain. See the -O
argument to the misc/hackmc script.
How do I white-list mail from a legitimate bulk mailer using its name
or SMTP headers such as Mailing-List headers?
Start by determining an envelope value or SMTP header that
distinguishes the bulk mail from a sample message or DCC log
file. The name of the sending computer is the mail_host value
in dccm log files. If the distinguishing header or
envelope value is not among the main DCC white list
values, then a "substitute" value must be used. An "ok
substitute ..." line must be added to the white list file and
the DCC client program must be told with dccproc -S or
dccm -S. There are example white list entries in the
sample /var/dcc/whiteclnt file.
Do I need both server and client white lists?
The dccd whitelist file is not as useful as the client
white lists in the whiteclnt files used by dccproc and dccm.
Entries in a DCC server's white list apply to
all clients that use that server, including clients in other
organizations if permitted. Thus, only very global values are
appropriate for server white lists. Common entries in server
white lists include the 127.1 IP address, the IP address ranges
of the SMTP servers of the organization running the server, and
well known, unimpeachable mailing lists such as CERT's.
Client white lists apply only to the stream of mail handled by
the client. Dccm white lists apply to the mail received by
the associated sendmail process. Distinct organizations and
individual users can have very different notions of what bulk
mail is solicited and what other mail is always unsolicited
When the white list file used by dccm or dccproc is
changed, what must be done to tell the software about the change?
The DCC clients notice when their whiteclnt files as well as
included files change and automatically rebuild the
corresponding .dccw hash table files. Changes to the
dccd whitelist are not effective until after dbclean
Note that some text editors including versions of vi lock their
files. Dccm and dccproc are unable to read white list files
while they are locked.
Why do legitimate mail messages have X-DCC header lines that say they
are "bulk"?
There are several possible causes of such problems. The first
and most obvious is that the mail is solicited bulk mail and
that the source needs to be added to your white list.
Another possible reason is that your individual legitimate mail
messages have not been marked as spam because their Body or
Fuz1 checksum counts are small, but that the IP address or
other checksum counts are large. The IP address checksum count,
for example, is the total of all reports of addressees for that
checksum. That total is independent of the other checksums, and
so counts all reports for all messages with that source IP
address. A source of legitimate mail that has sent a message
that was reported as spam by one of its recipients will often
have the totals for the checksums of its IP address, From
header, and other values be MANY. This is why it usually does
not make sense to reject mail based on what the DCC server
reports for the IP address, From header, and other values that
are not unique to the message. Only the last Received header
line, the Message-ID line, and body checksums can be expected
to be unique and sometimes not the Message-ID and Received
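The independence of the per-checksum totals can be seen in a toy model. This sketch keeps one counter per checksum, as the paragraph above describes; it does not model the special MANY value or any real DCC data structure, and the checksum strings and counts are made up.

```python
from collections import Counter

totals = Counter()   # one running total per checksum


def report(checksums, recipients):
    """Add a report of a message to the total of each of its checksums."""
    for c in checksums:
        totals[c] += recipients


# Two ordinary messages and one spam, all from the same source IP address.
report({"ip:192.0.2.1", "body:aaa"}, 1)
report({"ip:192.0.2.1", "body:bbb"}, 1)
report({"ip:192.0.2.1", "body:spam"}, 500)

print(totals["body:aaa"])       # 1
print(totals["ip:192.0.2.1"])   # 502
```

The body checksum of the first message still has a count of 1, while the IP address checksum it shares with the spam has a large total.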
Why is legitimate mail from someone using qmail marked as spam?
A common cause for that and similar complaints involves null or
missing Message-ID header lines. Spam often lacks Message-ID
lines or has a null or "<>" ID, so rejecting mail with null or
missing Message-IDs can be an effective filter. DCC clients
treat missing Message-ID lines as if they were present but
null. The sample whitecommon white list file in the
DCC source includes the line:
many message-id <>
Some Mail Transfer Agents violate section 3.6.4 of RFC 2822 and
do not include Message-ID header lines in mail they send,
including some combinations of qmail and "sendmail -bs" acting
as the originating MTA, and qmail by itself when it
generates a non-delivery message or "bounce." Solutions to this
problem include removing that line from your white lists
or adding lines specifying the From or envelope from values of
senders of legitimate mail lacking Message-ID header lines.
Are IP address blocks in white lists used by dccproc?
Yes, dccproc can white-list mail by the IP address of the
immediately preceding SMTP client, but only if it knows that IP
address. Unless the dccproc -a or dccproc -R options
are used, dccproc does not know the IP address.
Why is dccproc ignoring env_from white list entries?
DCC checksums are of the entire header line or envelope value.
An entry in the white list file for email@example.com will
have no effect on mail with an envelope value of
"J.Smith" firstname.lastname@example.org. The file must contain
Another common cause for this problem is implied by the fact
that for an env_from white list entry to have any effect,
dccproc must be able to find the envelope value in the message
in a Return-Path header or -f must be used. If your mail
delivery agent does not add a Return-Path header and you do not
use dccproc -f, then dccproc cannot know about white or
blacklist entries for envelope return addresses.
Note also that dccproc has no white list by default and that
dccproc -w must be used.
Why is the DCC server ignoring env_from white list entries?
Common causes of this problem include sendmail access_db file
entries and blacklisting entries in the DCC client white
list. Entries in the sendmail access_db or the dccproc or
dccm whitelist override the DCC server's advice.
Note also that it is common for a DCC client to be configured
to use the current nearest of several DCC servers. If one of
the DCC servers does not have the entry in its white list, the
DCC client will occasionally not benefit from it.
What if I make a mistake with dccproc -t many and report
legitimate mail as spam?
It is possible to delete checksums from the distributed DCC
database with the cdcc delck operation. However, it is not
worth the trouble. Unless the same (as far as the fuzzy
checksums are concerned) message is sent again, no one is
likely to notice the mistake before the report of the message's
checksums expires from the DCC servers' databases for lack of
Can the sendmail "spamfriend" mechanism tell dccm to not check
mail sent to some addresses?
Sendmail decisions to accept, reject, or discard mail are
largely independent of the decisions made by dccm. The DCC
equivalent is to add env_to entries to the dccm white
list. See the sample whiteclnt file in the DCC source
However, if your sendmail.cf file sets the dcc_notspam
macro while processing the envelope, then the message will be
white-listed. This is related to the dcc_isspam macro used
by sendmail.cf modified by misc/hackmc -R to tell dccm to
report blacklisted messages as spam to the DCC server.
How can I avoid polluting databases of DCC servers with checksums of
my mail that is not spam?
Reports of checksums with white list entries in your
server's database are not flooded to its peers. The checksums
of messages white-listed with entries in local dccm or
dccproc white lists are not reported to DCC servers. It is
good to add entries to DCC server and client white lists
for localhost, your IP address blocks, and your domains if you
know that none of your users will ever send spam.
However, in the common mode in which the DCC is used, no
checksums of mail are pollution. Checksums of genuinely private
mail will have target counts of 1 or a small number, and so
will not be flooded by your server to other servers. Strangers
will not see your private mail and so will not be able to ask
any DCC server about the checksums of your private mail. On the
other hand, the DCC functions best by collecting reports of the
receipt of bulk mail as soon as possible. That implies that it
is generally desirable to send reports of all mail to a DCC
The DCC flooding protocol does not send checksums with counts
below a DCC server's bulk threshold to other servers.
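As a minimal sketch of that rule, the selection of what to flood is a filter on the count. The threshold value and the checksum names below are hypothetical; the actual threshold is a property of the server's configuration.

```python
BULK_THRESHOLD = 10   # hypothetical value chosen only for this example


def checksums_to_flood(database, threshold=BULK_THRESHOLD):
    """Keep only checksums whose counts have reached the bulk threshold."""
    return {c: n for c, n in database.items() if n >= threshold}


database = {"body:private-note": 1, "body:newsletter": 40, "body:spam": 5000}
print(checksums_to_flood(database))
```

The checksum of the private note, with a count of 1, never leaves the local server.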
How many flooding peers does my DCC server need?
A DCC server in a network of many servers should have at least
three flooding peers to ensure that the failure of a single
server or network link cannot partition the network. Limiting
the number of peers of any server to four or perhaps
a few more ensures that no single server is critical to the
network. To minimize the distances in the network, four peers
per server seem necessary.
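Whether a given peering arrangement can be partitioned by the loss of one server can be checked mechanically. The sketch below treats the flooding network as an undirected graph and tests server failures only, not link failures. The server names and topologies are invented, and the peer lists are assumed to be symmetric.

```python
def connected(peers, removed=None):
    """True if every remaining server can reach every other one."""
    nodes = [n for n in peers if n != removed]
    if not nodes:
        return True
    seen = {nodes[0]}
    stack = [nodes[0]]
    while stack:
        n = stack.pop()
        for m in peers[n]:
            if m != removed and m not in seen:
                seen.add(m)
                stack.append(m)
    return len(seen) == len(nodes)


def survives_single_failure(peers):
    """True if removing any one server leaves the rest connected."""
    return all(connected(peers, removed=n) for n in peers)


# A chain of three servers: losing the middle one partitions the network.
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
# Four servers with three peers each.
mesh = {"a": ["b", "c", "d"], "b": ["a", "c", "d"],
        "c": ["a", "b", "d"], "d": ["a", "b", "c"]}

print(survives_single_failure(chain))   # False
print(survives_single_failure(mesh))    # True
```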
An organization with more than one server can be viewed as a
single server by other organizations, with its servers flooding
each other and external peers spread among its servers. This
protects the network should the organization suffer large scale
problems while protecting the organization from single points
Do I need to tell the operators of other DCC servers the password for
controlling my server to turn on flooding?
No, you do not need to and generally should not tell other DCC
server operators the passwords for controlling your server with
the cdcc command. Every inter-server flood of checksums is
authorized by lines in each server's /var/dcc/flod file
and authenticated by the password associated with the
passwd-ID in those lines. The passwd-ID is a
server-ID defined in the /var/dcc/ids file that
should generally be used only to authenticate floods of
How can I figure out why flooding is not working?
Many DCC server problems can be diagnosed by turning on one or
more of the tracing modes in the server with the
cdcc trace operation or by restarting the server with
The cdcc flood list operation displays the current
flooding peers of a DCC server. Counts of checksum reports sent
and received to and from a single peer can be displayed with
cdcc "flood stats ID"
The positions in the local database of outgoing streams of
checksums are displayed by the start of dblist -Hv.
Why didn't the RTT reported by the cdcc info operation change
when my network topology changed?
The RTT or round trip time is an average value. Changes in
network topology, server load, and so forth are not immediately
reflected in the RTT to avoid switching DCC servers too
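The effect of averaging can be illustrated with an exponentially weighted moving average. This is only an illustration of how an averaged value lags behind a sudden change; the DCC client's actual averaging method and the weight of 0.125 used here are assumptions, not taken from the DCC source.

```python
def smooth_rtt(previous, sample, weight=0.125):
    """Blend a new RTT measurement into the running average."""
    return (1 - weight) * previous + weight * sample


rtt = 100.0                 # averaged RTT in milliseconds before the change
rtt = smooth_rtt(rtt, 20.0)
print(rtt)                  # 90.0 after the first faster measurement

for _ in range(20):         # many more measurements at the new, faster speed
    rtt = smooth_rtt(rtt, 20.0)
print(round(rtt, 1))
```

After a single measurement of 20 ms the average has moved only from 100 ms to 90 ms; it approaches the new value only after many more measurements.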
When my clients are configured to use SOCKS, they do not realize
immediately when a server is down.
When configured to use SOCKS, DCC clients cannot "connect" to a
server and so do not receive ICMP errors and must wait for
timeouts to know the server is not answering.
This document describes DCC version 1.2.74.