Distributed Checksum Clearinghouse Explained

Distributed Checksum Clearinghouse (also referred to as DCC) is a method of spam email detection.

The basic logic in DCC is that most spam is sent to many recipients, so a message body that appears many times is treated as bulk email. DCC identifies bulk email by calculating a fuzzy checksum of each message and sending it to a DCC server. The server responds with the number of times it has received that checksum. Each time a message is processed it adds 1 to the count for its checksum, so bulk mail can be identified by a high number in the response. The content itself is not examined. DCC works over UDP and uses little bandwidth.
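The counting logic described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: `toy_fuzzy_checksum`, `ToyClearinghouse`, `is_bulk`, and the threshold of 3 are hypothetical names and values, the normalization is a crude stand-in rather than the actual DCC checksum algorithm, and the server is an in-memory object instead of a real DCC server queried over UDP.

```python
import hashlib
import re
from collections import Counter


def toy_fuzzy_checksum(body):
    """Hypothetical stand-in for a fuzzy checksum; NOT the real DCC algorithm."""
    # Lowercase and collapse whitespace so trivial layout changes are ignored.
    normalized = re.sub(r"\s+", " ", body.lower()).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class ToyClearinghouse:
    """In-memory stand-in for a DCC server (a real one is reached over UDP)."""

    def __init__(self):
        self.counts = Counter()

    def report(self, checksum):
        # Each report adds 1 and returns the running total for that checksum.
        self.counts[checksum] += 1
        return self.counts[checksum]


BULK_THRESHOLD = 3  # assumed cutoff, chosen only for this sketch


def is_bulk(server, body, threshold=BULK_THRESHOLD):
    """Report the message and decide from the returned count alone."""
    return server.report(toy_fuzzy_checksum(body)) >= threshold
```

In this sketch the client never inspects what the message says. It only compares the returned count against a locally chosen threshold, which mirrors the idea that the content is not examined.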

DCC is resistant to hashbusters (random text added to spam to defeat exact checksums) because "the main DCC checksums are fuzzy and ignore aspects of messages. The fuzzy checksums are changed as spam evolves."^[1] DCC is likely to identify mailing lists as bulk email unless they are whitelisted. Likewise, repeatedly sending the same email to a server increases its count on that server and, therefore, the likelihood that others will treat it as spam.
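As a rough illustration of why a fuzzy checksum resists hashbusters and why whitelisting matters, the following Python sketch compares an exact hash with a toy fuzzy one. The normalization rule (dropping any token that contains a digit) is an assumption invented for this example and is not how DCC computes its checksums. `WHITELISTED_SENDERS`, `treat_as_bulk`, and the threshold are likewise hypothetical.

```python
import hashlib
import re


def exact_checksum(body):
    """Plain SHA-256 of the body; any changed byte gives a different value."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def toy_fuzzy_checksum(body):
    """Hypothetical fuzzy checksum; NOT the real DCC algorithm."""
    # Assumed rule: lowercase, drop tokens containing a digit (a crude way to
    # discard random hashbuster strings), and collapse whitespace.
    tokens = [t for t in body.lower().split() if not re.search(r"\d", t)]
    return hashlib.sha256(" ".join(tokens).encode("utf-8")).hexdigest()


# Hypothetical whitelist of senders whose bulk mail is wanted.
WHITELISTED_SENDERS = {"announce@lists.example.org"}


def treat_as_bulk(sender, reported_count, threshold=3):
    """Ignore the server's count for whitelisted senders."""
    if sender.lower() in WHITELISTED_SENDERS:
        return False
    return reported_count >= threshold


spam_a = "Buy cheap watches now xk29f7"
spam_b = "Buy cheap watches now q8zz41"
print(exact_checksum(spam_a) == exact_checksum(spam_b))
print(toy_fuzzy_checksum(spam_a) == toy_fuzzy_checksum(spam_b))
```

The two sample messages differ only in their random suffix, so their exact hashes differ while the toy fuzzy checksums match and would be counted together. A whitelisted mailing list is not flagged however high its count grows.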

History

According to the official DCC website:

The DCC is based on an idea of Paul Vixie and on fuzzy body matching to reject spam on a corporate firewall operated by Vernon Schryver starting in 1997. The DCC was designed and written at Rhyolite Software starting in 2000. It has been used in production since the winter of 2000/2001.

References

[1] Distributed Checksum Clearinghouses official website. http://www.rhyolite.com/dcc/