
This came up in conversation recently, so I thought it worth asking here: which tools are state of the art these days in statistical spam filtering? (Free/open-source and running on Linux are both assumed.)

When I last looked (some years ago), the top options appeared to be, in no particular order:

SpamAssassin (rules + statistical classifier): it works well for many people. The statistical classifier used to be very memory-intensive, and I know people for whom SpamAssassin didn't give accurate results even after training.

CRM114: this is what I am currently using for my incoming mail. It appears to be in the midst of a rewrite as a library with support for various scripting languages. My initial experiences with it weren't good, but I tried it again several years ago and, this time, it quickly surpassed SpamAssassin once trained to classify my mail.

DSPAM: it also has a good reputation and seems to be maintained to some extent. When I last looked at it in detail, a number of years ago, there were plans to add interesting features that would let users share filters, so that a new user wouldn't have to train it from an empty database and one user's training could affect other users' filters.

There were other projects around, but the above appeared to be the most sophisticated.

So it's now 2013... Any changes? Comments?