
This issue arose in conversation recently, thus I thought it worth asking here. Which tools are state of the art these days in statistical spam filtering? (Free/open-source and running on Linux are both assumed.)

When I last looked (some years ago), the top options appeared to be, in no particular order:

- Spamassassin (rules + statistical classifier). It works well for many people; the statistical classifier used to be very memory-intensive, and I know people for whom SpamAssassin didn't give accurate results even after training.

- CRM114: this is what I am currently using for my incoming mail. It appears to be in the midst of a rewrite as a library with support for various scripting languages. My initial experiences with it weren't good, but I tried it again several years ago and, this time, it quickly surpassed SpamAssassin when trained to classify my mail.

- Dspam: also has a good reputation and seems to be maintained to some extent. When I last looked at it in detail, a number of years ago, there were plans to add interesting features allowing users to share filters, so that a new user wouldn't have to train it from an empty database and one user's training could affect other users' filters.

There were other projects around, but the above appeared to be the most sophisticated. So it's now 2013... Any changes? Comments?

This issue arose in conversation recently, thus I thought it worth asking here.
Which tools are state of the art these days in statistical spam filtering?
Not the question you asked, but I don't use statistical spam filtering, mainly because my spam filters are all front ends to other servers, and customers would rather the occasional extra spam get through than go to the trouble of doing any training. My approach (parts of which go against the flow) is:

- Never use a Junk Mail folder. Either deliver the email to the inbox or don't accept it (maybe causing the sender to get an NDR, but that's the responsibility of the sending server). This requires filtering at SMTP time, but that's how I do it anyway.

- I used to do sender callouts - test the sender's email address against the sender's MX. Some people howl with dismay at this idea ("won't somebody _please_ think of the bandwidth/cpu cycles"), but if you look at the big picture it's still a net win. A quick VRFY and then trivially rejecting email because the sender's address isn't valid is _way_ cheaper than the subsequent spam processing to determine that the email was actually spam (especially when using statistical analysis), or missing that it was spam and delivering it to a mailbox somewhere and having the user deal with it. Unfortunately there are some people who still think this is a bad idea (read http://www.backscatterer.org/?target=sendercallouts - it's a hoot!), and doing it gets you blacklisted, so I don't any more.

- Do recipient callouts. My spam servers are basically just relays that forward to a server somewhere, which is normally Exchange. Verify that the recipient is valid on the target server before doing any further processing.

- Use spamassassin (including RBLs).

- Use greylisting. I wrote my own here that has some smarts about trusting domains (eg bigpond) once a certain number of senders have been seen. I used to greylist for an hour, but only 15 minutes now, and only for email with a spamassassin score above some threshold. The idea is that by waiting a bit, the sender may get blacklisted in that time if I am the recipient of a new spam run.

- Only reject the email after DATA the first time it has been seen (except for sender callouts, which used to be rejected immediately), and keep a copy of the email on the server for a short time. The users have a mechanism to retrieve emails from this quarantine, which is useful when a password reset email is greylisted.

One idea I had for statistical spam filtering was to train based on RBLs: if the sender IP is blacklisted in an RBL (and not in my greylist app's whitelist, which covers most false positives), the email would be trained as 'spam'. I think that part would work, but I'd be concerned about training for non-spam, as plenty of spam comes from non-RBL'd addresses. I thought of using the spamassassin score, so that a really low score could nominate non-spam for training, but that may end up with a filter that is too polarised...

Is there any filtering app that can better detect phishing? So many times I see things like <A HREF="stealyourpassword.com.ru/somebank">www.somebank.com.au</A>, which should be an immediate red flag. I always read emails in plain text so I don't even see the original link, but it's not me I'm concerned about.

James
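A minimal sketch of that phishing check - flag mail where an anchor's visible text looks like a hostname that doesn't match the host its href actually points at. This is an illustration of the idea only, not a feature of any filter mentioned in this thread, and the class and function names are made up:

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class AnchorCollector(HTMLParser):
    """Collect (href, visible text) pairs from an HTML message body."""
    def __init__(self):
        super().__init__()
        self._href = None
        self._text = []
        self.anchors = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href", "")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.anchors.append((self._href, "".join(self._text).strip()))
            self._href = None

def host_of(s):
    """Hostname from a URL or from bare link text like 'www.example.com'."""
    if "://" not in s:
        s = "http://" + s
    host = (urlparse(s).hostname or "").lower()
    return host.removeprefix("www.")

def looks_phishy(href, text):
    """The visible text names one host but the link goes to another."""
    t = host_of(text)
    if "." not in t:          # visible text isn't host-like, nothing to compare
        return False
    h = host_of(href)
    # allow subdomain differences either way; deliberately naive
    return not (h.endswith(t) or t.endswith(h))

body = '<A HREF="stealyourpassword.com.ru/somebank">www.somebank.com.au</A>'
p = AnchorCollector()
p.feed(body)
href, text = p.anchors[0]
assert looks_phishy(href, text)   # the mismatch is the red flag
```

A real filter would also need to handle redirectors, IDN homoglyphs and lookalike domains (evil-somebank.com slips past this suffix check), but the core signal is exactly the one described above.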

On Thu, 2 May 2013, James Harper <james.harper@bendigoit.com.au> wrote:
. Never use a Junk Mail folder. Either deliver the email to the inbox or don't accept it (maybe causing the sender to get an NDR, but that's the responsibility of the sending server). This requires filtering at SMTP time but that's how I do it anyway.
I agree. I currently only run one server with a junk folder (as far as I recall), and that is a "pending" folder for mail which has a challenge-response message sent out (not my choice, I'm just paid to do sysadmin work).
. Use greylisting. I wrote my own here that has some smarts about trusting domains (eg bigpond) once a certain number of senders have been seen. I used to greylist for an hour but only 15 minutes now, and only for email with a spamassassin score above some threshold. The idea being that by waiting a bit the sender may get blacklisted in that time if I am the recipient of a new spam run.
Sounds nice, can you release it under the GPL?

One problem with statistical anti-spam measures is users who blindly put their "spam" into it as training without review. So when (not if) a legitimate message is classified as spam, the statistical system is trained to do that again... -- My Main Blog http://etbe.coker.com.au/ My Documents Blog http://doc.coker.com.au/

On Thu, May 02, 2013 at 11:04:23AM +1000, Russell Coker wrote:
I agree. I currently only run one server with a junk folder (as far as I recall), and that is a "pending" folder for mail which has a challenge-response message sent out (not my choice, I'm just paid to do sysadmin work).
sucks to have to do something so useless and crappy.

years ago (when the challenge-response idiocy was first starting), i wrote some procmail+perl code so that if the message was a C-R request, procmail would pipe it into my perl script, which would extract the confirmation URL and fetch it with lynx or curl or something. i.e. ALL C-R requests were automatically confirmed.

my attitude was that if backscattering bastards are going to outsource their spam checking to me just because my address had been used as the sender by some spamming scumbag, then i was going to make sure THEY got all their spam rather than me having to see it or deal with it for them.

fortunately, challenge-response was only a short-lived fad, and never became common. most people realised it was just backscatter spam that only made the spam problem worse.

i just looked for it now, but can't find it in my procmail scripts directory... i must have deleted it or lost it. i do recall that it was trivial to write, as C-R messages tend to be consistent and easily parsed.

craig -- craig sanders <cas@taz.net.au>
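The lost script is easy enough to re-imagine. A rough sketch of the same trick - not the original, and both the regex and the "first URL in the body is the confirmation link" assumption are mine:

```python
import re
import urllib.request

URL_RE = re.compile(r'https?://\S+')

def confirmation_url(message_body):
    """Pull the first URL out of a challenge-response message body.
    C-R messages tend to be consistent and easily parsed, so a crude
    'first link wins' rule is usually enough."""
    m = URL_RE.search(message_body)
    return m.group(0) if m else None

def auto_confirm(message_body):
    """Fetch the confirmation URL, i.e. confirm on the challenger's behalf."""
    url = confirmation_url(message_body)
    if url:
        urllib.request.urlopen(url)   # the lynx/curl step
```

procmail would pipe the body of any message matching a C-R signature into a tiny wrapper that calls auto_confirm(sys.stdin.read()).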

Thanks to all who have responded so far for the informative and insightful comments. This will certainly help when the issue arises for discussion again. There are no non-technical end-users of machines whose administrators are likely to discuss this with me, and all of the people involved use Linux mailers, so I am spared some of the complexities that have been discussed. As to my own set-up, Postfix rejects most of the spam during the SMTP negotiation (I have various standard checks in place), and CRM114 catches almost all the rest as part of my Procmail configuration. I haven't implemented grey-listing but I know people who have found it very effective. Block lists are also included in my Postfix configuration.

On Thu, 2 May 2013, Jason White <jason@jasonjgw.net> wrote:
There are no non-technical end-users of machines whose administrators are likely to discuss this with me, and all of the people involved use Linux mailers, so I am spared some of the complexities that have been discussed.
If you think that means you won't have users feeding unread messages from their "spam" folder into the statistical training system then you're wrong. I can think of one lead developer of a major Linux software development project who uses unread messages from his "spam" folder to train a Bayesian filter. I tried to explain why this is wrong, but he wasn't interested in learning. -- My Main Blog http://etbe.coker.com.au/ My Documents Blog http://doc.coker.com.au/

James Harper <james.harper@bendigoit.com.au> writes:
. Use greylisting. I wrote my own here that has some smarts about trusting domains (eg bigpond) once a certain number of senders have been seen. I used to greylist for an hour but only 15 minutes now, and only for email with a spamassassin score above some threshold. The idea being that by waiting a bit the sender may get blacklisted in that time if I am the recipient of a new spam run.
IIRC we greylist for one second. The fact that they're retrying *at all* shows they're not spammers. We also have to whitelist bigpond :-/

Other things you didn't mention are:

Laying your MXs out like this stops spammers that don't try >1 MX and spammers that try MXes in reverse order:

    10 null-mx.cyber.com.au.         <--- port 25 always closed
    20 mail.cyber.com.au.            <--- one of the middle pair
    30 exetel.cyber.com.au.          <--- ought to always work
    40 tarbaby.junkemailfilter.com.  <--- teergrube

We also use reject_unauth_pipelining to throw away peers if they don't wait for the server's response when they should.

We also use the spamhaus.org DNS RBL.

James Harper <james.harper@bendigoit.com.au> writes:
. Use greylisting. I wrote my own here that has some smarts about trusting domains (eg bigpond) once a certain number of senders have been seen. I used to greylist for an hour but only 15 minutes now, and only for email with a spamassassin score above some threshold. The idea being that by waiting a bit the sender may get blacklisted in that time if I am the recipient of a new spam run.
IIRC we greylist for one second. The fact that they're retrying *at all* shows they're not spammers. We also have to whitelist bigpond :-/
My solution doesn't require whitelisting bigpond, because it sees enough 'good' emails - enough with low spamassassin scores get whitelisted directly - that it sorts itself out within a week or so, probably less. Optus is (was?) the same in that they'd retry from different IP addresses.

My reasoning for greylisting for longer is that a new spam run can take a while to appear on the blacklists and other checksum validation sites, so delaying suspect email helps a bit, although I haven't done any measurement on this in years.
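The trust mechanism described here - stop greylisting a domain once enough distinct senders from it have delivered low-scoring mail - could be sketched like this. The thresholds and the shape of the code are my guesses, not the actual implementation:

```python
from collections import defaultdict

TRUST_SENDERS = 10   # distinct good senders before a domain is trusted (assumed)
SCORE_CUTOFF = 2.0   # spamassassin score below which mail counts as 'good' (assumed)

class DomainTrust:
    def __init__(self):
        # domain -> set of senders seen delivering low-scoring mail
        self.good_senders = defaultdict(set)

    def record(self, sender, sa_score):
        """Remember senders whose mail scored low."""
        domain = sender.rsplit("@", 1)[-1].lower()
        if sa_score < SCORE_CUTOFF:
            self.good_senders[domain].add(sender.lower())

    def should_greylist(self, sender, sa_score):
        """Greylist only suspect mail from domains we don't yet trust."""
        domain = sender.rsplit("@", 1)[-1].lower()
        if len(self.good_senders[domain]) >= TRUST_SENDERS:
            return False   # trusted domain; handles bigpond-style retries from new IPs
        return sa_score >= SCORE_CUTOFF   # only delay mail that already looks suspect
```

The point of keying trust on the domain rather than the connecting IP is exactly the Optus/bigpond case above: retries can come from a different IP, but the envelope domain stays the same.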
Other things you didn't mention are:
Laying your MXs out like this stops spammers that don't try >1 MX and that try MXs in reverse order.
    10 null-mx.cyber.com.au.         <--- port 25 always closed
    20 mail.cyber.com.au.            <--- one of the middle pair
    30 exetel.cyber.com.au.          <--- ought to always work
    40 tarbaby.junkemailfilter.com.  <--- teergrube
I did that in the late 90's, mainly because we were on a crap ISDN connection and Telstra (with no spam protection at all) was our secondary MX, so all the spam just went there. My greylist filter communicates between the primary and the secondary too, so the databases keep in sync.

One addition I have wanted to make for a while is, like your setup above, to track the connections between the MXes. So if I had a setup like yours:

- 10 then 20 = good (maybe reduce the spam score by a bit)
- 20 or 30 without trying 10 first = bad (maybe increase the spam score a bit)
- 40 without 10-30 = bad (maybe add to a blacklist score in the greylist database)

That by itself would be easy enough to implement given that I already communicate between them, but it's the exceptions that make it hard:

1. some MXes remember that the primary is down, so they go straight to the secondary for a while until the negative cache entry times out
2. what if 10 is broken, so I don't see that it hit 10 first and then 20?
3. what if 10-30 are all unreachable?

MXes that violate the standards are the main frustration I'm seeing. I'd love to say that "people who violate RFCs get what they deserve", but when the RFC violators are big companies like Telstra (for example; I think they've been pretty good lately though), your users aren't interested in detailed explanations about standards and why sticking to them is a good idea - they just want their email.
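The three rules above could be sketched as a score adjustment over the sequence of MX preferences a given client was seen to try. A hypothetical helper; the magnitudes are placeholders, not tuned values:

```python
def mx_path_score_delta(prefs_tried):
    """Spam-score adjustment from the ordered list of MX preferences a
    client connected to. Layout assumed: 10 = always closed primary,
    20/30 = real servers, 40 = teergrube."""
    if not prefs_tried:
        return 0.0
    if prefs_tried[0] == 10 and 20 in prefs_tried:
        return -1.0   # tried the (closed) primary first, then fell back: good
    if prefs_tried[0] in (20, 30):
        return +1.0   # skipped the primary entirely: suspicious
    if 40 in prefs_tried and not any(p in prefs_tried for p in (10, 20, 30)):
        return +3.0   # went straight for the teergrube: almost certainly a spambot
    return 0.0
```

The exceptions listed above (negative caching, a genuinely broken primary) are exactly why these deltas should nudge a score rather than reject outright.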
We also use reject_unauth_pipelining to throw away peers if they don't wait for the server's response when they should.
Yes, not waiting for a response is a big giveaway that you're talking to a spambot! James
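For reference, in Postfix that check is a single restriction. A hedged sketch of where it typically goes - the parameter is real, the placement is just the common recommendation:

```
# main.cf: reject clients that send SMTP commands without waiting for our
# responses (pipelining is only legal after ESMTP PIPELINING is offered)
smtpd_data_restrictions = reject_unauth_pipelining
```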

James Harper wrote:
James Harper <james.harper@bendigoit.com.au> writes:
Use greylisting. I wrote my own here that has some smarts about trusting domains (eg bigpond) once a certain number of senders have been seen. I used to greylist for an hour but only 15 minutes now, and only for email with a spamassassin score above some threshold. The idea being that by waiting a bit the sender may get blacklisted in that time if I am the recipient of a new spam run.
IIRC we greylist for one second. The fact that they're retrying *at all* shows they're not spammers. We also have to whitelist bigpond :-/
My solution doesn't require whitelisting bigpond, because it sees enough 'good' emails - enough with low spamassassin scores get whitelisted directly - that it sorts itself out within a week or so, probably less. Optus is (was?) the same in that they'd retry from different IP addresses.
Cool; I wasn't entirely sure that's what you were saying. Since it's so, I'd be interested in details/source code.

Jason White <jason@jasonjgw.net> writes:
CRM114: this is what I am currently using for my incoming mail. It appears to be in the midst of a rewrite as a library with support for various scripting languages. My initial experiences with it weren't good, but I tried it again several years ago and, this time, it quickly surpassed SpamAssassin when trained to classify my mail.
[Not really helpful for your question, but I'll brain-dump what I have.]

This is what we've been using. Because we have mutt users, and mutt doesn't implement IMAP COPY, it's impossible to trigger it via a dovecot hook. So we run it nightly and pass it find -type f -mtime 3 or thereabouts (so that users have a couple of days to classify the message). It was working MUCH worse before I changed -mtime +3 to -mtime 3, because in the old case it was retraining every day on old emails, so it ended up being VERY VERY certain about things it shouldn't.

I don't remember why we picked crm114, but we're deliberately using it only for managers -- the engineers don't get spam in the first place, so it avoids having to piss about checking =Spam occasionally for false positives. RSN I'll write down the various pre-body things we do to reduce spam. We're also running it on an Ubuntu 10.04 stack, so we're not up to speed with recent developments. :-)
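The -mtime subtlety is worth spelling out: with GNU find, -mtime 3 matches files whose age in whole days is exactly 3, while -mtime +3 matches everything older, so a nightly +3 run re-feeds the same old mail to the trainer night after night. A small demonstration (GNU find/touch assumed; the actual crm114 invocation is elided):

```shell
cd "$(mktemp -d)"
touch -d '80 hours ago' three-days-old   # age ~3.3 days -> whole-day age of 3
touch -d '10 days ago'  ten-days-old

find . -type f -mtime 3    # matches only ./three-days-old: each message trains once
find . -type f -mtime +3   # matches ./ten-days-old -- and would match it again
                           # every following night: the repeat-training trap
```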

On 01/05/13 17:58, Jason White wrote:
So it's now 2013... Any changes? Comments?
Last time I rebuilt my mail server I let my spam filtering configuration get very simple; it's now:

1. Postfix's internal sanity checks (eg, real hostname, real address, etc.)
2. A custom access list I wrote many years ago (essentially a DUL)
3. A set of access lists equivalent to SPF for major webmail domains
4. A few RBLs: Spamhaus Zen and the Barracuda RBL (almost all are blocked by the first, very few by the latter)

That combination results in almost as little spam as a full spamassassin setup used to, and has the benefit of rejecting everything at SMTP time, not elsewhere. The only maintenance required is some very occasional (~1/year) work to delete now-obsolete entries from the custom access list. -- Julien Goodwin Studio442 "Blue Sky Solutioneering"
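Item 4 in Postfix terms might look like the fragment below. A sketch only: the RBL names are the ones mentioned, the surrounding restrictions are illustrative, and list order matters (Zen first, so the Barracuda lookup only fires for the few that get past it):

```
# main.cf (fragment)
smtpd_recipient_restrictions =
    permit_mynetworks,
    reject_unauth_destination,
    reject_rbl_client zen.spamhaus.org,
    reject_rbl_client b.barracudacentral.org
```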

On Wed, 1 May 2013 05:58:31 PM Jason White wrote:
Which tools are state of the art these days in statistical spam filtering?
Not really state of the art, but I'm using Maia Mailguard (basically amavisd-new, SpamAssassin and ClamAV with a web front end). I occasionally write custom rules for SA. Works for me and Donna. ;-) cheers, Chris -- Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC This email may come with a PGP signature as a file. Do not panic. For more info see: http://en.wikipedia.org/wiki/OpenPGP
participants (9)
-
Chris Samuel
-
Craig Sanders
-
James Harper
-
Jason White
-
Julien Goodwin
-
Julien Goodwin
-
Russell Coker
-
Trent W. Buck
-
trentbuck@gmail.com