
Jason White <jason@jasonjgw.net> writes:
> CRM114: this is what I am currently using for my incoming mail. It appears to be in the midst of a rewrite as a library with support for various scripting languages. My initial experiences with it weren't good, but I tried it again several years ago and, this time, it quickly surpassed SpamAssassin when trained to classify my mail.
[Not really helpful for your question, but I'll brain-dump what I have.]

This is what we've been using. Because we have mutt users, and mutt doesn't implement IMAP COPY, it's impossible to trigger training via a dovecot hook. So we run it nightly and feed it the output of find -type f -mtime 3 or thereabouts (so that users have a couple of days to classify each message).

It was working MUCH worse before I changed -mtime +3 to -mtime 3. With +3, each nightly run matched all the older messages again, so it was retraining every day on the same old emails and ended up being VERY VERY certain about things it shouldn't.

I don't remember why we picked crm114, but we're deliberately using it only for managers. The engineers don't get spam in the first place, so leaving them out saves them from having to piss about checking =Spam occasionally for false positives.

RSN I'll write down the various pre-body things we do to reduce spam. We're also running it on an Ubuntu 10.04 stack, so we're not up to speed with recent developments. :-)
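
For anyone puzzled by why -mtime +3 versus -mtime 3 matters so much, here is a rough Python sketch of the difference. It only models how find counts age in whole 24-hour periods (the fractional part is dropped); it does not call find or crm114. The function names, the 6-hour delivery offset, and the 10-night window are all made up for illustration.

```python
DAY = 24 * 60 * 60  # seconds in one 24-hour period


def age_in_days(age_seconds):
    """Whole 24-hour periods elapsed; find discards any fractional part."""
    return age_seconds // DAY


def matches_mtime(age_seconds, n, plus=False):
    """Model `find -mtime n` (plus=False) or `find -mtime +n` (plus=True)."""
    days = age_in_days(age_seconds)
    return days > n if plus else days == n


def nightly_hits(n, plus, nights=10, offset_seconds=6 * 3600):
    """Return the nights (1-based) on which a single message is selected.

    Assumption: the message arrives offset_seconds before the first
    nightly run, and the job then runs once every 24 hours.
    """
    hits = []
    for night in range(1, nights + 1):
        age = offset_seconds + (night - 1) * DAY
        if matches_mtime(age, n, plus):
            hits.append(night)
    return hits


if __name__ == "__main__":
    print("-mtime 3 :", nightly_hits(3, plus=False))
    print("-mtime +3:", nightly_hits(3, plus=True))
```

Under those assumptions, -mtime 3 selects the message on exactly one night, while -mtime +3 selects it again on every run once it is old enough, which is the repeated-retraining effect described above.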