Spam Filters

Robbert Haarman

2010-12-11

Introduction

This is a comparison of spam filters which I ran in April 2005, back when Mailvisa was young. Mailvisa has improved since then, as, no doubt, have the other filters. I have been thinking about running another comparison, but I haven't gotten around to it yet.

The filters were trained on the contents of my inbox and my spam box. The inbox contained 2655 good messages and no spam, the spam box contained 2189 spam messages and no good ones.

The tests were run on a Pentium MMX 200 MHz with 96 MB of RAM, running OpenBSD 3.6.

Annoyance Filter

About Annoyance Filter

From the manpage:

annoyance-filter uses Bayesian statistics to determine the probability an E-mail message is junk based on an analysis of its contents compared to collections of known junk and legitimate E-mail.

The current version of this program is always posted at: http://www.fourmilab.ch/annoyance-filter/.

Training

Annoyance Filter was trained by running annoyance-filter -j ~/mail/SPAM/cur -m ~/mail/Inglorion/cur --prune --write .annoyance-filter/dict.bin. This took 526.11s user, 9.96s system, and 536.07s total time, after which annoyance-filter ran out of memory.

Classifying Ham

No classification was done, as the training did not succeed.

Classifying Spam

No classification was done, as the training did not succeed.

Bogofilter

About Bogofilter

Bogofilter is a fast Bayesian spam filter, based on Paul Graham's "A Plan For Spam". It is intended to be used on sites that process a lot of mail.

It depends on the Berkeley database (the OpenBSD port gave me a choice of version 3 or 4).

Training

Bogofilter was trained by running bogofilter -n ~/mail/SPAM/cur, followed by bogofilter -s ~/mail/Inglorion/cur. The first command took 29.02s user time and 2.41s system time, for a total of 31.43s. The second command cost 391.39s user time and 5.80s system time, for a total of 397.19s. The total training time for Bogofilter was 428.62s.

Classifying Ham

Classification of ham was done by piping each message in ~/mail/Inglorion/cur to bogofilter. This required 427.28s user time and 107.53s system time, total 534.81s.
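The "pipe each message to the filter" step can be sketched as a short Python script. This is an illustration, not the script used for the measurements above: the function name classify_directory is made up here, and the reading of the exit codes (0 for spam, 1 for non-spam, 2 for unsure) is taken from the Bogofilter manual and may differ for other filters.

```python
import subprocess
from pathlib import Path


def classify_directory(maildir, command):
    """Pipe every file in `maildir` to `command` and tally the exit codes.

    `command` is an argument list such as ["bogofilter"]. What each exit
    code means depends on the filter; for Bogofilter, 0 is assumed to mean
    spam and 1 non-spam.
    """
    counts = {}
    for path in sorted(Path(maildir).iterdir()):
        if not path.is_file():
            continue
        # Feed the message on standard input, as a shell pipe would.
        with path.open("rb") as message:
            result = subprocess.run(command, stdin=message,
                                    stdout=subprocess.DEVNULL)
        counts[result.returncode] = counts.get(result.returncode, 0) + 1
    return counts
```

Calling classify_directory(Path("~/mail/Inglorion/cur").expanduser(), ["bogofilter"]) would return a dictionary mapping each exit code to the number of messages that produced it. Starting one process per message, as this does, is likely part of why the system time is so high relative to training.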

No messages were falsely classified as spam. This corresponds to a precision of 100%.

Classifying Spam

Classification of spam was done by piping each message in ~/mail/SPAM/cur to bogofilter. This required 58.22s user time and 86.86s system time, total 145.08s.

45 messages were wrongly classified as ham. This corresponds to a recall of 98%.

ifile

About ifile

ifile is a mail classification program written in C. It sorts mail into any number of mailboxes. For this comparison, a mailbox named `ham' and a mailbox named `spam' were used.

Training

Training was performed by running ifile -i spam ~/mail/SPAM/cur/*, followed by ifile -i ham ~/mail/Inglorion/cur/*. The CPU usage statistics for the former were 60.51s user, 2.98s system, 63.49s total. The latter command did not complete due to a memory fault.

Classifying Ham

No classification was performed, as the training did not complete successfully.

Classifying Spam

No classification was performed, as the training did not complete successfully.

Mailvisa

About Mailvisa

Mailvisa is primarily intended as a testbed for different ways to calculate the probability that a message is spam. It uses Bayesian filtering, after Paul Graham's "A Plan For Spam".
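For readers unfamiliar with the approach, the following is a rough Python sketch of the calculation described in Paul Graham's essay. It is not Mailvisa's (or Bogofilter's) actual code. The function names are invented for this example, and the constants (good counts doubled, a minimum of 5 occurrences, 0.4 for unknown tokens, clamping to 0.01 and 0.99, the 15 most telling tokens) are the ones given in the essay as I understand it.

```python
def token_probability(word, good, bad, ngood, nbad):
    """Estimate the probability that a message containing `word` is spam.

    `good` and `bad` map tokens to occurrence counts in the ham and spam
    corpora; `ngood` and `nbad` are the numbers of messages in each.
    """
    g = 2 * good.get(word, 0)   # good counts are doubled to bias against false positives
    b = bad.get(word, 0)
    if g + b < 5:
        return 0.4              # too little evidence: treat as an unknown token
    good_ratio = min(1.0, g / ngood)
    bad_ratio = min(1.0, b / nbad)
    return max(0.01, min(0.99, bad_ratio / (good_ratio + bad_ratio)))


def spam_probability(tokens, good, bad, ngood, nbad, top=15):
    """Combine the `top` most telling token probabilities into one score."""
    probs = [token_probability(t, good, bad, ngood, nbad) for t in set(tokens)]
    # The most telling tokens are those whose probability is farthest from 0.5.
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    spam, ham = 1.0, 1.0
    for p in probs[:top]:
        spam *= p
        ham *= 1.0 - p
    return spam / (spam + ham)
```

In the essay, a message is treated as spam when the combined score exceeds 0.9. A filter following this scheme can be tuned by changing any of the constants above, which is the kind of experiment a testbed like Mailvisa is meant for.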

Mailvisa is written in Ruby, and requires at least ruby 1.8 to run.

Training

Mailvisa was trained by running mailvisa add bad ~/mail/SPAM/cur/*, followed by mailvisa add good ~/mail/Inglorion/cur/*, and finally mailvisa calculate. The first command took 205.91s user time and 2.20s system time, for a total of 208.11s. The second command cost 1344.38s user time and 7.65s system time, for a total of 1352.03s. The calculation of the scores consumed 80.07s user and 0.62s system time; 80.69s in total. The total training time for Mailvisa was 1640.83s.

Classifying Ham

Classification of ham was done by piping each message in ~/mail/Inglorion/cur to mailvisa check -q. CPU usage was 1525s for the daemon, and 222.22s user, 146.61s system (368.83s total) for the front end. The combined total CPU usage was 1893.83s.

No messages were wrongly classified as spam. This corresponds to a precision of 100%.

Classifying Spam

Classification of spam was done by piping each message in ~/mail/SPAM/cur to mailvisa check -q. CPU usage was 1073.25s total for the daemon, and 183.80s user, 121.30s system (305.10s total) for the front end. The combined total CPU usage was 1378.35s.

429 messages were wrongly classified as ham. This corresponds to a recall of 80%.

Note: measurements on later versions of Mailvisa have shown recall percentages of up to 96%.
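The precision and recall percentages in this comparison follow from the message counts. The sketch below recomputes them for the two filters that completed training, assuming the usual definitions with spam as the positive class; the helper names are chosen for this example only.

```python
def precision(true_pos, false_pos):
    """Fraction of messages flagged as spam that really were spam."""
    return true_pos / (true_pos + false_pos)


def recall(true_pos, false_neg):
    """Fraction of all spam messages that were flagged as spam."""
    return true_pos / (true_pos + false_neg)


SPAM_TOTAL = 2189  # messages in the spam box

# filter name -> (spam classified as ham, ham classified as spam)
results = {"Bogofilter": (45, 0), "Mailvisa": (429, 0)}

for name, (missed, false_pos) in results.items():
    caught = SPAM_TOTAL - missed
    print(f"{name}: precision {precision(caught, false_pos):.1%}, "
          f"recall {recall(caught, missed):.1%}")
# Bogofilter: precision 100.0%, recall 97.9%
# Mailvisa: precision 100.0%, recall 80.4%
```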

Conclusions

Out of the filters tested, Bogofilter is the best by far. While neither Bogofilter nor Mailvisa yielded any false positives, Bogofilter's recall was 18 percentage points higher (98% against 80%), and it outperformed Mailvisa by almost a factor of 4 in training speed and almost a factor of 5 in filtering speed.
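The speed factors can be rechecked from the totals reported in the Bogofilter and Mailvisa sections. The sketch below assumes "filtering" means the ham and spam classification runs added together, and it counts the Mailvisa daemon's CPU time as well as the front end's.

```python
# Total CPU seconds, copied from the Training and Classifying sections.
bogofilter = {"training": 428.62, "ham": 534.81, "spam": 145.08}
mailvisa = {"training": 1640.83, "ham": 1893.83, "spam": 1378.35}

training_factor = mailvisa["training"] / bogofilter["training"]
filtering_factor = ((mailvisa["ham"] + mailvisa["spam"])
                    / (bogofilter["ham"] + bogofilter["spam"]))

print(f"training: {training_factor:.2f}x, filtering: {filtering_factor:.2f}x")
# training: 3.83x, filtering: 4.81x
```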

Unfortunately, the memory constraints of the machine used for testing prevented some of the contenders from running. This comparison should be repeated on a machine with more memory and using later versions of the participating spam filters.

Note: I have since performed tests on a more modern machine running Debian 3.1, but, alas, I lost the data for these tests. However, I remember that Annoyance Filter ran slower than Bogofilter and Mailvisa, and ifile, while fast, was very bad at classifying messages. Bogofilter scored a recall of 99%, Mailvisa 93%. Neither had any false positives.