Do these numbers work for you?
By Joel Snyder, Network World, 09/15/2003
Testing anti-spam products fairly is not easy. We ran the best real-world test we could, and we think that it's a better test
than has ever been run before. However, there are some very valid complaints about our methodology. The good news is that
our numbers under-report how well these products work. In an enterprise environment, you probably would see fewer false
positives and better spam reduction with almost all of these products.
The first source of under-reporting is that the IP address of the sending system was not available to the spam gateway.
Because we had to re-transmit all of the spam to all participants at once, all the messages appeared to come from one
place: our mail servers. Most of the products include heuristics based on the IP address of the spam sender, such as
blacklists and other statistical measures. In our test, all of these features had to be disabled.
Our own tests using the MAPS RBL+ list (www.mail-abuse.org) show that a well-run list will give approximately a 10% to 20% reduction in unwanted e-mail with a very low false-positive
count. Other online lists have more draconian policies and will give a higher reduction in spam but also have a higher false-positive
count.
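To illustrate why the sender's IP address matters, here is a minimal Python sketch of how a DNS-based blacklist lookup is usually formed: the IPv4 octets are reversed, the list's zone is appended, and the sender is treated as listed if that name resolves. The zone `dnsbl.example.org`, the function names, and the injected `resolve` callback are assumptions for illustration only, not the interface of any product we tested.

```python
import ipaddress


def dnsbl_query_name(sender_ip: str, zone: str) -> str:
    """Build the DNS name to look up for an IPv4 sender on a blacklist zone."""
    addr = ipaddress.IPv4Address(sender_ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(sender_ip: str, zone: str, resolve) -> bool:
    """Return True if the blacklist has a record for this sender.

    `resolve` is a caller-supplied function (hypothetical here) that takes a
    DNS name and returns True when the name resolves.
    """
    return resolve(dnsbl_query_name(sender_ip, zone))


# Hypothetical zone name; a real deployment would use its chosen list's zone.
ZONE = "dnsbl.example.org"
print(dnsbl_query_name("192.0.2.99", ZONE))
```

When every message arrives from the same relay, as in our test, the lookup returns the same answer for every message, so it cannot help separate spam from legitimate mail.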
The second source of under-reporting is the lack of a feedback cycle. Our top-scoring products all let individual
users manage their own settings, quarantines, blacklists and whitelists. Normally, a user might run one of these anti-spam
products for a few weeks and use that time to tune his settings and, more importantly, his whitelist. Once the whitelist has
the most important correspondents in it, the settings can be tuned to be more aggressive, filtering more spam with a lower
false-positive count.
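The interaction between a whitelist and a more aggressive setting can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the numeric spam score, the thresholds, and the addresses are all invented, and real products combine many more signals.

```python
def classify(sender: str, spam_score: float, whitelist: set, threshold: float) -> str:
    """Deliver whitelisted senders; otherwise quarantine at or above the threshold."""
    if sender.lower() in whitelist:
        return "deliver"
    return "quarantine" if spam_score >= threshold else "deliver"


# Hypothetical mail: (sender, score assigned by the filter's heuristics).
mail = [
    ("boss@example.com", 6.0),      # legitimate, but looks a little spammy
    ("offers@bulk.example", 5.5),   # spam that slips under a cautious threshold
    ("friend@example.net", 1.0),    # ordinary legitimate mail
]

# Cautious setting, no whitelist: the boss is quarantined and the spam gets through.
before = [classify(s, score, set(), 6.0) for s, score in mail]

# Tuned setting: key correspondent whitelisted, threshold lowered.
after = [classify(s, score, {"boss@example.com"}, 5.0) for s, score in mail]

print(before)
print(after)
```

With the important correspondent protected by the whitelist, the lower threshold catches the extra spam without creating a false positive.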
Some anti-spam products depend so heavily on customization that they won't even work properly without a significant amount
of it. Most of these are client-side products. Systems using Bayesian filtering work very well in domain-specific
environments, but at the cost of having the user periodically train and re-train the system using both "good" and "unwanted"
e-mail. They also redefine what is meant by spam. A well-trained filter will mark mail from a mailing list as unwanted if it
is off-topic or on a sub-topic not interesting to the recipient. Spam is no longer just unwanted commercial bulk mail but can
be anything that the recipient isn't interested in. No vendor submitted any server-side products that require this level of
training.
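A toy Python sketch of the train-then-classify cycle shows why a Bayesian filter ends up learning the user's own definition of "unwanted." It is a simplified naive Bayes classifier with add-one smoothing, written for illustration; the class name, labels, and sample messages are assumptions and do not describe any vendor's implementation.

```python
import math
from collections import Counter

LABELS = ("good", "unwanted")


class TinyBayesFilter:
    """Word-count naive Bayes filter trained on user-labeled mail."""

    def __init__(self):
        self.word_counts = {label: Counter() for label in LABELS}
        self.message_counts = {label: 0 for label in LABELS}

    def train(self, text: str, label: str) -> None:
        self.word_counts[label].update(text.lower().split())
        self.message_counts[label] += 1

    def _log_score(self, words, label: str) -> float:
        counts = self.word_counts[label]
        total_words = sum(counts.values())
        # Vocabulary size across both labels; at least 1 to avoid dividing by zero.
        vocab = len(set(self.word_counts["good"]) | set(self.word_counts["unwanted"])) or 1
        total_msgs = sum(self.message_counts.values())
        # Smoothed prior, so an untrained label never produces log(0).
        score = math.log((self.message_counts[label] + 1) / (total_msgs + len(LABELS)))
        for word in words:
            score += math.log((counts[word] + 1) / (total_words + vocab))
        return score

    def classify(self, text: str) -> str:
        words = text.lower().split()
        return max(LABELS, key=lambda label: self._log_score(words, label))


f = TinyBayesFilter()
f.train("quarterly budget meeting agenda", "good")
f.train("project schedule review meeting", "good")
f.train("cheap pills buy now", "unwanted")
f.train("buy cheap watches now", "unwanted")
# The user also marks an off-topic mailing-list digest as unwanted.
f.train("knitting patterns digest", "unwanted")

print(f.classify("buy cheap pills"))
print(f.classify("knitting patterns digest"))
print(f.classify("budget meeting agenda"))
```

The off-topic digest is not commercial bulk mail, yet once the user labels it unwanted the filter treats it the same way as the pill advertisements.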