Adventures in spam testing
By Joel Snyder, Network World, 12/20/2004
Testing routers and switches is easy. Frames go in, frames come out. With anti-spam products, nothing is ever easy.
We got into more shouting matches over this test than any other - and that was even before we published the results. Vendors
are intensely competitive, and the numbers are hard to come by. We worked hard to create a fair test, but that doesn't mean
every product will show its best side. Our complete methodology is described separately.
Main index: Spam in the Wild, The Sequel
The biggest sticking point was being the first hop. Anti-spam vendors have learned they can eliminate a huge pile of junk
right off the top by using a variety of blacklist techniques. Some also detect irregularities in the SMTP conversation,
which are signs of some spam-generator tools. The best products can apply blacklists wherever they are in the chain by
looking at headers in the message. But a surprisingly large percentage haven't figured out how to cope with not being the
top dog in the e-mail chain. Because no product was the first hop in our test bed, the setup probably shaved a few
percentage points off the best possible spam-catch scores.
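To make the header technique concrete, here is a minimal sketch of how a filter that is not the first hop might recover the
sending address from a message's Received headers and build a blacklist lookup name. It is an illustration, not any tested
product's method. The trusted relay address, the zone name dnsbl.example.org and both function names are assumptions made
up for this example.

```python
import re
from email import message_from_string

# Hypothetical: addresses of relays we run ourselves, to be skipped.
TRUSTED_RELAYS = {"192.0.2.10"}

# Matches a bracketed IPv4 address such as [203.0.113.7] in a Received header.
IP_RE = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def first_untrusted_ip(raw_message, trusted=TRUSTED_RELAYS):
    """Return the first relay address that is not one of our own, or None."""
    msg = message_from_string(raw_message)
    # Each hop prepends its Received header, so the newest hop comes first.
    for header in msg.get_all("Received", []):
        match = IP_RE.search(header)
        if match and match.group(1) not in trusted:
            return match.group(1)
    return None


def dnsbl_query_name(ip, zone="dnsbl.example.org"):
    """Build the reversed-octet name used for a DNS blacklist lookup."""
    return ".".join(reversed(ip.split("."))) + "." + zone
```

A filter sitting behind 192.0.2.10 would skip that hop and check the address that connected to it, which is the address a
first-hop product would have seen directly.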
We also had to deal with flaky anti-spam products. For several reasons, not every product was ready to immediately accept
every message the moment we received it. To deal with this, we had to have a real SMTP Message Transfer Agent (MTA) receive
each message and retransmit it to the products. That meant some of the tracks and traces of spammers that might be in
irregular or improperly created messages were obscured by our MTA.
A bigger issue in testing many products involved training. While some products - including several of our top finishers -
require no training, others asked for various degrees of pre-test preparation. In the worst case, several vendors asked us
to identify false positives and false negatives during a training period before testing. While we followed all the instructions
on tuning, the sheer number of products limited the amount of time we could spend on this task for each product. Vendors whose
products require significant tuning will argue they would leapfrog to the top of the list with more tuning time. But maybe
they wouldn't.
Several products also depend on environmental information to help them make better decisions. For example, if you send your
outbound mail stream through the anti-spam gateway, it knows who to expect responses from, and can reduce the false-positive
rate while increasing spam-catch rate. Our test bed didn't permit this type of configuration.
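As a rough illustration of that idea, the sketch below records the recipients of outbound mail so that later inbound mail
from the same addresses can be treated as expected. The class and method names are hypothetical, and a real gateway would
also need to expire old entries and guard against forged sender addresses.

```python
class CorrespondentTracker:
    """Remember who local users have written to (illustrative only)."""

    def __init__(self):
        self._known = set()

    def record_outbound(self, recipient):
        # Normalize so that case and stray whitespace do not cause misses.
        self._known.add(recipient.strip().lower())

    def is_expected_sender(self, sender):
        # True if a local user has previously sent mail to this address.
        return sender.strip().lower() in self._known
```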
The false-positive and false-negative rates we found are useful for comparing products, but a real installation will likely
have a lower false-positive rate and higher spam-catch rate. Because every product was handicapped in the same way, the
reported results offer a sound basis for comparing the performance of products. Comparing these statistics with those from
other tests, though, would not give valid results.
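For readers who want to run the same arithmetic on their own mail, the sketch below computes the two rates from a list of
hand-classified messages. It assumes the common definitions (false-positive rate as the share of legitimate mail flagged
as spam, spam-catch rate as the share of spam flagged), which may differ in detail from the test methodology. The function
name and input format are made up for this example.

```python
def filter_rates(results):
    """Compute (false_positive_rate, spam_catch_rate).

    results: iterable of (is_spam, flagged) boolean pairs, one per message.
    A rate is None when there are no messages of the relevant kind.
    """
    spam_total = spam_caught = legit_total = legit_flagged = 0
    for is_spam, flagged in results:
        if is_spam:
            spam_total += 1
            spam_caught += flagged  # True counts as 1
        else:
            legit_total += 1
            legit_flagged += flagged
    fp_rate = legit_flagged / legit_total if legit_total else None
    catch_rate = spam_caught / spam_total if spam_total else None
    return fp_rate, catch_rate
```

The false-negative rate is simply one minus the spam-catch rate under these definitions.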