Why our numbers work

You may notice that our numbers are not as optimistic as those in vendors' marketing literature. There are four reasons for this:

1. Side effects from our test bed probably shaved a few points off each product's ability to identify spam.

2. We were very strict in our definition of false positives. Many false positives are mailing-list traffic of marginal value, and end users often don't count them when reporting errors: missing a few messages a month from a list that generates 10 a day doesn't bother them. That leniency contributes to the optimistic numbers vendors report based on user experience.

3. Because we ran our tests on more than 10,000 messages from a real-time mail stream, our results are more representative of real-world product behavior than vendors' canned or contrived tests. Even a few hours' delay in processing mail causes significant deviations in the performance of some products.

4. Most vendors choose to report false-positive rates by dividing false positives by the total number of messages processed. No statistician would do that: only legitimate mail can generate a false positive, so the proper denominator is the legitimate messages alone, not the whole stream. Some vendors don't explain what they mean by "false-positive rate" at all. We used statistics rigorously defined and agreed on by researchers, and it makes a dramatic difference: in our tests, computing false-positive rates the vendors' way would cut the numbers in half. For a detailed look at the statistics involved, see "What makes a false positive."
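To make the arithmetic concrete, here is a minimal sketch in Python of the two calculations. The message counts are hypothetical and assumed only for illustration; in particular, the 50/50 split between spam and legitimate mail is an assumption chosen to show how the choice of denominator halves the reported rate.

    # Hypothetical counts, assumed for illustration only; these are not
    # the article's test data. Assume half the stream is spam.
    total_messages = 10_000       # every message the filter processed
    legitimate_messages = 5_000   # the non-spam portion of the stream
    false_positives = 50          # legitimate mail wrongly tagged as spam

    # Vendor formula: divide by ALL messages processed.
    vendor_rate = false_positives / total_messages        # 0.0050

    # Strict formula: only legitimate mail can produce a false
    # positive, so divide by the legitimate messages alone.
    strict_rate = false_positives / legitimate_messages   # 0.0100

    print(f"vendor-style rate: {vendor_rate:.2%}")   # prints 0.50%
    print(f"strict rate:       {strict_rate:.2%}")   # prints 1.00%

Because spam makes up half of this hypothetical stream, the vendor denominator is twice as large and the reported rate is exactly half the strict one, consistent with the factor-of-two difference noted above.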
