You may notice that our numbers are less optimistic than those in vendors' marketing literature. There are four reasons for this:
1. Side effects from our test bed probably shaved a few points off each product's ability to identify spam.
2. We were very strict in our definition of false positives. Because many of the false positives are mailing-list traffic of marginal use, end users often don't count them when reporting errors. Missing a few messages a month from a list that generates 10 a day doesn't bother them. This contributes to the optimistic numbers that vendors report based on user experience.
3. Because we ran our tests on more than 10,000 messages from a real-time mail stream, our results are more representative of real product response than canned or contrived tests from vendors. Even a few hours' delay in processing mail causes significant deviations in performance of some products.
4. Most vendors choose to report false-positive rates by dividing false positives by the total messages processed. No statistician would do that. Some vendors don't explain what they mean by "false-positive rate." We used statistics rigorously defined and agreed on by researchers, and it makes a dramatic difference. In our tests, computing false-positive rates the vendor way would cut the numbers in half. For a detailed look at the statistics involved, see "What makes a false positive."
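To see why the choice of denominator matters, here is a minimal sketch in Python. It assumes one common textbook definition of false-positive rate, false positives divided by all legitimate messages; the exact statistic used in these tests is described in the sidebar cited above and may differ. The message counts are hypothetical values chosen for illustration, not results from the test.

```python
def vendor_style_rate(false_positives: int, total_messages: int) -> float:
    """False positives divided by every message processed, spam included."""
    return false_positives / total_messages


def false_positive_rate(false_positives: int, true_negatives: int) -> float:
    """False positives divided by all legitimate messages (assumed definition)."""
    return false_positives / (false_positives + true_negatives)


# Hypothetical mail stream: half spam, half legitimate mail.
total_messages = 10_000
legitimate_messages = 5_000
false_positives = 50  # legitimate messages wrongly marked as spam
true_negatives = legitimate_messages - false_positives

vendor = vendor_style_rate(false_positives, total_messages)
strict = false_positive_rate(false_positives, true_negatives)

print(f"Vendor-style rate:       {vendor:.2%}")
print(f"Per legitimate message:  {strict:.2%}")
```

With a stream that is half spam, the vendor-style figure comes out at half the per-legitimate-message figure, because the spam messages inflate the denominator without ever being able to produce a false positive.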