Test: Spam in the wild

We throw real traffic at 16 anti-spam products.

Practically every vendor on the planet claims to be able to solve your spam woes. So we tested 16 products on a live production network to see who could back those claims. For the entire month of June, we threw a live mail stream, spam and all, at the products to see who could survive the spam onslaught, and who would choke.

Estimates of the amount of unwanted e-mail range from 40% to 75%, but we can give you an exact percentage - 69%. That's how much spam we saw during the month of June. And things are getting worse, not better - in a similar Network World test we ran in February (see review), only 50% of the mail stream was spam.

Fortunately, good products prevailed, and can help you significantly reduce your spam problem. With a very broad field, including service-based mail filters, appliances and traditional software on Unix and Windows, network managers should be able to solve their spam problem with a minimum of disruption - to the accolades of their users.

How well do they work?

We tested mail-filtering gateways by feeding them an e-mail stream in real time, as it came into our labs (see "How we did it"). Each product received two scores. The first score, sensitivity, measures how well the filter identified spam. A perfect score would be 100%. The second score is the false-positive rate, the ability of the filter to make sure that non-spam messages do not get tagged as spam. A perfect false-positive rate would be 0%. (For more about the different ways to measure a spam filter, see "Spam and statistics".)
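To make those two scores concrete, here is a minimal sketch of how they are calculated, using hypothetical message counts rather than figures from our test:

# Sketch of the two scores used in this review, with made-up counts.
# Sensitivity: what fraction of spam the filter caught (perfect score: 100%).
# False-positive rate: what fraction of legitimate mail was wrongly flagged
# as spam (perfect score: 0%).

spam_total = 10_000      # hypothetical: spam messages in the stream
spam_caught = 9_200      # hypothetical: spam the filter flagged
legit_total = 4_500      # hypothetical: legitimate messages in the stream
legit_flagged = 45       # hypothetical: legitimate mail flagged as spam

sensitivity = spam_caught / spam_total
false_positive_rate = legit_flagged / legit_total

print(f"sensitivity: {sensitivity:.1%}")                 # 92.0%
print(f"false-positive rate: {false_positive_rate:.1%}") # 1.0%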

Spam filtering involves an inherent trade-off: tuning a filter for high sensitivity naturally drives up its false-positive rate, while tuning for a low false-positive rate lets more spam slip through. We feel that enterprise network managers would be more concerned with false positives, so we asked the vendors to tune their products for a false-positive rate of about 1%.

Products from seven companies - Cloudmark, Corvigo, MailFrontier, MX Logic, Postini, Trend Micro and Tumbleweed Communications - met our 1% requirement.

To identify the top products in filtering spam, we looked for a sensitivity rate of at least 80%. Products from seven companies also met that level - ActiveState, Cloudmark, Computer Mail Services, MailFrontier, Postini, Singlefin and Tumbleweed. (For complete results, see graphic.)

Combining these lists gives us the top overall performers: Cloudmark's Authority, MailFrontier's Anti-Spam Gateway (ASG), Postini's Perimeter Manager and Tumbleweed's Messaging Management System (MMS).

Accuracy and false positives

Of course, your results will vary, depending on your own message-stream characteristics and how well you tune the products. For example, Postini's spam-detection engine is at the heart of Trend Micro's recently released Spam Prevention Service (SPS). However, we got very different results with the two products, largely because Postini officials told us to tune their product using one set of numbers, while the Trend Micro team gave us a different set. This resulted in both a higher false-positive rate and lower spam sensitivity for Trend Micro.

Many vendors predicted that their false-positive rate would be much lower than 1%. Corvigo's CTO said its customers report a false-positive rate between 10 and 100 times better than our tests showed. That's easy to understand, because most of the false positives we saw fell into the category of "mail that wouldn't be missed by users," such as news stories forwarded by friends, e-mail from online merchants and postings to mailing lists. For example, Postini, which had the lowest false-positive rate, mistakenly marked 28 legitimate messages as spam. Of those, only five were messages we actually wanted to see. If we hadn't been combing our mail carefully, we wouldn't have noticed those messages were missing.

Some false positives were understandable, but regrettable. A message with the subject line "IOS fw guru" looked like spam to many of the filters, but turned out to be a job offer for our test lab.

Tuning to improve performance

Most of the products can be tuned to increase sensitivity and decrease false positives, but how this tuning is accomplished and who is responsible for it makes all the difference. There are two main tools used in tuning mail filters. The first is the threshold that determines whether a message is spam; the second, covered below, is whitelist and blacklist management. The best products offer a range of levels, often expressed as percentages, reflecting the filter's confidence that a message is spam.

(See graphic: Anti-spam performance.)

Cloudmark's Authority is an excellent example of this. Each message passing through the system is assigned a number from 1 to 99, indicating Authority's confidence that the message is spam; the higher the number, the more likely the message is spam. The system manager picks actions based on these thresholds. If a message scores 99 (almost certainly spam), it is dropped. At scores of 88 or higher, the message is probably spam, and the system manager might choose to add "[SPAM]" to the subject line before passing it on to the end user.
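As a rough illustration of how such threshold-based actions work, here is a minimal Python sketch; the score bands and action names mirror the example above but are our own, not Cloudmark's actual configuration or API:

# Hypothetical mapping from a 1-99 spam-confidence score to an action.
def action_for_score(score):
    """Map a spam-confidence score to an administrator-chosen action."""
    if score >= 99:                  # almost certainly spam: drop it
        return "drop"
    if score >= 88:                  # probably spam: tag and deliver
        return "tag subject with [SPAM]"
    return "deliver unchanged"       # below threshold: pass through

for score in (99, 92, 40):
    print(score, "->", action_for_score(score))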

In a corporation, using different thresholds and creating different actions for different users are important for a successful implementation. The products with the best control features were Corvigo's MailGate and Postini's Perimeter Manager. With Perimeter Manager, an end user can log on to the spam management interface at any time and adjust his settings up or down as needed. Vircom's modusGate also supported per-user settings, but only under the control of the system administrator. Many other products, including MX Logic's Email Threat Management Service, ActiveState's PureMessage, SurfControl's E-Mail Filter, EasyLink's MailWatch and Tumbleweed's MMS, included per-domain or per-group settings, all controlled by the network manager. If you can divide your users into domains, this feature might work well for you. This doesn't mean that some of the other products couldn't do per-domain settings; it's just that configuring this feature through their interfaces was so clumsy that it obviously wasn't part of the product design.

Black and white (lists)

A corollary to spam thresholds is the management of whitelist and blacklist membership. Of these, whitelists are the most important - the list of senders that always should be passed through and never considered spam.

Blacklists are the opposite: mail from any sender on the list is always treated as spam. The theory is that an aggressive spam filter will show fewer false positives and higher sensitivity if it has a good whitelist from which to operate. The products we tested treated the whitelist with differing degrees of importance. Several have automatic whitelisting as an integral part of the product - the system adds an address to the user's or the company's whitelist as soon as the user sends a message to it. MailFrontier's ASG does this by monitoring the logs of your outgoing mail server, while Corvigo's MailGate and GFI's MailEssentials would do this if you used their products as your outgoing mail relay.
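The mechanics of automatic whitelisting are simple. The sketch below, with hypothetical data structures of our own (not any vendor's implementation), shows the basic idea: every address a user sends mail to is added to that user's whitelist, so replies are never treated as spam.

# Minimal sketch of automatic whitelisting driven by outgoing mail.
from collections import defaultdict

whitelists = defaultdict(set)   # user address -> set of allowed senders

def record_outgoing(sender, recipients):
    """Called for each outgoing message (e.g., parsed from relay logs)."""
    for rcpt in recipients:
        whitelists[sender].add(rcpt.lower())

def is_whitelisted(user, from_addr):
    """True if this sender should bypass the spam filter for this user."""
    return from_addr.lower() in whitelists[user]

record_outgoing("alice@example.com", ["bob@partner.example"])
print(is_whitelisted("alice@example.com", "bob@partner.example"))  # True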

But automatic whitelisting has its own dangers: If a user turns on an "on vacation" auto-reply, every spammer who sends him a message gets added to his whitelist. MailFrontier's team responded to this issue by telling us that people in corporations never use vacation messages anymore. We think that's crazy, and we strongly disagree.

Many products supported per-user whitelist/blacklist settings under user control, including Corvigo's MailGate, MailFrontier's ASG, MX Logic's Email Threat Management Service, Postini's Perimeter Manager and Singlefin's E-mail Protection Services. For Singlefin, this is a critical feature: Because its spam filter has no threshold tuning available, whitelists and blacklists are the only tools available to improve performance. With Singlefin's very high false-positive rate, only a comprehensive whitelist would make this perform acceptably.

All the other products supported per-system or per-group blacklists and whitelists, although sometimes the facilities for managing them were clumsy.

Not all whitelists are just for "allowed senders." Some products, including MailWatch, MMS, modusGate, Praetor, PureMessage and SurfControl, let you build whitelist rules on elements such as message content. For example, a company might want to whitelist messages that mention one of its product names in the body, especially if they are sent to the sales team.
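A content-based whitelist rule is easy to picture. This sketch uses invented product names; real products express such rules through their own policy interfaces:

# Hypothetical content-based whitelist: mail mentioning one of the
# company's product names is never treated as spam.
PRODUCT_NAMES = {"widgetpro", "gadgetmax"}   # made-up product names

def content_whitelisted(body):
    text = body.lower()
    return any(name in text for name in PRODUCT_NAMES)

print(content_whitelisted("Please quote 200 units of WidgetPro."))  # True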

Three products failed to provide adequate tuning facilities that might have fixed their poor performance. Clearswift's Mailsweeper 4.3 (a new version was released in August and did not make our test) has no ability to tune spam thresholds - a message either is or is not spam - yet its built-in rules caught less than half the spam they could have, with a false-positive rate eight times higher than the best products'. Both GFI's MailEssentials and Computer Mail Services' Praetor used the same technique: If a bad word or phrase appears even once in a message, it must be spam.

This simplistic approach to filtering spam just won't work, as demonstrated by the very high number of false positives MailEssentials and Praetor generated. Both products let a network manager edit the rules used to filter spam, but the inherent limitation of this technique is that the manager would be tuning forever. With so many other products that work better out of the box, why would you want to? GFI acknowledges the limitations of its current approach: The company is adding Bayesian filtering to Version 9 of its product, to be released later this year, which should improve overall performance. One product submitted for this test, Gordano Messaging Server, was dropped because it not only uses the same poor algorithm as MailEssentials and Praetor, but it also ships with an empty rule set - the vendor doesn't even provide an initial guess at a word list.
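To see why one bad word is such a blunt instrument, consider this toy version of the keyword approach; the word list is invented for illustration and is not drawn from either product:

# Toy single-match keyword filter: any listed word, even once, marks the
# message as spam - the technique criticized above.
BAD_WORDS = {"lottery", "refinance", "viagra"}

def keyword_filter(body):
    words = (w.strip(".,!?") for w in body.lower().split())
    return any(w in BAD_WORDS for w in words)

# A legitimate news item trips the filter just as readily as real spam:
print(keyword_filter("Mortgage rates fell, so many homeowners refinance."))  # True
print(keyword_filter("You have won the international lottery!!!"))           # True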

What happens to spam?

The next area where products quickly differentiated themselves was in their ability to manage spam once it was identified. All the products, except EasyLink's MailWatch, can tag spam, typically by adding a string to the subject line (such as "[SPAM]"), adding a header to the message (such as "X-Spam: yes") or both. This method is the lowest level of spam handling because the message still hits the corporate mail server and has to be identified and managed by the end user. Although all modern mail clients can file tagged messages into folders (or delete them outright), we think large enterprise networks will want to keep spam as far from their e-mail servers as possible.
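Tagging itself is mechanically trivial, which is part of why it is the lowest bar. Here is a minimal sketch using Python's standard email library; the marker strings are the examples mentioned above:

# Tag a message as spam by adding a header and prefixing the subject.
from email.message import EmailMessage

def tag_as_spam(msg):
    msg["X-Spam"] = "yes"                                        # add header
    msg.replace_header("Subject", "[SPAM] " + msg["Subject"])    # prefix subject
    return msg

m = EmailMessage()
m["From"] = "someone@example.com"
m["Subject"] = "You may already be a winner"
tag_as_spam(m)
print(m["Subject"])   # [SPAM] You may already be a winner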

Blocking the message at the anti-spam gateway means having a quarantine facility. As the filter identifies spam, it goes into the quarantine instead of the user's mailbox.

Corvigo, MailFrontier, MX Logic, Postini and Singlefin all offered a per-user quarantine, which the user can manage via a Web portal. With a few clicks, a user can see his quarantined messages, release false positives and immediately add the sender to his whitelist. This is a very scalable way to handle spam and gives the user maximum control and flexibility.

Three other products, ActiveState's PureMessage, Cloudmark's Authority and Vircom's modusGate, used a mail-based quarantine: Users get periodic notification of their spam in an e-mail, and they can click or reply to the message to release messages for delivery. This isn't as clean or desirable an approach. For example, we found things such as license keys often are marked as spam because they are short messages with lots of nonsense words in them. If you had to wait a day for the quarantine notification to show up so you could release that one urgent license key you've been waiting for, it wouldn't be a happy day.

Of course, not every company needs a per-user quarantine. If you don't care whether messages get onto your mail servers, you can tag the spam and make users responsible for receiving and managing their own spam. Other products have per-system or per-group quarantine settings. The theory is that this lets a full-time e-mail administrator paw through the quarantine and take the burden away from end users. The problem with this is that it doesn't work - our own experiment in picking through every message showed us just how difficult it can be to decide whether a message is spam. But in some scenarios (such as small offices or environments where e-mail is tightly controlled), these approaches might work.

Different types of anti-spam

We also discovered that there are two major kinds of products in the anti-spam game today. Some are custom-built and aimed at spam filtering, such as MailFrontier's ASG, Cloudmark's Authority, Trend Micro's SPS and Corvigo's MailGate. Others come at spam filtering from the general policy enforcement, content management and mail firewall side of the house (GFI's MailEssentials, Clearswift's Mailsweeper, and Tumbleweed's MMS fit into this category).

One of the many differences in these products is in the available set of actions for messages. For example, Trend Micro's SPS has only two possible actions: Add something to the subject line, or redirect the message to some other mailbox (or both). However, if you bundle SPS with Trend Micro's older InterScan Messaging Security Suite, a popular mail content-management system, you get five more things you can do with e-mail, including a systemwide quarantine or simple deletion.
