Running the numbers on source verification

Opinion
Sep 01, 2003

We ended last week’s discussion about how to combat spam by mentioning a technique called source verification, along with the feedback readers sent about their experiences with a system I am experimenting with.

The way it works is this: If your address isn’t known, my source verification system puts your message on hold and sends a note asking you to respond, the idea being that spammers operating out of temporary accounts can’t write back. When a legitimate sender replies, the system releases the original message.
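The hold-and-release flow described above can be sketched in a few lines of Python. This is only an illustration, not the system the column is testing: the class name, method names and the in-memory storage are all hypothetical, and a real deployment would also need to send the challenge note and expire unanswered messages.

```python
from dataclasses import dataclass, field


@dataclass
class SourceVerifier:
    """Toy challenge/response filter. All names here are hypothetical."""

    known: set = field(default_factory=set)    # addresses already verified
    held: dict = field(default_factory=dict)   # sender -> messages on hold

    def receive(self, sender, message):
        """Deliver mail from known senders; hold and challenge everything else."""
        if sender in self.known:
            return "deliver"
        # Unknown address: park the message until the sender answers.
        self.held.setdefault(sender, []).append(message)
        return "challenge"

    def challenge_answered(self, sender):
        """The sender wrote back: remember the address and release held mail."""
        self.known.add(sender)
        return self.held.pop(sender, [])
```

A sender who never answers simply leaves mail sitting in `held`, which is the fate the column expects for most spam.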

I’ve had some interesting questions about potential source verification drawbacks. Reader Bill Neuendorff commented: “Looks to me like source verification would almost triple the bandwidth gobbled by just the spam alone.”

Indeed, it does look like that at first blush. We are, after all, receiving something, sending something back out and then getting still more mail in the final phase. But the source verification process adds only about 10% to bandwidth overhead.

Let’s do the math.

Say your average user gets 100 messages a day at 30K bytes each, which is 3M bytes in total. Assuming a spam ratio of 75%, that means 75 are spam while the other 25 are from legitimate sources.

Every piece of mail that doesn’t have a recognized address is challenged with, in the case of the system I’m using, a 3K-byte message.

So the system challenges all 75 pieces of spam. And presuming that 10% of the 25 legitimate pieces of mail are from new addresses, it also challenges about three real messages (2.5, rounded up). That means the system generates 78 challenges at 3K bytes each, or 234K bytes of challenges.

Let’s presume the three legitimate senders respond with replies of about 3K bytes each and generate another 9K bytes of incoming traffic, while 10% of the spammers (7.5 of the 75) used real addresses and also respond, generating another 22.5K bytes of traffic. That adds up to about 32K bytes of returning mail.

Add that 32K bytes to the 234K bytes worth of challenges and you get a measly 266K bytes of challenge/response traffic, which is less than 9% of the 3M bytes of messages coming in per employee per day. And the source verification system is passing on only 10% of spam: a 90% spam reduction with no human interaction and no false positives.
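For readers who want to plug in their own numbers, the per-user arithmetic can be reproduced with a short Python sketch. The constants are the column's figures; the function name is hypothetical, and the 3K-byte reply size is an assumption inferred from the 9K-byte and 22.5K-byte totals rather than something stated outright.

```python
import math

# Figures from the column. REPLY_KB is an assumption (see the note above).
MESSAGES_PER_DAY = 100
MESSAGE_KB = 30
SPAM_PERCENT = 75
NEW_SENDER_PERCENT = 10      # share of legitimate mail from unknown addresses
SPAMMER_REPLY_PERCENT = 10   # share of spammers with working return addresses
CHALLENGE_KB = 3
REPLY_KB = 3                 # assumed: a reply is as large as the challenge


def per_user_overhead():
    """Return one user's daily challenge/response traffic, in K bytes."""
    spam = MESSAGES_PER_DAY * SPAM_PERCENT // 100              # 75 messages
    legit = MESSAGES_PER_DAY - spam                            # 25 messages
    # 10% of 25 is 2.5; the column rounds this up to "about three".
    new_legit = math.ceil(legit * NEW_SENDER_PERCENT / 100)

    challenges_kb = (spam + new_legit) * CHALLENGE_KB
    replying_spammers = spam * SPAMMER_REPLY_PERCENT / 100     # 7.5 on average
    replies_kb = (new_legit + replying_spammers) * REPLY_KB
    inbound_kb = MESSAGES_PER_DAY * MESSAGE_KB

    total_kb = challenges_kb + replies_kb
    return {
        "challenges_kb": challenges_kb,
        "replies_kb": replies_kb,
        "total_kb": total_kb,
        "overhead": total_kb / inbound_kb,
    }
```

Unrounded, the sketch gives 31.5K bytes of replies and 265.5K bytes in total, which the column rounds to 32K and 266K; either way the overhead stays just under 9%.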

Let’s look at this from a corporate perspective: In a 1,000-person company, source verification adds about 266M bytes worth of traffic to the 3G bytes of messages flowing in.

Now, let’s say you have a T-1 line that costs $600 per month, or roughly $20 per day. If we assume that messaging volume in our hypothetical corporation uses about 50% of the line’s capacity, mail accounts for about $10 of that daily cost, and the roughly 9% of extra traffic from source verification is only going to increase that from $10 to $11 per day.
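The same back-of-the-envelope costing can be written as a small Python function. It is a sketch of the column's simple model, in which the line's cost is prorated by the share of capacity that mail consumes; the function name, the 30-day month and the 0.0885 default (the unrounded overhead from the per-user figures) are assumptions made for illustration.

```python
def daily_mail_cost(monthly_line_cost=600.0, days_per_month=30,
                    mail_share=0.50, overhead=0.0885):
    """Return (cost without verification, cost with it) in dollars per day."""
    line_per_day = monthly_line_cost / days_per_month   # about $20 for the T-1
    mail_cost = line_per_day * mail_share               # mail's share of the line
    with_verification = mail_cost * (1 + overhead)      # add challenge/response traffic
    return mail_cost, with_verification
```

With the defaults this returns $10 and roughly $10.89 a day, which rounds to the column's $11.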

Compare that increased cost with the value of reducing the productivity cost. For that same organization, reducing spam by 90% cuts productivity costs from a total of around $2,250 per day to just $340 (see the spreadsheet for how to calculate the productivity cost – to get the latter number, change Line 6 to 10% and Line 14 to 55%). This makes the increased bandwidth cost as important as a rounding error in the coffee fund.

Given the cost of bandwidth compared with the cost of handling spam, I think source verification is a powerful and low-cost solution.

Of course, there are concerns. Reader Jim Becker wrote, “It may be a good idea to bring up the fact that these systems are not compatible with many electronic transactions that we use these days . . . many messages are server-produced in e-commerce, for example, and [challenge/response] systems will eliminate valid messages.”

Becker is right, and one answer is to use “special” e-mail addresses for those transactions or to add those sites to the source verification whitelist.

Bounce your thoughts to Backspin at gibbs.com.

Mark Gibbs is an author, journalist, and man of mystery. His writing for Network World is widely considered to be vastly underpaid. For more than 30 years, Gibbs has consulted, lectured, and authored numerous articles and books about networking, information technology, and the social and political issues surrounding them. His complete bio can be found at http://gibbs.com/mgbio
