Filtering spam messages is a thankless job for software.
For every 100 spam e-mails, one message usually gets through, an irritating pitch with links to Web sites selling questionable drugs or sketchy Rolexes.
The links contained within spam are one indicator in determining whether it should be blocked. Often after a large spam run, the addresses of spammy Web sites will be added to blocklists that are used by antispam software to cull future messages with those links.
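The blocklist check described above can be sketched in a few lines of Python. This is only an illustration, not how any particular antispam product works: the hostnames in `BLOCKLIST`, the URL pattern, and the function name `blocked_hosts` are all assumptions made up for the example.

```python
import re
from urllib.parse import urlparse

# Hypothetical blocklist of hostnames collected after earlier spam runs.
BLOCKLIST = {"cheap-pills.example", "replica-watches.example"}

# Simple pattern for well-formed http/https URLs in a message body.
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)

def blocked_hosts(message):
    """Return the set of blocklisted hostnames linked from the message."""
    hosts = set()
    for url in URL_RE.findall(message):
        host = urlparse(url).hostname  # hostname is lowercased by urlparse
        if host:
            hosts.add(host)
    return hosts & BLOCKLIST

sample = 'Buy now: <a href="http://cheap-pills.example/order">here</a>'
if blocked_hosts(sample):
    print("message would be culled")
```

A check like this only works when the filter can recognize the link in the first place, which is the weakness spammers go after.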
To get around it, spammers construct e-mails with links that can't be identified by filters but still are valid in the messages, said Christopher Fuhrman, a professor of software engineering in the Department of Software and IT Engineering at the University of Quebec.
Spammers do this by "munging" the HTML -- adding backslashes, taking out tags -- so that the message and its links are still readable by the rendering engines of browsers or e-mail clients, but appear as a garble of nonsense to filters. The technique is also known as obfuscation.
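To make the idea concrete, here is a small Python sketch of a deliberately strict link pattern that catches a cleanly written link but misses a munged one. The pattern, the sample markup and the hostname are invented for illustration. The munged variant assumes a forgiving renderer that accepts uppercase tags, spacing around `=`, single quotes, and backslashes in place of slashes in the address.

```python
import re

# A strict, hypothetical filter pattern: lowercase tag, double quotes,
# and a conventional http:// prefix.
STRICT_LINK = re.compile(r'<a href="(http://[^"]+)">')

clean = '<a href="http://cheap-pills.example/order">Order</a>'

# Munged variant: uppercase tag, spaces around "=", single quotes,
# and backslashes where the slashes should be.
munged = r"""<A HREF = 'http:\\cheap-pills.example\order' >Order</A>"""

print(STRICT_LINK.findall(clean))   # the strict pattern finds the link
print(STRICT_LINK.findall(munged))  # the strict pattern finds nothing
```

Each new pattern added to close one of these gaps invites another variation, which is why filter authors end up chasing the spammers.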
It's a trial-and-error process, as spammers don't read HTML Web standards. "Spammers just want to get the cash," Fuhrman said.
Tamper with the HTML too much, and the message won't render at all. Too little, and filters snare the message.
So spammers aim for a narrow gap: Most browsers and e-mail clients can render a certain amount of munged HTML, although the tolerances vary depending on the application.
Fuhrman theorizes that spammers test their messages using Microsoft's widely used Outlook program, which uses the same HTML rendering engine as its Internet Explorer (IE) browser.
So Fuhrman and one of his graduate students, Hicham El Alami, are writing a program to use IE's rendering engine as a way to "parse" messages, or extract the links.
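The researchers' program is not shown here. As a rough stand-in for the idea of letting a forgiving parser do the work, the Python sketch below uses the standard library's lenient `html.parser` in place of IE's rendering engine. The class name `LinkExtractor`, the function `extract_hosts`, the sample markup, and the step that turns backslashes into slashes are all assumptions for illustration, and a real rendering engine tolerates far more than this.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class LinkExtractor(HTMLParser):
    """Collect href values from anchor tags using a lenient parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Assumption: treat backslashes as slashes, as a
                # forgiving browser would when following the link.
                self.links.append(value.strip().replace("\\", "/"))

def extract_hosts(html_message):
    """Return the hostnames of all links found in the message."""
    parser = LinkExtractor()
    parser.feed(html_message)
    parser.close()
    return [urlparse(link).hostname for link in parser.links]

munged = r"""<A HREF = 'http:\\cheap-pills.example\order' >Order</A>"""
print(extract_hosts(munged))
```

The appeal of this approach is that the parser, not a hand-maintained pattern, decides what counts as a link, so the extracted hostnames can be handed to an ordinary blocklist lookup.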
Services such as SpamCop already do this. SpamCop -- part of IronPort Systems, a subsidiary of Cisco -- has a Web-based service that uses algorithms to parse links out of spam messages submitted by users.
Those algorithms are hard to write, although SpamCop's is pretty good, Fuhrman said. He and El Alami want an alternative way to do the same parsing without having to constantly tweak an algorithm to keep up with spammers' new tricks.