CAMBRIDGE, MASS. – Blog spammers have found ways to automatically insert their unwanted messages into online conversations, and the few tools available to block them lag woefully behind.
“How far ahead [of us] are the spammers? Who knows,” said Jessica Baumgart, an affiliate with Harvard University’s Berkman Center for Internet and Society, who gave a presentation on blog spam at the MIT Spam Conference 2007 held last Friday.
“Any time we try to block them out, they find a way to get in. We’ll do something and five minutes later they’re back. It’s like playing chess.”
According to Baumgart, who has been involved with Harvard’s blogging initiative for seven years and manages tens of blogs on seven different platforms, there are three main ways spammers get their messages into blogs:
* Comment spam: Spammers are paid to surf the Web in search of blogs to manually type comments into, or write scripts to automatically enter the text. These can be hard to distinguish from legitimate entries, Baumgart said, except they’re often off the topic of the blog and include a link to a Web site.
* Trackback spam: Spammers develop scripts that use trackback links to place spam on blogs. A blog’s trackback feature lets readers automatically notify a site that they have linked to its pages. Trackback spam consists of links to random Web sites, many of which “are things you don’t necessarily want to see” as the blog host or participant, Baumgart said.
* Spam blogs, or splogs: Spammers take advantage of services such as Blogspot to set up free blogs that exist only to point visitors to Web sites. Not only are these sites annoying to visitors looking for legitimate information on a topic, Baumgart said, but they also pollute the results of search engines that index the sites.
There are some tools available to help blog hosts combat this unwanted, unrelated input. Some blog platforms include administration tools to block certain IP addresses from adding comments, although Baumgart added that spammers tend to use a range of IP addresses, so blocking them one by one can become infeasible. There’s also the no-follow link option, an attribute that can be added to a link in a blog’s HTML that tells search engines indexing the blog not to consider the link legitimate, she said.
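To make those two defenses concrete, here is a minimal Python sketch. It is an illustration only, not code from any blog platform Baumgart mentioned: the function names `is_blocked` and `add_nofollow` are hypothetical, and the blocked networks are reserved documentation ranges rather than real spammer addresses. It checks a commenter's address against whole address ranges instead of single IPs, and adds `rel="nofollow"` to links in a submitted comment.

```python
import ipaddress
import re

# Hypothetical blocklist. These are reserved documentation ranges,
# used here only as stand-ins for ranges an administrator might block.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(ip: str) -> bool:
    """Return True if the address falls inside any blocked range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BLOCKED_NETWORKS)


def add_nofollow(comment_html: str) -> str:
    """Add rel="nofollow" to every <a> tag that has no rel attribute.

    A regex is enough for a sketch; a real platform would more likely
    use an HTML parser and sanitize the comment at the same time.
    """
    def tag(match: re.Match) -> str:
        opening = match.group(0)
        # Leave tags that already carry a rel attribute untouched.
        if re.search(r"\srel\s*=", opening, flags=re.IGNORECASE):
            return opening
        # Drop the closing ">" and re-add it after the new attribute.
        return opening[:-1] + ' rel="nofollow">'

    return re.sub(r"<a\b[^>]*>", tag, comment_html, flags=re.IGNORECASE)


if __name__ == "__main__":
    print(is_blocked("192.0.2.55"))
    print(add_nofollow('Nice post! <a href="http://example.com">click</a>'))
```

Blocking by range addresses the one-by-one problem Baumgart described, but only for ranges the administrator already knows about; a spammer who moves to a new range gets through until the list is updated.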
Image recognition tests, often used on Web sites to verify that a visitor is a human and not a computer by asking them to type in a word embedded in an image, would help against automated scripts. But because some comment spam is entered by humans, those spammers could easily pass the test, Baumgart said.
Blog administrator tools today aren’t enough ammunition to fight off the mounting spam problem, leaving blog hosts helpless. One organization that Baumgart didn’t want to name has become so inundated with blog spam that it plans to pull the plug on its blog server and start over, losing all of the legitimate entries along with the spam. “The best thing to do is to shut down the server and just get rid of it,” she said.
A pair of executives with Six Apart, which develops the blog platforms TypePad and Movable Type, also spoke at the conference but painted a less dire picture.
Blog spam is indeed a problem, according to Six Apart CTO Aaron Emigh, but it isn’t a completely new problem.
“Blogging has become a very mainstream publishing platform. If you want to reach a lot of people online, leaving [information] on a blog is a very good way to do it,” he said. “But we are sufficiently ahead of [blog] spammers thanks to all the work that’s been done” to fight e-mail spam.