Network World
Thursday, July 24, 2008

Buzzblog

Why can't Digg's algorithm do something about rampant teenage sex?

If you don't care about the inner workings of social bookmarking/news aggregation sites - Digg, in particular - I'd suggest that you move along; nothing to see here.

However, the Diggers among you - and I'm told that's 1 in 5 of NetworkWorld.com readers - are aware that recent changes in the Digg submission-ranking algorithm have caused great consternation among long-time Diggers, who believe they are being unfairly punished for having gotten good at Digg's game, a game that has attracted multimillion-dollar acquisition interest from the likes of Google and Microsoft.

Not my fight, but it seems to me there is another problem - one directly relevant to the bigger Digg brouhaha - that Digg software engineers might address to make that site more useful to readers and more fair to these already-irritated regular content submitters: They should do something about all these sexually transmitted diseases among teenage girls.

As I type (as opposed to when you'll read), here are five of the nine top "Hot in Health" stories as ranked on Digg's page devoted to health issues ... and, yes, they're all duplicate versions of the exact same news story:

1 in 4 Teenage Girls Has a Sexually Transmitted Disease

Study: 1 in 4 teen girls has an STD

1 in 4 teen girls has sexual disease

STDs rife among US teenage girls

Tainted Love: 1 in 4 teenage girls in USA has STD

Now, I know enough about the Digg demographic to appreciate the fact that any combination of the words "teenagers" and "sex" is a sure-fire draw, but Digg's rewarding of this obsession seems to be a bug, not a feature. The reason that this should matter to Diggers is that the "Hot in ..." lists for each subject area are a key means of exposure needed to make any given story "popular" - in other words, to have it hit Digg's coveted front page. If these "Hot in ..." lists are jammed with multiple copies of a single story, the odds of any particular submission making it all the way upstream to the site's front page become even longer.

And, while I'm not a software engineer, it would seem that two factors about Digg might indicate that this single-topic dominance is fixable: 1) every submission is time-stamped, so giving preference to first and/or earlier entries should be easy, and 2) the site already has an elaborate duplicate-determination system, which, while largely ignored by submitters, seemingly could be used to impose just a measure of discipline in the interest of providing better "Hot in ..." lists.

In other words, some combination of the two would seem capable of staunching this epidemic of teenage sex.
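For the curious, here is a minimal sketch of how those two factors might be combined. It is not Digg's actual code, which I have not seen. It assumes each submission already carries a label from some duplicate detector (the hypothetical `story_key` field) and a submission time (the hypothetical `submitted_at` field), and it keeps only the earliest submission for each story while preserving the order in which stories first appear in the ranking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    title: str
    story_key: str     # hypothetical: label assigned by a duplicate detector
    submitted_at: int  # hypothetical: submission time, seconds since epoch


def collapse_duplicates(ranked):
    """Keep one entry per story: the earliest-submitted version.

    `ranked` is a list of Submission objects in their current ranking order.
    Stories come back in the order each one first appeared in that ranking.
    """
    earliest = {}
    for sub in ranked:
        best = earliest.get(sub.story_key)
        if best is None or sub.submitted_at < best.submitted_at:
            # Reassigning an existing key keeps its original position,
            # because dicts preserve insertion order.
            earliest[sub.story_key] = sub
    return list(earliest.values())
```

Fed a "Hot in Health" list like the one above, this would return a single entry for the STD study (whichever version was submitted first) and leave room for eight other stories.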

Welcome regulars and passersby. Here are a few more recent Buzzblog items. And, if you'd like to receive Buzzblog via e-mail newsletter, here's where to sign up.

The 1 in 10 Brits hurt 'texting while walking' will now find comfort in padded lampposts

Just using Facebook gets this guy dragged into Wikileaks case.

In defense of Caller-ID spoofing.

Google says EFF's barking up an empty tree.

Stallman on handing over GNU Emacs, its future and the importance of nomenclature.

Call "retail renting" what it is: short-term theft.

Google renames the Persian Gulf.

Get $500 just for going on a job interview. (No, really.)

Top 10 Buzzblog posts for '07: Verizon's there, of course, along with Gates, Wikipedia and the guy who lost a girlfriend to Blackberry's blackout.

8 can't-miss tech predictions ... for 1998

You are a moron

This does not prove that Digg is obsessed with teenage sex. Click that link again and you'll see nothing to do with teenage sex. At best you could say that the algorithm is a little off. But I think your headline might be a little misleading or attention-grabbing.

As for your theories on the algorithm, just because something is timestamped first doesn't mean it should automatically make it to the front page. It could be that the first one digged was actually a rip-off of another article that just happened to be digged a few minutes later. Or it could be that another article gets more diggs than the first one on the topic.

If you have total duplicate detection, you won't be able to see trends in the news. Take the current elections section. Every story is something about Obama and Hillary. A very tight duplicate detector would probably reject about 50% of submissions. But ultimately, the best quality ones are the ones that make it to the front page.

Moral of the story is that algorithms are difficult. You have trade-offs. Can't keep everyone happy all of the time.
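The trade-off described above can be made concrete with a small sketch. It assumes a deliberately crude duplicate test, word overlap between headlines (Jaccard similarity), which is an illustration only and not how Digg's detector is known to work. The function names and the `threshold` parameter are hypothetical.

```python
import re


def title_words(title):
    """Lowercase a headline and split it into a set of words and numbers."""
    return set(re.findall(r"[a-z0-9]+", title.lower()))


def title_similarity(a, b):
    """Jaccard similarity: shared words divided by all distinct words."""
    words_a, words_b = title_words(a), title_words(b)
    union = words_a | words_b
    if not union:
        return 0.0
    return len(words_a & words_b) / len(union)


def is_duplicate(a, b, threshold):
    """Treat two headlines as duplicates when similarity reaches the threshold."""
    return title_similarity(a, b) >= threshold
```

With a threshold of 0.5, "Study: 1 in 4 teen girls has an STD" and "1 in 4 teen girls has sexual disease" count as duplicates, but "STDs rife among US teenage girls" slips through because it shares only one word with either. Lowering the threshold far enough to catch it would also start merging unrelated stories that happen to share a word or two, which is the kind of over-rejection the election example warns about.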

About Buzzblog

When not blogging, I am a Network World news editor and write the 'Net Buzz column.
