Algorithm predicts Twitter 'trending' topics hours in advance

MIT professor, student say algo may also have weightier applications


Twitter's "trending" feature is a guilty pleasure of mine, I will confess.

It's also useful to me professionally in terms of helping to identify the absolute hottest of today's hot topics (news, gossip and hashtag nonsense), some of which can be shoe-horned into a technology blog when the blogger is so inclined. So the idea that an MIT professor and one of his students have devised an algorithm they say is capable of predicting which topics will "trend" on Twitter grabs my attention on both a personal and a professional level.

From an MIT press release:

Twitter's home page features a regularly updated list of topics that are "trending," meaning that tweets about them have suddenly exploded in volume. A position on the list is highly coveted as a source of free publicity, but the selection of topics is automatic, based on a proprietary algorithm that factors in both the number of tweets and recent increases in that number.

At the Interdisciplinary Workshop on Information and Decision in Social Networks at MIT in November, Associate Professor Devavrat Shah and his student, Stanislav Nikolov, will present a new algorithm that can, with 95 percent accuracy, predict which topics will trend an average of an hour and a half before Twitter's algorithm puts them on the list - and sometimes as much as four or five hours before.

The algorithm could be of great interest to Twitter, which could charge a premium for ads linked to popular topics, but it also represents a new approach to statistical analysis that could, in theory, apply to any quantity that varies over time: the duration of a bus ride, ticket sales for films, maybe even stock prices.

Like all machine-learning algorithms, Shah and Nikolov's needs to be "trained": it combs through data in a sample set - in this case, data about topics that previously did and did not trend - and tries to find meaningful patterns. What distinguishes it is that it's nonparametric, meaning that it makes no assumptions about the shape of patterns.
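
To make "nonparametric" a bit more concrete, here is a minimal Python sketch of the general idea: rather than fitting a curve of some assumed shape, a new topic's tweet counts are compared directly against stored examples that did and did not trend, and the closer examples get a bigger vote. This is an illustration only and should not be read as Shah and Nikolov's actual method; the function names, the `gamma` parameter and the tweet counts are all hypothetical.

```python
import math


def distance(a, b):
    """Squared Euclidean distance between two equal-length series."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict_trending(observed, trended, not_trended, gamma=1.0):
    """Return True if `observed` looks more like the trended examples.

    Each stored example votes with a weight that shrinks as its
    distance from the observed series grows. `gamma` (hypothetical)
    controls how quickly that weight falls off.
    """
    vote_yes = sum(math.exp(-gamma * distance(observed, ex)) for ex in trended)
    vote_no = sum(math.exp(-gamma * distance(observed, ex)) for ex in not_trended)
    return vote_yes > vote_no


# Made-up tweet counts per time step, for illustration only.
trended = [[1, 2, 4, 8], [2, 3, 6, 12]]
not_trended = [[3, 3, 3, 3], [5, 4, 5, 4]]

print(predict_trending([1, 2, 5, 9], trended, not_trended))  # rising series
print(predict_trending([4, 4, 4, 4], trended, not_trended))  # flat series
```

Nothing in the sketch says what a trending curve should look like; the labeled examples carry all of that information, which is the sense in which no assumption is made about the shape of the pattern.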

You can read more of the details here:

I'd like to see this thing prognosticate in live action.

After all, even without an algorithm, I can predict with 100 percent accuracy (or close to it) that the death of a Whitney Houston or Michael Jackson will "trend" on Twitter; same for natural disasters, presidential debates and must-see televised sporting events.

But would this MIT algo also have anticipated Chuck Tanner's appearance on the list? I mean, you have to be a fairly serious baseball fan to know that the former manager's death in February 2011 would move a whole lot of Pittsburgh Pirates fans to tweet their sympathies. (Yet when I saw the man's name "trending" on Twitter, I knew, unfortunately, that it was not good news.)

And will this algorithm be able to sniff out, hours in advance, what may be the most notorious type of "trending" topic on Twitter: the fake celebrity death?

If the MIT duo and their algorithm can somehow help stop that annoyance, they will have done the Twitter community - and celebrities - a significant service.


Copyright © 2012 IDG Communications, Inc.
