A peek behind the curtain at Google’s best unkept secret

How the wizards of Mountain View create magical maps with hypercrowdsourcing.

What looked like forty days and forty nights of rain in the Northeast a couple of weeks ago put me in a situation that piqued my interest in Google Maps, specifically how they get those red, green and yellow (not brick) traffic indicators to display on their road maps. I looked into it a bit and discovered hypercrowdsourcing.

While the nation was focused on the rise of the Red River in North Dakota, Boston-area rivers were also experiencing record high waters. It wasn't to the point of requiring a million sandbags, but we did have many roads flooded out (to say nothing of basements), which caused some crazy traffic snarls for several days. On the first night of road closures, it took me 50 minutes to make the 6-minute trip across Lincoln, MA, the tiny rural village in which I reside. Stop-and-going through Lincoln, I glanced out of boredom at Google Maps on my iPhone, and damned if the roads through town didn't come up red. How the heck did they do it?

In my last posting on crowdsourcing, I mentioned Tim O'Reilly's comments at the Open Source Business Conference. He talked about Google, quoting one of their execs as having said something like, "We don't have better algorithms, we just have more data." Google Maps' traffic display feature is a great example.

How Google gets traffic data has come up in casual conversation before, yet not one of my tech-savvy friends has ever come back with the answer. It seemed to me a mystery that has hung out there since the service started. Where do they get the data?

OK, here goes...they get it from us. Hundreds of other folks and I, crawling through Lincoln, were unknowingly broadcasting our slowly changing GPS coordinates back to the Googleplex. They were aggregating the data into a picture of what was happening on Concord Road and broadcasting it back as a red line wandering across our smarter-than-ever phones. Wow!
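
For the technically curious, here is a minimal Python sketch of how that kind of aggregation might work. It is an illustration only, not Google's actual pipeline: the function name, the color thresholds, the minimum report count, the segment names, and the free-flow speeds are all assumptions made up for this example.

```python
from collections import defaultdict
from statistics import median


def traffic_colors(reports, free_flow_kmh, min_reports=3):
    """Turn anonymous speed reports into a color per road segment.

    reports: iterable of (segment_id, speed_kmh) pairs with no device IDs.
    free_flow_kmh: dict of segment_id -> typical uncongested speed (assumed known).
    min_reports: hypothetical cutoff; segments with fewer reports are skipped.
    """
    # Group the reported speeds by road segment.
    speeds = defaultdict(list)
    for segment_id, speed_kmh in reports:
        speeds[segment_id].append(speed_kmh)

    colors = {}
    for segment_id, values in speeds.items():
        if len(values) < min_reports:
            continue  # too few phones reporting to say anything useful

        # Compare the typical observed speed with the free-flow speed.
        ratio = median(values) / free_flow_kmh[segment_id]
        if ratio < 0.4:        # assumed threshold for heavy congestion
            colors[segment_id] = "red"
        elif ratio < 0.75:     # assumed threshold for slow traffic
            colors[segment_id] = "yellow"
        else:
            colors[segment_id] = "green"
    return colors


# Made-up reports: a crawl on one road, normal speeds on another,
# and a lone phone on a third.
reports = [
    ("concord_rd", 5), ("concord_rd", 8), ("concord_rd", 4),
    ("route_2", 88), ("route_2", 95), ("route_2", 90),
    ("side_st", 30),
]
free_flow = {"concord_rd": 50, "route_2": 90, "side_st": 40}

print(traffic_colors(reports, free_flow))
```

The median is used here so that one stopped delivery truck or one speeder doesn't swing the result, and the minimum report count keeps a single phone from painting a whole road red. Both are design guesses, not confirmed details.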

I'm christening this hypercrowdsourcing because of the sheer number of participants contributing to the mass of data. Key to hypercrowdsourcing is that the crowd be able to provide the data easily. In the case of traffic data, you'd never get enough people to proactively report on a reliable basis for nothing in return. But make it literally a no-brainer, as Google has, and you've got the perfect data source.

There are, of course, privacy concerns. Google explains that all of the data are aggregated and anonymous, so there's nothing to worry about. Still, you can opt out, or never opt in, if you're concerned.

This concept could be a harbinger of things to come, but I have to say, I'm scratching my head a bit to come up with other automated ways in which a company can collect useful data from people going about their daily business. Maybe that's why I don't work at Google. Do you have any cool ideas about how to enlist the Munchkin-like masses to provide the data for a valuable service? (If you have a really good idea, maybe you'd better just email it to me.)

Copyright © 2010 IDG Communications, Inc.
