Google's new Flu Trends tool, which collects and analyzes search queries to predict flu outbreaks around the country, is raising concerns among privacy groups.
The Electronic Privacy Information Center filed a Freedom of Information Act (FOIA) request asking federal officials to disclose how much user search data the company has recently transmitted to the Centers for Disease Control and Prevention, or CDC, as part of its Google Flu Trends effort.
The concern stems from what privacy groups claim is a disturbing lack of transparency surrounding the method Google is using to predict flu outbreaks. Google has publicly stated that all of the data used is anonymized and aggregated, but there has been no independent verification of how search queries are used and transformed into data for Google Flu Trends, the privacy groups say.
"What we are basically saying is that if Google has found a way to ensure that aggregate search data cannot be used to re-identify the people who provided the search information, they should be transparent about that technique," said Marc Rotenberg, the Electronic Privacy Information Center's president.
Rotenberg said the issue is important because the same techniques Google is using to predict flu outbreaks could be applied to tracking other diseases, including ones such as SARS, for which the urgency of containment could be far greater. "Let's say we have a spike in Detroit of SARS and the police say we want to know who in Detroit submitted those searches. How can Google ensure that this can't be done? The burden is on Google," Rotenberg said.
Google Flu Trends was publicly disclosed in November and has been described by the company as a Web tool to help individuals and health care professionals obtain influenza-related activity estimates for all U.S. states up to two weeks faster than traditional government disease surveillance systems.
Google said in a blog post introducing Flu Trends last month that search queries such as "flu symptoms" tend to be very common during flu season each year. A comparison of the number of such queries with the actual number of people reporting flu-like symptoms shows a very close relationship, it said. As a result, tallying each day's flu-related searches in a particular geography allows the company to estimate how many people have a flu-like illness in that region.
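The approach described above, counting flu-related queries per day and region and comparing the counts with reported illness, can be illustrated with a minimal Python sketch. It is not Google's actual method, which the company has not published in detail; the term list, the log format and the function names are all hypothetical, chosen only to show the general idea.

```python
from collections import Counter
from math import sqrt

# Hypothetical list of flu-related search terms (an assumption for illustration).
FLU_TERMS = ("flu symptoms", "influenza")


def tally_flu_queries(log):
    """Count flu-related queries per (day, region).

    `log` is an iterable of (day, region, query) tuples. The assumed format
    carries no user identifiers, only a date, a coarse region and the query.
    """
    counts = Counter()
    for day, region, query in log:
        if any(term in query.lower() for term in FLU_TERMS):
            counts[(day, region)] += 1
    return counts


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)
```

In this sketch, a correlation close to 1 between the daily query counts for a region and the number of reported cases there would correspond to the "very close relationship" the company describes.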
In making the announcement, Google noted that it had shared results from Flu Trends with the Epidemiology and Prevention Branch of the Influenza Division at CDC during the last flu season and noticed a strong correlation between its own estimates and CDC's surveillance data based on actual reported cases. Google said that by making flu estimates available each day, Google Flu Trends could provide epidemiologists with an early-warning system for flu outbreaks.
Rotenberg said the service was potentially useful, but much depended on the kind of search data that Google is collecting and analyzing to make its predictions. Google has said that the database it uses for Flu Trends retains no identity information, IP addresses or any physical user locations. However, it is not clear whether the company is completely deleting IP addresses and, if so, when. Another issue, he said, is whether Google is merely anonymizing IP addresses by redacting some of the numbers in an IP string.
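To show what redacting part of an IP string can mean in practice, here is a minimal Python sketch that zeroes the trailing bits of an IPv4 address. It is an assumption made for illustration, not a description of Google's procedure; the function name and the choice to keep 24 bits are hypothetical.

```python
import ipaddress


def redact_ipv4(addr, keep_bits=24):
    """Zero out the low-order bits of an IPv4 address.

    `keep_bits` is a hypothetical parameter: the number of leading bits
    retained. With 24 bits kept, the final octet is replaced by 0.
    """
    network = ipaddress.ip_network(f"{addr}/{keep_bits}", strict=False)
    return str(network.network_address)
```

Under these assumptions, `redact_ipv4("192.0.2.57")` returns `"192.0.2.0"`. Only 256 addresses share that redacted value, which is why partial redaction alone may not prevent re-identification when combined with other data.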