Machine-learning, social media data help spot flooded areas

A new machine-learning algorithm, combined with social media posts and remote-sensing data, has successfully identified flooded areas faster than satellite imagery alone.

Twitter and Flickr, along with remote sensor data, can be used to identify flooded areas, a team of university researchers says.

The approach is faster than using publicly available satellite images on their own, which can sometimes take days to become available, the researchers say. It also makes flooded streets easier to identify.

Algorithms are the key to making all the data work together, the scientists say. A computer can learn, for example, what is and what isn't water in a flood.

It does so by analyzing publicly posted images and thousands of public tweets and posts generated during urban flooding incidents. Satellite analysis, the older method, becomes secondary.


As an experiment, the team of scientists from Penn State, the University of Wisconsin, and other groups analyzed 2013 flooding in Colorado and found 150,000 tweets from people affected.

They then processed those tweets with an existing tool called CarbonScanner and found "clusters of posts," they say in their press release on Penn State's website.

CarbonScanner analyzes tweet hashtags and maps their locations. Clusters of posts implied damage.
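The clustering the article describes can be sketched as bucketing geotagged, flood-related posts into map cells and flagging the dense ones. This is an illustrative sketch only, not CarbonScanner's actual code; the hashtags, field names, and thresholds are assumptions.

```python
from collections import defaultdict

# Assumed example hashtags; the tool matches flood-related tags like these.
FLOOD_TAGS = {"#boulderflood", "#coflood"}

def cluster_posts(posts, cell_deg=0.01):
    """Bucket geotagged posts into roughly 1-km grid cells and
    return cells dense enough to suggest a damage hotspot."""
    grid = defaultdict(list)
    for post in posts:
        if FLOOD_TAGS & set(post["tags"]):
            key = (round(post["lat"] / cell_deg),
                   round(post["lon"] / cell_deg))
            grid[key].append(post)
    # Cells with several matching posts imply concentrated flood reports.
    return {cell: items for cell, items in grid.items() if len(items) >= 3}
```

A real system would use proper spatial clustering (e.g., DBSCAN) and the platforms' geo APIs, but a grid count captures the idea: many flood-tagged posts from one place implies damage there.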


The team then looked at over 22,000 images from around the area with another tool, this one developed by the team itself. It uses a "machine learning algorithm that automatically analyzes several thousand images," the website says.

"It allowed them to quickly identify individual pixels in images that contained water," the report continues.


The raw imagery that they used was "obtained through satellites, Twitter, Flickr, the Civil Air Patrol, unmanned aerial vehicles and other sources," they say.

The computer successfully figured out where there was water.


"We looked at a set of images and manually selected areas that we knew had water and areas that had no water" in writing the algorithm, says Elena Sava, one of the graduate students.

"Then, we fed that information to the algorithm we had developed, and it allowed the computer to 'learn' what was and wasn't water," she added.
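The train-on-labeled-pixels idea Sava describes can be sketched as a nearest-centroid color classifier: compute one average color per class from the hand-labeled samples, then assign each new pixel to the closer class. This is an assumed, minimal illustration of the concept, not the team's algorithm.

```python
def centroid(pixels):
    """Mean RGB color of a list of (r, g, b) tuples."""
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))

def train(water_pixels, land_pixels):
    """Learn one color centroid per class from manually labeled samples,
    mirroring the 'areas we knew had water / had no water' step."""
    return {"water": centroid(water_pixels), "land": centroid(land_pixels)}

def classify(model, pixel):
    """Assign a pixel to the class with the nearest centroid
    (squared Euclidean distance in RGB space)."""
    def dist(c):
        return sum((pixel[i] - c[i]) ** 2 for i in range(3))
    return min(model, key=lambda label: dist(model[label]))
```

Running `classify` over every pixel of an image yields the per-pixel water mask the report mentions; a production system would use richer features and a stronger model, but the learning loop is the same.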


The names of rivers and streets in the tweets, along with remarks about how individual tweeters couldn't get home, and so on, were giveaways of flooding in the social media data.

That was combined with the patches of water discovered with the machine-learning algorithm. The result was better than satellite data alone, the researchers think.
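The combination step can be sketched as a simple fusion score per map cell: one signal from flood-related tweets, one from the fraction of pixels the classifier marked as water. The field names, weights, and saturation threshold here are assumptions for illustration, not the researchers' method.

```python
def flood_score(cell):
    """Fuse social-media cues with image evidence for one map cell.
    Returns a score in [0, 1]."""
    # Tweet signal saturates at 10 flood-related tweets in the cell.
    tweet_signal = min(cell["flood_tweets"] / 10, 1.0)
    # Fraction of the cell's pixels classified as water (0..1).
    water_signal = cell["water_pixel_fraction"]
    # Equal weighting: either source alone can raise the score.
    return 0.5 * tweet_signal + 0.5 * water_signal
```

This is why the fusion beats imagery alone: a cell where satellite pictures show little water but tweets report flooding still scores high, as in the downtown Boulder case the researchers describe.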

Satellite imagery didn't show floods

And the improvement wasn't just a matter of timeliness.

"If you look at satellite imagery, downtown Boulder showed very little flooding," said one of the professors quoted on the website.

"However, by analyzing Flickr and Twitter data, we could find several cues that many areas were underwater," he says. It was the combination of sources that produced the results.

Weather in 2013 produced 17 inches of rain over nine days in parts of Boulder—almost a year's worth, the press release says.

Storm Jonas

The Penn State-led studies may be just the beginning of the use of everyday Internet data, such as the Twitter posts they captured, in future real-time analysis of disasters.

Interestingly, during monster snowstorm Jonas, while most Internet traffic remained the same, FaceTime traffic last Saturday was double what it was on the previous, non-storm weekend, according to Sandvine, an Internet traffic analyst.

There's intelligence in that traffic spike alone.


Copyright © 2016 IDG Communications, Inc.
