Hadoop analysis now tackling IoT to improve transit

For the Internet of Things, collecting data is only half the battle, and public transit agencies know that as well as anyone.

Operators of bus, train and trolley systems collect a wide range of data from ticket machines, fare validation devices, GPS (Global Positioning System) and other sources. But that data typically stays in silos, according to Wade Rosado, director of analytics for Urban Insights, a consultancy launched on Monday to help make use of transit data.

Urban Insights is using a cloud-based Apache Hadoop system to crunch numbers from transit agencies that want to glean new insights about their operations. The company, a wholly owned subsidiary of transit technology vendor Cubic Transportation Systems, can provide both periodic consulting services, such as producing annual reviews, and ongoing measurements as frequently as every day, Rosado said.

Transit is a classic example of the kind of business that may be transformed by IoT. It has a lot of moving parts, serves constantly changing rider needs and faces demands for efficiency. Transit also is built around specialized infrastructure that stays in place for years or even decades, something it holds in common with power transmission, logistics and other IoT hotspots.

There’s a lot of useful data generated in a transit system’s day, Rosado said. Buses may check in with a GPS location every 30 seconds, sensors on doors can count how many riders get on and off, and payment systems know about ticket purchases and validations.

“They do absolutely collect this data, and they use it within the constraints of the system that was designed to report on it, but where the weakness exists is in integrating those data sources to paint a more complete picture of what’s actually happening,” Rosado said.

For example, Urban Insights is helping the San Diego Metropolitan Transit System (MTS) figure out whether its routes and schedules match most riders’ needs. Existing systems collected data only about individual trips on buses or trolleys, so Urban Insights combined data from five separate sources to understand riders’ overall journeys, including transfers, and whether service was aligned with real demand, he said.
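One way to picture the step from individual trips to overall journeys is to chain a rider's consecutive boardings whenever they fall within a transfer window. The sketch below is illustrative only and is not Urban Insights' method: the `Boarding` record, its field names, the sample data and the 30-minute window are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from itertools import groupby

# Assumption: boardings less than 30 minutes apart count as a transfer.
TRANSFER_WINDOW = timedelta(minutes=30)


@dataclass
class Boarding:
    """Hypothetical record of one fare validation (one trip)."""
    card_id: str
    route: str
    boarded_at: datetime


def link_journeys(boardings, window=TRANSFER_WINDOW):
    """Group each card's boardings into journeys.

    A new journey starts whenever the gap since the card's previous
    boarding exceeds the transfer window.
    """
    journeys = []
    ordered = sorted(boardings, key=lambda b: (b.card_id, b.boarded_at))
    for _, card_boardings in groupby(ordered, key=lambda b: b.card_id):
        current = []
        for b in card_boardings:
            if current and b.boarded_at - current[-1].boarded_at > window:
                journeys.append(current)
                current = []
            current.append(b)
        if current:
            journeys.append(current)
    return journeys


# Made-up sample data for illustration.
sample = [
    Boarding("A", "Bus 7", datetime(2015, 1, 5, 8, 0)),
    Boarding("A", "Trolley", datetime(2015, 1, 5, 8, 20)),
    Boarding("A", "Bus 7", datetime(2015, 1, 5, 17, 0)),
    Boarding("B", "Bus 10", datetime(2015, 1, 5, 9, 0)),
]
journeys = link_journeys(sample)
routes_per_journey = [[b.route for b in j] for j in journeys]
```

With this sample, card A's morning bus and trolley boardings form a single journey with one transfer, while the evening boarding starts a new one. A production system would also need alighting data, shared transfer stops and fare rules, which this sketch ignores.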

Urban Insights creates custom data models, takes in raw data from customers’ systems and reformats it, and then integrates it with data from other sources. “Hadoop lends itself very well to that, because it’s able to do complex operations on very large data sets,” Rosado said.

Urban Insights, based in Washington, D.C., is running its service on a combination of its own servers and virtual private cloud resources through services like Microsoft Azure, Rosado said. That lets the consultancy burst to greater capacity when needed, he said.

The transit industry is moving toward greater integration, but that’s easiest for large cities, Rosado said. For example, Cubic Transportation Systems built a platform for the Chicago Transit Authority that combines CRM (customer relationship management), financial settlements, fare payments, cash management and other functions. But smaller agencies are often limited to off-the-shelf products that don’t bring as much data together in one place, he said.

There are also emerging standards that span different types of data, such as GTFS (General Transit Feed Specification), a common format for public transit schedules and location information that can be used with the Google Transit planning tool. But it’s still early days, Rosado said.
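To give a sense of what a GTFS feed looks like, here is a minimal sketch that reads stop locations from a `stops.txt` file, one of the comma-separated text files that make up a feed. The column names `stop_id`, `stop_name`, `stop_lat` and `stop_lon` come from the GTFS specification; the two sample rows and the `load_stops` helper are made up for illustration.

```python
import csv
import io

# Made-up sample in the shape of a GTFS stops.txt file.
SAMPLE_STOPS_TXT = """stop_id,stop_name,stop_lat,stop_lon
S1,Example Station,32.7157,-117.1611
S2,Example Plaza,32.7200,-117.1500
"""


def load_stops(text):
    """Return {stop_id: (stop_name, latitude, longitude)} from stops.txt text."""
    reader = csv.DictReader(io.StringIO(text))
    return {
        row["stop_id"]: (
            row["stop_name"],
            float(row["stop_lat"]),
            float(row["stop_lon"]),
        )
        for row in reader
    }


stops = load_stops(SAMPLE_STOPS_TXT)
```

Because every agency that publishes GTFS uses the same column names, the same few lines can read stop locations from any feed, which is the kind of cross-agency consistency the standard is meant to provide.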
