Federated learning improves how AI data is managed, thwarts data leakage

Researchers from Penn Medicine are studying new AI models using federated learning to improve how brain tumors are detected and treated.

Privacy is one of the big holdups to a world of ubiquitous, seamless data-sharing for artificial intelligence-driven learning. In an ideal world, massive quantities of data, such as medical imaging scans, could be shared openly across the globe so that machine learning algorithms can gain experience from a broad range of data sets. The more data shared, the better the outcomes.

That generally doesn't happen now, including in the medical world, where privacy is paramount. For the most part, medical image scans, such as brain MRIs, stay at the institution level for analysis. The result is then shared, but not the original patient scan data.

Researchers believe a shift in the way data is managed could allow more information to reach learning algorithms outside of a single institution, which could benefit the entire system. Penn Medicine researchers propose using a technique called federated learning that would allow users to train an algorithm across multiple decentralized data sources without having to actually exchange the data sets.

Federated learning works by training an algorithm across many decentralized edge devices, as opposed to running an analysis on data uploaded to one server.

"The more data the computational model sees, the better it learns the problem, and the better it can address the question that it was designed to answer," said Spyridon Bakas, an instructor in the Perelman School of Medicine at the University of Pennsylvania, in a press release. Bakas is lead author of a study on the use of federated learning in medicine that was published in the journal Scientific Reports. "Traditionally, machine learning has used data from a single institution, and then it became apparent that those models do not perform or generalize well on data from other institutions," Bakas said.

The Penn Medicine study focuses on the use of federated learning to design an AI system that will help clinicians better identify and treat brain tumors by sharing brain MRIs.

The problem right now, according to the researchers, is that all that useful sample data is held privately by the institution that collected it. It is analyzed locally by that institution, where a model is created. Each model can then be worked on by other institutions, but that's not ideal, because the local scenarios are all different.

A better way to do it, using federated AI, is to create a model—a brain tumor detecting model, for example—then share that model with hospitals globally. Instead of sharing data among institutions, the training model is distributed to the different data owners.

"A model trained at Penn Medicine, for example, can be distributed to hospitals around the world. Doctors can then train on top of this shared model, by inputting their own patient brain scans. Their new model will then be transferred to a centralized server. The models will eventually be reconciled into a consensus model that has gained knowledge from each of the hospitals, and is therefore clinically useful," the group explains.
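
The round trip the group describes (distribute a shared model, train it locally, send the result back, reconcile) can be sketched in a few lines of Python. This is a minimal illustration only, not the Penn Medicine system: it assumes a toy one-variable linear model, synthetic data standing in for each hospital's private records, and reconciliation by sample-weighted averaging, a common approach usually called federated averaging. The function names and parameters are hypothetical.

```python
import random

def local_update(weights, data, lr=0.2, epochs=5):
    """Train a copy of the shared model (y = w*x + b) on one site's private data."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return (w, b)

def federated_average(updates, sizes):
    """Reconcile site models into one consensus model, weighted by sample count.

    Assumption: weighted averaging is one common reconciliation rule; the
    article does not say which rule the study uses.
    """
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return (w, b)

def run_federated_rounds(site_data, rounds=200):
    """Only model weights travel between the sites and the server, never the data."""
    global_weights = (0.0, 0.0)
    sizes = [len(d) for d in site_data]
    for _ in range(rounds):
        updates = [local_update(global_weights, d) for d in site_data]
        global_weights = federated_average(updates, sizes)
    return global_weights

def make_site(rng, n, lo, hi):
    """Synthetic stand-in for one hospital's data: y = 2x + 1 plus small noise."""
    xs = [rng.uniform(lo, hi) for _ in range(n)]
    return [(x, 2.0 * x + 1.0 + rng.gauss(0, 0.05)) for x in xs]

rng = random.Random(0)
# Each hypothetical site sees a different slice of the input range.
sites = [
    make_site(rng, 40, 0.0, 1.0),
    make_site(rng, 60, 0.5, 1.5),
    make_site(rng, 30, -1.0, 0.0),
]
w, b = run_federated_rounds(sites)
print(f"consensus model: w={w:.3f}, b={b:.3f}")
```

Each simulated site sees only part of the input range, yet the consensus model ends up close to the underlying relationship (w near 2, b near 1) without any site handing over its raw records.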

Conceivably, hospitals around the world could participate if patient data is protected, privacy concerns are allayed, and lawmakers agree to it. The Penn Medicine group is in the middle of a large-scale test across institutions.

Researchers believe federated learning, also known as collaborative learning, will be the next wave of AI. (Google reportedly implemented one of the first use cases of federated learning to improve predictive keyboards.)

Federated learning could create more opportunities to use AI in healthcare, according to Rivka Colen, co-author of the Penn Medicine study and an associate professor of radiology at the University of Pittsburgh School of Medicine. "I think it's a huge game changer," Colen said in the press release. "AI will revolutionize this field, because, right now, as a radiologist, most of what we do is descriptive. With deep learning, we're able to extract information that is hidden in this layer of digitized images."

The idea of sharing a common model, rather than individual data, could lend itself to other applications, such as IoT. Cornell University, for example, proposed a federated learning IoT framework for a cloud-edge architecture in a paper it published recently.

Copyright © 2020 IDG Communications, Inc.
