Federated Learning for personalized healthcare prediction models in oncology
Challenge: Between 2009 and 2013, our Head of Data Science led the AI development and implementation of the euroCat project, a groundbreaking solution for personalized treatment methodology in radiation oncology. The project involved five hospitals across three EU countries and used the data of their cancer patients with the aim of optimizing personalized treatments. The goal was to develop a Machine Learning model capable of learning from disparate data sources without centralizing patient data, thereby preserving patient privacy. You can read more about euroCat in the academic paper.
Solution: The solution leveraged Federated Learning (now the more common name for distributed learning), a technique that enables multiple parties to collaborate on a Machine Learning model by exchanging only model parameters during the learning process, without sharing sensitive data and without centralizing it.
In this approach, each hospital trains part of the model at each iteration using its local data and then shares only selected model parameters with a central server. The central server aggregates the parameters from all hospitals to update the global model without accessing the local data. Learning continues through iteration cycles until the model's convergence criteria are met. The role of the “central server” can be played by any of the hospitals on the grid. Georgi Nalbantov’s AI team used Federated Learning with the Alternating Direction Method of Multipliers (ADMM) to learn Support Vector Machine (SVM) models from disparate databases to predict treatment outcomes. An outcome can be either a direct treatment effect or a treatment side effect, for example shortness of breath after (lung) radiotherapy.
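The iteration cycle described above can be illustrated with a minimal sketch of consensus ADMM. This is not the euroCat implementation: to keep the hospital-side step in closed form, it swaps the SVM hinge loss for a squared loss (a regularized least-squares classifier), and the data, the function names (`local_update`, `federated_admm`) and the parameters `lam`, `rho` and `n_iter` are all hypothetical choices made for illustration.

```python
import numpy as np

def local_update(X, y, z, u, rho):
    """Hospital-side step: touches only the local data X, y.
    Assumption: squared loss instead of the SVM hinge loss,
    so the local subproblem has a closed-form solution."""
    d = X.shape[1]
    A = X.T @ X + rho * np.eye(d)
    b = X.T @ y + rho * (z - u)
    return np.linalg.solve(A, b)

def federated_admm(sites, lam=1.0, rho=50.0, n_iter=200):
    """Consensus ADMM: the server only ever sees parameter vectors."""
    d = sites[0][0].shape[1]
    K = len(sites)
    z = np.zeros(d)                      # global (consensus) model
    u = [np.zeros(d) for _ in sites]     # scaled dual variable per site
    for _ in range(n_iter):
        # 1) each site updates its local parameters on its own data
        w = [local_update(X, y, z, u_k, rho)
             for (X, y), u_k in zip(sites, u)]
        # 2) server aggregates the parameters (ridge penalty lam on z)
        z = rho * sum(w_k + u_k for w_k, u_k in zip(w, u)) / (lam + K * rho)
        # 3) each site updates its dual variable
        u = [u_k + w_k - z for w_k, u_k in zip(w, u)]
    return z

# Synthetic stand-in for five hospitals (no real patient data).
rng = np.random.default_rng(0)
true_w = np.array([1.5, -2.0, 0.5, 1.0])
sites = []
for _ in range(5):
    X = rng.normal(size=(60, 4))
    y = np.sign(X @ true_w + 0.3 * rng.normal(size=60))  # labels in {-1, +1}
    sites.append((X, y))

w_fed = federated_admm(sites)

# Centralized reference: the same objective solved on the pooled data.
X_all = np.vstack([X for X, _ in sites])
y_all = np.concatenate([y for _, y in sites])
w_central = np.linalg.solve(X_all.T @ X_all + 1.0 * np.eye(4), X_all.T @ y_all)

print("max abs difference:", np.max(np.abs(w_fed - w_central)))
```

In this toy setting the federated and centralized parameter vectors agree up to numerical tolerance, because both procedures minimize the same objective. With a hinge loss, the local step would need an iterative solver rather than a single linear solve.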
The performance of the SVM models was evaluated by the Area Under the Curve (AUC) in a five-fold cross-validation procedure, training at four sites and validating at the fifth. The performance of the federated learning algorithm was compared with centralized learning, in which the datasets of all clinics are combined into a single dataset. As expected, the centralized model produced the same result as the federated-learning model, because the two procedures are mathematically equivalent.
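The site-wise cross-validation described above can be sketched as follows. This is an illustration under stated assumptions, not the project's evaluation code: the data are synthetic, the helper names `fit_linear` and `auc` are hypothetical, and for brevity the training step is a pooled least-squares fit standing in for the federated training across the four training sites.

```python
import numpy as np

def fit_linear(X, y, lam=1.0):
    """Stand-in trainer (assumption): regularized least-squares classifier."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly,
    counting ties as half."""
    pos = scores[y_true == 1]
    neg = scores[y_true == -1]
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (len(pos) * len(neg))

# Synthetic stand-in for five hospitals (no real patient data).
rng = np.random.default_rng(1)
true_w = np.array([1.5, -2.0, 0.5, 1.0])
sites = []
for _ in range(5):
    X = rng.normal(size=(60, 4))
    y = np.sign(X @ true_w + 0.3 * rng.normal(size=60))  # labels in {-1, +1}
    sites.append((X, y))

# Train on four sites, validate on the fifth, and rotate the held-out site.
aucs = []
for held_out in range(len(sites)):
    train = [s for i, s in enumerate(sites) if i != held_out]
    X_train = np.vstack([X for X, _ in train])
    y_train = np.concatenate([y for _, y in train])
    w = fit_linear(X_train, y_train)
    X_val, y_val = sites[held_out]
    aucs.append(auc(y_val, X_val @ w))

print("AUC per held-out site:", np.round(aucs, 3))
print("mean AUC:", np.mean(aucs))
```

Holding out an entire site, rather than a random subset of patients, tests whether a model generalizes to a hospital it has never seen.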
The euroCat project using Federated Learning had a significant impact on the development of personalized medicine in radiation oncology. It demonstrated that it is possible to collaborate on a machine-learning model without sharing sensitive data, while still obtaining results that are the same as those achieved through centralized learning. Moreover, the project's approach enabled better and more personalized treatments for cancer patients, as the machine-learning models were trained on a combination of datasets that represented the variations across the entire population of cancer patients from these (five) hospitals.
The solution is used in several clinics across Europe. The methodology is also applicable in any sector where sharing data between independent parties is not an option, such as healthcare, banking, and insurance.