[4Xin1-09] Comparison of Communication Cost and Accuracy in Federated Learning and Data Collaboration Analysis
Assuming Integrated Data Analysis of University of Tsukuba Hospital and Tsukuba City Hall
Keywords: Data collaboration analysis, Federated learning, Distributed data, Privacy preserving, Medical data
Federated Learning (FL) has typically been studied under the assumption of a constant Internet connection, but institutions holding sensitive distributed data often cannot stay connected. Even in this situation, FL can perform integrated data analysis by using different communication methods. Data Collaboration (DC) analysis has been proposed for integrated data analysis without a permanent Internet connection or sharing of private data. This study constructs a machine learning model to predict fasting blood glucose using medical data from the University of Tsukuba Hospital and Tsukuba City Hall, where raw data cannot be shared and a constant Internet connection is not available. The accuracy and computational cost of the following three methods are compared: individual analysis at each institution, FL without a constant Internet connection, and DC. The results show that FL and DC generally improve prediction accuracy compared with individual analysis. DC is twice as computationally intensive as FL but achieves the same accuracy in one third of the time.
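As a rough illustration of the FL side of this comparison, the sketch below runs generic federated averaging for a linear model across two simulated institutions. It is not the study's implementation: the data are synthetic, the model is plain least squares, and the names `local_update`, `fedavg_round`, and `make_client` are hypothetical. It only shows the general pattern in which each site trains on its own data and exchanges model weights rather than raw records, so every round costs one exchange of weights. DC is not shown here.

```python
import numpy as np

def local_update(w, X, y, lr=0.1, steps=5):
    """Run a few gradient-descent steps on one institution's private data."""
    w = w.copy()
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
        w -= lr * grad
    return w

def fedavg_round(w_global, clients, lr=0.1, steps=5):
    """One round: each client trains locally; only weights leave the site."""
    total = sum(len(y) for _, y in clients)
    updates = [local_update(w_global, X, y, lr, steps) for X, y in clients]
    # Average the local weights, weighted by each client's sample count.
    return sum(len(y) / total * w for (_, y), w in zip(clients, updates))

def make_client(rng, n, w_true):
    """Synthetic stand-in for one institution's data (assumption, not real data)."""
    X = rng.normal(size=(n, w_true.size))
    y = X @ w_true + 0.01 * rng.normal(size=n)
    return X, y

rng = np.random.default_rng(0)
w_true = np.array([1.5, -2.0, 0.5])
clients = [make_client(rng, 200, w_true), make_client(rng, 50, w_true)]

w = np.zeros(3)
for _ in range(50):  # each iteration corresponds to one exchange of weights
    w = fedavg_round(w, clients)
```

In a setting without a constant connection, the weights passed between rounds would have to be moved by whatever intermittent channel is available, which is why the number of rounds matters for the total cost.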