Received: April 16, 2022 Revised: May 22, 2022
Abstract: When predicting disease scores from multi-source domain data under user-privacy constraints, data from different source domains are stored in a decentralized manner, cannot be merged, and may follow different distributions. Traditional machine learning methods therefore cannot make proper use of the information in the source domains. This study combines the idea of federated learning with the sample-based transfer learning approach and proposes the federated importance weighting method. By reusing re-weighted samples from the source domains for the prediction task of the target domain, without any data sharing between source domains, the method exploits information from multiple source domains with different distributions to improve prediction accuracy in the target domain while protecting the data privacy of the source domains. Based on this method, the study constructs a weighted model and provides a concise and general algorithm for solving the prediction model of the target domain. Numerical simulations and empirical results show that, compared with traditional methods that do not account for distribution shift, the federated importance weighting method makes effective use of the multi-source domain data, achieves better prediction accuracy in the target domain, and yields accurate disease score predictions on Parkinson's disease data.
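The general idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a one-dimensional covariate, a linear prediction model, and covariate densities that are known Gaussians, so that the importance weight of a sample is simply the ratio of the target density to the source density. The function names (`gaussian_pdf`, `local_weighted_stats`, `federated_weighted_fit`) and all numerical settings are hypothetical. Each source site computes weighted summary statistics locally and shares only those aggregates, never its raw samples.

```python
import numpy as np


def gaussian_pdf(x, mean, std):
    """Density of N(mean, std^2), evaluated elementwise."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def local_weighted_stats(x, y, src_mean, src_std, tgt_mean, tgt_std):
    """Runs at one source site and returns aggregated statistics only.

    Assumption: source and target covariate densities are known Gaussians,
    so the importance weight is their pointwise ratio.
    """
    w = gaussian_pdf(x, tgt_mean, tgt_std) / gaussian_pdf(x, src_mean, src_std)
    X = np.column_stack([np.ones_like(x), x])  # design matrix with intercept
    XtWX = X.T @ (w[:, None] * X)              # 2 x 2 weighted Gram matrix
    XtWy = X.T @ (w * y)                       # length-2 weighted moment vector
    return XtWX, XtWy


def federated_weighted_fit(stats):
    """Runs at the server: sums site statistics, solves the normal equations."""
    A = sum(s[0] for s in stats)
    b = sum(s[1] for s in stats)
    return np.linalg.solve(A, b)


# Hypothetical data: two source sites with shifted covariate distributions,
# and a target domain whose covariate follows N(0, 0.5^2).
rng = np.random.default_rng(0)
x1 = rng.normal(-1.0, 1.0, 500)
y1 = 1.0 + 2.0 * x1 + rng.normal(0.0, 0.1, 500)
x2 = rng.normal(1.0, 1.0, 500)
y2 = 1.0 + 2.0 * x2 + rng.normal(0.0, 0.1, 500)

stats = [
    local_weighted_stats(x1, y1, -1.0, 1.0, 0.0, 0.5),
    local_weighted_stats(x2, y2, 1.0, 1.0, 0.0, 0.5),
]
beta = federated_weighted_fit(stats)  # [intercept, slope] for the target domain
print(beta)
```

In this sketch only two small arrays per site cross the network, which is what keeps the raw source data private. In practice the densities are unknown and the weights would have to be estimated, which the sketch does not attempt.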
Keywords: federated learning; transfer learning; importance weighting; weighted model; disease scores; machine learning; privacy protection
Funding: National Natural Science Foundation of China (71873128, 12171451)
Citation:
XU Ya-Qian, CUI Wen-Quan, CHENG Hao-Yang. Disease Scores Predicting Based on Federated Learning and Importance Weighting. COMPUTER SYSTEMS APPLICATIONS, 2022, 31(12): 375-382