Received: November 24, 2021; Revised: December 20, 2021
Abstract: Publishing and analyzing multidimensional data can create substantial value, but privacy leaks often occur during data collection. Traditional centralized differential privacy requires a fully trusted third-party data collector, which is hard to find in practice. Moreover, as the number of attribute dimensions grows, the collector's refinement step (computing the joint distribution) becomes a pressing problem in its own right. To address these problems, this study proposes a local differential privacy (LDP) algorithm for multi-valued data (RR-LDP), which introduces unary encoding and instantaneous randomized response to protect personal privacy during data collection while reducing communication overhead. Building on the expectation maximization (EM) algorithm and the LASSO regression model, the study further proposes LREMH, an efficient joint distribution estimation algorithm for multidimensional data that satisfies LDP. LREMH uses the LASSO regression model to estimate the initial value and then refines it iteratively with the EM algorithm. Theoretical analysis and experimental results show that LREMH strikes a balance between accuracy and efficiency.
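The abstract names unary encoding and randomized response but does not spell out the mechanism. As a rough illustration only, the following is a minimal sketch of a generic symmetric unary-encoding scheme for a single categorical attribute, followed by the standard unbiased frequency correction. It is not the paper's RR-LDP or LREMH: the function names, the choice of keep probability e^(ε/2) / (e^(ε/2) + 1), and the simulation helper are all assumptions made for this example, and the LASSO and EM joint-distribution steps are not shown.

```python
import math
import random


def unary_encode(value, domain_size):
    """One-hot encode a category index into a bit vector."""
    bits = [0] * domain_size
    bits[value] = 1
    return bits


def keep_probability(epsilon):
    """Probability of reporting a bit unchanged (assumed symmetric scheme).

    Two one-hot vectors differ in exactly two positions, so keeping each
    bit with probability e^(eps/2) / (e^(eps/2) + 1) bounds the likelihood
    ratio of any report by e^eps.
    """
    half = math.exp(epsilon / 2)
    return half / (half + 1)


def perturb(bits, epsilon, rng):
    """Apply randomized response independently to every bit."""
    p = keep_probability(epsilon)
    return [b if rng.random() < p else 1 - b for b in bits]


def estimate_frequencies(reports, epsilon):
    """Unbiased per-category frequency estimate from perturbed reports."""
    n = len(reports)
    k = len(reports[0])
    p = keep_probability(epsilon)
    q = 1 - p
    ones = [sum(report[j] for report in reports) for j in range(k)]
    # E[ones_j / n] = f_j * p + (1 - f_j) * q, solved for f_j.
    return [(count / n - q) / (p - q) for count in ones]


def simulate(true_freqs, n_users, epsilon, seed=0):
    """Hypothetical helper: draw users, perturb locally, then estimate."""
    rng = random.Random(seed)
    k = len(true_freqs)
    values = rng.choices(range(k), weights=true_freqs, k=n_users)
    reports = [perturb(unary_encode(v, k), epsilon, rng) for v in values]
    return estimate_frequencies(reports, epsilon)
```

Each user perturbs their own bit vector before sending it, so the collector never sees raw values. The estimates are noisy and need not sum to exactly 1, which is one reason a refinement step on the collector side is needed.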
Keywords: multidimensional data; localized differential privacy; expectation maximization (EM) algorithm; LASSO regression; joint distribution estimation; privacy protection; random response
Funding: National Natural Science Foundation of China (62062020, 62002081)
Citation:
CHU Xue-Jun, LONG Shi-Gong, LIU Hai. Joint Distribution Estimation for Multidimensional Data Based on LDP. COMPUTER SYSTEMS APPLICATIONS, 2022, 31(8): 230-238