Abstract
In existing privacy-preserving classification mining algorithms based on data perturbation, the perturbed data remain correlated with the original data, so private data are not fully protected. In addition, the perturbation algorithm and the classification algorithm are tightly coupled, which makes these methods impractical. To address these problems, this paper proposes a privacy-preserving classification mining algorithm based on probability theory. Perturbation yields a data set that is independent of the original data and identically distributed with it, so the perturbed data are no longer linked to the original data and any classification algorithm can be applied to the perturbed data directly.
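The abstract does not spell out the perturbation procedure, so the following is only a minimal sketch of the general idea it describes: estimate a distribution from the original records, then release fresh samples drawn from that estimate instead of the records themselves. It assumes a single numeric attribute and a class-conditional Gaussian model; the function name `perturb`, the Gaussian assumption, and the example data are all hypothetical and are not taken from the paper.

```python
import random
import statistics
from collections import defaultdict


def perturb(records, seed=None):
    """Return synthetic (value, label) pairs drawn from a per-class Gaussian.

    records: list of (numeric_value, class_label) pairs.
    Assumption (not from the paper): each class is modelled by a Gaussian
    whose mean and standard deviation are estimated from the original data.
    """
    rng = random.Random(seed)

    # Group the original attribute values by class label.
    by_label = defaultdict(list)
    for value, label in records:
        by_label[label].append(value)

    perturbed = []
    for label, values in by_label.items():
        mu = statistics.fmean(values)
        sigma = statistics.pstdev(values)
        # Draw as many fresh values as the class originally had, so the
        # class proportions of the released data match the original.
        perturbed.extend((rng.gauss(mu, sigma), label) for _ in values)

    # Shuffle so that output order carries no information about input order.
    rng.shuffle(perturbed)
    return perturbed


original = [(30.0, "low"), (32.0, "low"), (31.0, "low"),
            (60.0, "high"), (62.0, "high"), (61.0, "high")]
released = perturb(original, seed=0)
```

Because `released` has the same record layout as `original`, an off-the-shelf classifier such as a decision tree could be trained on it without modification, which is the decoupling the abstract emphasizes. Note that samples drawn from an estimated distribution only approximate the independent, identically distributed property claimed for the paper's method.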
Source
《计算机工程》 (Computer Engineering)
CAS
CSCD
2012, Issue 3, pp. 12-13, 18 (3 pages)
Funding
National High Technology Research and Development Program of China ("863" Program), grant 2007AA02Z329
Keywords
data mining
privacy protection
data perturbation
random noise
decision tree