Abstract
To prevent privacy disclosure caused by linking attacks while keeping the information loss of anonymization as low as possible, a (λα, k) multi-level anonymity model is proposed. The model divides sensitive attribute values into three levels (high, medium, and low) according to how much privacy protection each requires, and uses a privacy protection degree parameter λ to flexibly control the disclosure risk. On this basis, a clustering-based multi-level anonymization method is presented. The method uses a new hierarchical clustering algorithm and applies flexible generalization strategies to the numerical and categorical attributes in the quasi-identifier. Experimental results show that the method meets the multi-level anonymity requirements of sensitive attributes while effectively reducing information loss.
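The generalization step mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not define the hierarchical clustering procedure or the level-dependent constraint governed by λ, so neither is modelled here. The sketch assumes clusters are already formed, generalizes numerical quasi-identifiers to a min-max interval and categorical ones to the set of observed values, and checks only the plain size-k condition. The function names `generalize_cluster` and `satisfies_k` and the sample attributes are hypothetical.

```python
from typing import Dict, FrozenSet, List, Sequence, Tuple, Union

Value = Union[int, float, str]
Record = Dict[str, Value]
Generalized = Dict[str, Union[Tuple[Value, Value], FrozenSet[Value]]]


def generalize_cluster(
    records: Sequence[Record],
    numeric_attrs: Sequence[str],
    categorical_attrs: Sequence[str],
) -> Generalized:
    """Compute one shared generalized value per quasi-identifier for a cluster.

    Assumption (not from the paper): numerical attributes become a
    (min, max) interval, categorical attributes become the set of
    values observed in the cluster.
    """
    generalized: Generalized = {}
    for attr in numeric_attrs:
        values = [r[attr] for r in records]
        generalized[attr] = (min(values), max(values))
    for attr in categorical_attrs:
        generalized[attr] = frozenset(r[attr] for r in records)
    return generalized


def satisfies_k(clusters: Sequence[Sequence[Record]], k: int) -> bool:
    """Check the basic size condition: every cluster holds at least k records."""
    return all(len(cluster) >= k for cluster in clusters)


if __name__ == "__main__":
    # Hypothetical cluster of three records with two quasi-identifiers.
    cluster: List[Record] = [
        {"age": 25, "city": "Nanning"},
        {"age": 31, "city": "Guilin"},
        {"age": 28, "city": "Nanning"},
    ]
    print(generalize_cluster(cluster, ["age"], ["city"]))
    print(satisfies_k([cluster], k=3))
```

Wider intervals and larger value sets hide more detail, which is the information loss that the clustering step tries to keep small by grouping similar records.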
Source
Journal of Computer Applications (《计算机应用》)
CSCD
Peking University Core Journals (北大核心)
2013, No. 2, pp. 412-416 (5 pages)
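Funding
National Natural Science Foundation of China (61262075)
Major Scientific Research Project of Guangxi Higher Education Institutions (201201ZD012)
Scientific Research Project of the Education Department of Guangxi (200911LX119)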
Keywords
privacy preservation
data publishing
data anonymization
multi-level
clustering
information loss