摘要
针对现有K-Means聚类安全外包方案计算和通信开销高,难以满足实际应用对高效率需求的问题,提出一种基于稀疏矩阵变换和有界随机扰动的隐私保护K-Means聚类外包方案。首先,利用Gram-Schmidt正交化构造稀疏密钥矩阵,实现对明文数据的高效正交变换,有效隐藏明文数据的数值特征;其次,引入服从高斯分布的有界随机扰动,保护明文数据点之间的距离信息,增强用户数据的安全性;最后,结合局部敏感哈希设计近似距离估计方法,在保证聚类准确的前提下降低外包方案的计算开销。理论分析表明,所提方案实现了正确性、安全性和高效性的设计目标。在多个真实数据集上的实验结果表明,相较于现有基于同态加密的K-Means聚类外包方案,所提方案在保持聚类准确的同时,显著降低了计算与通信开销。
To address the problem that existing secure outsourcing schemes for K-Means clustering incur high computational and communication overhead,making them difficult to satisfy the efficiency requirements of practical applications,a privacy-preserving K-Means clustering outsourcing scheme based on sparse matrix transformation and bounded random perturbation was proposed.Firstly,a sparse key matrix was constructed by using Gram-Schmidt orthogonalization to perform efficient orthogonal transformations on plaintext data,effectively hiding the numerical characteristics of the plaintext data.Secondly,bounded random perturbations following a Gaussian distribution were introduced to protect the distance information between plaintext data points,enhancing the security of user data.Finally,an approximate distance estimation method was designed by combining locality sensitive hashing to reduce the computational overhead of the outsourcing scheme under the premise of ensuring clustering accuracy.Theoretical analysis demonstrates that the proposed scheme achieves the design goals of correctness,security and efficiency.Experimental results on multiple real-world datasets show that compared to existing K-Means clustering outsourcing schemes based on homomorphic encryption,the proposed scheme significantly reduces computational and communication overhead while maintaining clustering accuracy.
作者
赵韦
谭静文
王焕然
韩帅
杨武
赖明珠
Zhao Wei;Tan Jingwen;Wang Huanran;Han Shuai;Yang Wu;Lai Mingzhu(School of Computer Science and Technology,Harbin Engineering University,Harbin 150001,China;School of Mathematics and Statistics,Hainan Normal University,Haikou 571158,China)
出处
《通信学报》
北大核心
2026年第1期74-90,共17页
Journal on Communications
基金
国家自然科学基金资助项目(No.U22A2036,No.U21B2019,No.62272127,No.62572144)
黑龙江省自然科学基金资助项目(No.TD2022F001,No.LH2024F036)
海南省自然科学基金高层次人才基金资助项目(No.622RC672)。
关键词
K-MEANS聚类
矩阵变换
随机扰动
局部敏感哈希
外包计算
隐私保护
K-Means clustering
matrix transformation
random perturbation
locality sensitive hashing
outsourcing computation
privacy-preserving