Abstract
To address the large data volume, constrained transmission links, complex internal environment, and low transmission efficiency of hydropower station monitoring images, a K-means clustering algorithm based on MapReduce and differential privacy is proposed. First, the MapReduce parallel framework of Hadoop is applied to the data extracted by the infrared system, converting the data set into key-value pairs to improve fault tolerance and transmission efficiency. Second, Laplace noise is added when the cluster centers are computed in each clustering iteration, providing privacy protection for the data. Third, during transmission, lossless or lossy compression with segmented transmission is selected according to the bandwidth of the transmission link, so that data are delivered in real time and intact. Finally, the algorithm is tested and evaluated in the infrared monitoring system of the cascade hydropower stations in the Jilintai basin, where the MapReduce-based differentially private clustering, aggregation, and transmission algorithm is compared with the traditional K-means clustering algorithm on statistical metrics under different privacy budgets. The results show that the improved algorithm achieves a clustering availability of up to 94.5%, clusters the data better, and aggregates and transmits data more effectively.
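The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea it describes: a map step that emits key-value pairs (cluster index, point), a reduce step that aggregates per-cluster sums and counts, and Laplace noise added before the centers are recomputed. Everything specific here is an assumption for illustration: the function names (`laplace`, `map_phase`, `reduce_phase`, `dp_kmeans`) are hypothetical, the data are assumed to be normalized to [0, 1] in every dimension, the privacy budget is split evenly across iterations, and plain in-memory Python stands in for Hadoop.

```python
import random


def laplace(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def map_phase(points, centers):
    """Map: emit (index of nearest center, (point, 1)) for every point."""
    for p in points:
        dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
        yield dists.index(min(dists)), (p, 1)


def reduce_phase(pairs, k, dim):
    """Reduce: accumulate per-cluster coordinate sums and point counts."""
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for key, (p, n) in pairs:
        counts[key] += n
        for j, v in enumerate(p):
            sums[key][j] += v
    return sums, counts


def dp_kmeans(points, k, epsilon, iterations=5, seed=0):
    """K-means whose center updates use Laplace-perturbed sums and counts."""
    rng = random.Random(seed)
    dim = len(points[0])
    centers = [list(p) for p in rng.sample(points, k)]
    # Assumption: points lie in [0, 1]^dim, so one record changes the
    # coordinate sums by at most dim and the count by 1 (sensitivity dim + 1).
    # Assumption: the budget epsilon is divided evenly over the iterations.
    scale = (dim + 1) / (epsilon / iterations)
    for _ in range(iterations):
        sums, counts = reduce_phase(map_phase(points, centers), k, dim)
        for i in range(k):
            noisy_count = counts[i] + laplace(scale, rng)
            if noisy_count < 1.0:
                continue  # too few points to trust: keep the previous center
            centers[i] = [
                min(1.0, max(0.0, (s + laplace(scale, rng)) / noisy_count))
                for s in sums[i]
            ]
    return centers


if __name__ == "__main__":
    data_rng = random.Random(1)
    data = [(data_rng.uniform(0.1, 0.3), data_rng.uniform(0.1, 0.3)) for _ in range(200)]
    data += [(data_rng.uniform(0.7, 0.9), data_rng.uniform(0.7, 0.9)) for _ in range(200)]
    print(dp_kmeans(data, k=2, epsilon=5.0))
```

A smaller `epsilon` gives a larger noise scale and therefore stronger privacy at the cost of less accurate centers, which is the trade-off behind comparing results under different privacy budgets.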
Authors
ZHANG Kefeng (张科峰)
MA Wenhua (马文华)
ZHENG Qingming (郑庆明)
LI Yunhong (李云红)
LI Limin (李丽敏)
SU Xueping (苏雪平)
FENG Zhunruo (冯准若)
Affiliations: CHN ENERGY Xinjiang Jilintai Hydropower Development Co., Ltd., Ili 835000, China; School of Electronics and Information, Xi'an Polytechnic University, Xi'an 710048, China
Source
Journal of Xi'an University of Technology (《西安理工大学学报》)
Peking University Core Journal (北大核心)
2025, No. 3, pp. 381-389 (9 pages)
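Funding
National Natural Science Foundation of China (62203344)
Key Project of the Natural Science Basic Research Program of Shaanxi Province (2022JZ-35)
Youth Innovation Team Project of Shaanxi Universities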
Keywords
hydropower station
infrared monitoring system
differential privacy
data transmission
clustering algorithm