Abstract
Traditional analysis systems for mixed-attribute big data sets produce results of poor accuracy. To address this problem, a new analysis system is designed based on a clustering objective function, with both its hardware and its software optimized. The system hardware consists of four parts: a web crawler module, a data processing module, a big data set analysis module, and a user management module. Big data sets are clustered using both single-machine clustering and distributed clustering. A web crawler program, a data processing program, and a data analysis program are designed on this hardware platform. To evaluate the system's performance, it is compared experimentally with a traditional analysis system. The results show that the system designed around the clustering objective function analyzes mixed-attribute big data sets with high accuracy, is safe and reliable, and has high practical value.
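The abstract does not state the form of the clustering objective function. As a minimal sketch only, the code below assumes a k-prototypes-style cost, a common choice for mixed-attribute data: squared Euclidean distance on the numeric attributes plus a weighted count of mismatches on the categorical attributes. The names `mixed_distance`, `assign`, `objective`, and the weight `gamma` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

# A record (or a cluster center) is a pair:
# (numeric attributes, categorical attributes).
Record = Tuple[Sequence[float], Sequence[str]]


def mixed_distance(record: Record, center: Record, gamma: float) -> float:
    """Assumed k-prototypes-style dissimilarity for mixed attributes."""
    num_r, cat_r = record
    num_c, cat_c = center
    # Numeric part: squared Euclidean distance.
    numeric = sum((a - b) ** 2 for a, b in zip(num_r, num_c))
    # Categorical part: number of attributes whose values differ.
    mismatches = sum(1 for a, b in zip(cat_r, cat_c) if a != b)
    # gamma (an assumed parameter) balances the two parts.
    return numeric + gamma * mismatches


def assign(records: List[Record], centers: List[Record], gamma: float) -> List[int]:
    """Assign each record to the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda k: mixed_distance(r, centers[k], gamma))
        for r in records
    ]


def objective(records: List[Record], centers: List[Record],
              labels: List[int], gamma: float) -> float:
    """Clustering objective: total dissimilarity of records to their centers."""
    return sum(mixed_distance(r, centers[k], gamma) for r, k in zip(records, labels))


if __name__ == "__main__":
    records = [((1.0, 2.0), ("a",)), ((1.0, 2.5), ("a",)), ((9.0, 9.0), ("b",))]
    centers = [((1.0, 2.0), ("a",)), ((9.0, 9.0), ("b",))]
    labels = assign(records, centers, gamma=1.0)
    print(labels, objective(records, centers, labels, gamma=1.0))
```

Under this assumption, a single-machine run evaluates `objective` over the whole data set, while a distributed run could compute the same sum per data partition and add the partial totals, since the cost is a plain sum over records.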
Authors
ZHAO Yunqiang (赵云强)
HAN Yi (韩翼)
CUI Huiru (崔慧茹)
ZHENG Lin (郑琳)
State Grid East Mongolia Material Supply Company, Hohhot 010020, China
Source
Electronic Design Engineering (《电子设计工程》)
2020, No. 4, pp. 73-76, 81 (5 pages)
Funding
Science and Technology Project of State Grid Corporation of China (PD72-17-006).
Keywords
clustering objective function
mixed attribute
big data set
analysis system
data analysis