Abstract
With the growth of the Internet, data have become increasingly complex and increasingly high-dimensional. High-dimensional data are now common in practical applications, and discovering relationships within them is increasingly important. This paper introduces the application background and research significance of high-dimensional data processing, and analyzes the current state of research on clustering analysis and common dimensionality reduction methods for such data, together with their open problems. It then summarizes the core ideas, development status, and shortcomings of three main dimensionality reduction algorithms: principal component analysis (PCA), linear discriminant analysis (LDA), and locally linear embedding (LLE). Finally, it identifies several issues that still require attention in high-dimensional data clustering analysis and outlines directions for future work.
Authors
孙洁丽
刘沛
翟浩文
SUN Jie-li; LIU Pei; ZHAI Hao-wen (College of Information Technology, Hebei University of Economics and Business, Shijiazhuang, Hebei 050061, China)
Source
《河北省科学院学报》
CAS
2022, Issue 5, pp. 1-6 (6 pages)
Journal of The Hebei Academy of Sciences
Funding
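Key Research and Development Program of the Hebei Provincial Department of Science and Technology (21327401D)
Hebei Provincial Department of Education project (2020GJJG140)
Hebei University of Economics and Business research project (2020PY10)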
Keywords
High-dimensional data
Clustering analysis
Data dimension reduction
PCA
LDA
LLE