Abstract
This paper combines principal component analysis (PCA) with the self-organizing feature map (SOM) neural network for clustering analysis of gene expression data. First, PCA is applied to the gene data set to extract a small number of principal components, which reduces the dimensionality of the data. These principal components largely preserve the overall information in the original data set. Second, the SOM network clusters the resulting feature components, so that similar genes are grouped into the same region of the map. Experimental results show that, compared with using an SOM network alone, the combined PCA-SOM method achieves a higher clustering accuracy and clearer cluster boundaries, making it an effective method for clustering gene expression data.
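The two-stage pipeline described in the abstract (PCA for dimensionality reduction, then SOM for clustering) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names (`pca_reduce`, `train_som`, `assign_units`), the 3x3 grid, the linear decay schedules, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)                      # center each feature
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T              # scores on leading components

def train_som(data, grid=(3, 3), n_iter=2000, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small rectangular SOM; returns weights of shape (rows*cols, dim)."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    # Grid coordinates of each map unit, used by the neighborhood function.
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)],
                      dtype=float)
    # Initialize unit weights from randomly chosen samples (an assumption).
    weights = data[rng.integers(0, len(data), rows * cols)].astype(float)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 1e-3     # linearly shrinking neighborhood
        x = data[rng.integers(0, len(data))]
        # Best-matching unit: the unit whose weight vector is closest to x.
        bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
        d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
        h = np.exp(-d2 / (2.0 * sigma ** 2))     # Gaussian neighborhood
        weights += lr * h[:, None] * (x - weights)
    return weights

def assign_units(data, weights):
    """Return the index of the best-matching unit for every sample."""
    d = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

# Synthetic stand-in for a gene expression matrix (90 genes x 20 conditions).
rng = np.random.default_rng(42)
centers = rng.normal(scale=5.0, size=(3, 20))
X = np.vstack([c + rng.normal(size=(30, 20)) for c in centers])

Z = pca_reduce(X, n_components=2)    # stage 1: keep two principal components
W = train_som(Z, grid=(3, 3))        # stage 2: train the SOM on the reduced data
labels = assign_units(Z, W)          # map each gene to a SOM unit
```

Genes that land on the same or neighboring map units are treated as similar. In practice the number of retained components and the map size would be chosen from the data, for example by the fraction of variance explained.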
Source
《软件导刊》 (Software Guide)
2013, No. 1, pp. 127-130 (4 pages)
Funding
National Natural Science Foundation of China (Grant No. 40872087)
Keywords
Principal component analysis
Self-organizing maps (SOM network)
Clustering analysis
Gene data