
Feature ordering in feature extraction of star plots

Cited by: 4
Abstract: Based on the star plot (radar chart) representation of multivariate data, this paper proposes barycentre graphical features of the star plot. For the same multivariate data, different feature orderings produce different star plots and therefore different barycentre features, which in turn affect classifier performance. This raises a new problem: feature ordering in the graphical feature extraction of star plots. To address it, a feature ordering method based on an improved genetic algorithm is proposed. The traditional ranking-based feature selection method is also studied and improved for comparison. Classification experiments on 9 real machine learning data sets show two things. First, barycentre features under the original feature ordering are not always better than traditional feature extraction methods, whereas barycentre features under the genetic-algorithm ordering outperform them. Second, barycentre features under the genetic-algorithm ordering outperform barycentre features under the traditional ranking-based feature selection ordering. In particular, on the high-dimensional, small-sample lung cancer data set, the leave-one-out cross-validation error rate is 12.5%. The classification results on the breast-cancer-Wisconsin and Pima Indians diabetes data sets are better than those previously reported in the international literature.
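The abstract does not give the formula for the barycentre feature, so the following is only a minimal sketch under an assumed definition: each feature value is plotted on its own evenly spaced axis, each pair of adjacent axes forms a triangle with the origin, and the feature is the distance of that triangle's centroid from the origin. The function names `star_plot_vertices` and `barycentre_features` and the sample values are hypothetical. The point of the sketch is only that the same sample yields different features under different orderings.

```python
import math


def star_plot_vertices(values, order):
    """Place each feature value on its own axis; axes are evenly spaced in angle."""
    n = len(order)
    vertices = []
    for k, idx in enumerate(order):
        angle = 2.0 * math.pi * k / n
        r = values[idx]
        vertices.append((r * math.cos(angle), r * math.sin(angle)))
    return vertices


def barycentre_features(values, order):
    """Assumed definition (not taken from the paper): for each pair of adjacent
    axes, form the triangle (origin, vertex k, vertex k+1) and return the
    distance of its centroid from the origin."""
    vertices = star_plot_vertices(values, order)
    n = len(vertices)
    features = []
    for k in range(n):
        x1, y1 = vertices[k]
        x2, y2 = vertices[(k + 1) % n]  # wrap around to close the polygon
        # Centroid of a triangle whose third corner is the origin (0, 0).
        cx, cy = (x1 + x2) / 3.0, (y1 + y2) / 3.0
        features.append(math.hypot(cx, cy))
    return features


# One hypothetical sample with four features scaled to [0, 1].
sample = [0.2, 0.9, 0.4, 0.7]
original = barycentre_features(sample, [0, 1, 2, 3])
reordered = barycentre_features(sample, [0, 2, 1, 3])
print(original)
print(reordered)
```

Swapping two axes changes which values are adjacent, so the set of triangles, and with it the feature vector, changes. An ordering search such as the genetic algorithm named in the abstract would score candidate permutations by the classification error obtained from features like these.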
Source: Journal of Yanshan University (《燕山大学学报》), CAS, 2008, No. 5, pp. 421-428 (8 pages)
Funding: National Natural Science Foundation of China (grants 60474065, 60504035, 60605006)
Keywords: feature extraction; feature ordering; feature selection; genetic algorithm; star plot; pattern recognition

