
Constructing Ensembles of Bayes-Based Classifiers Using PCA and AdaBoost
Cited by: 6
Abstract: We present PCABoost, a new method for constructing ensembles from Bayes-based base classifiers. To create a training set, the method randomly splits the feature set into K subsets and applies principal component analysis (PCA) to each subset to obtain its principal components. The principal components of all subsets together form a new feature space, and the entire training data is mapped into this space to serve as a new training set. Different transformations yield different feature spaces and therefore several diverse training sets. On each new training set, AdaBoost builds a sequence of successively boosted Bayes-based classifiers (a classifier group), which gives several diverse classifier groups. At classification time, each group first produces one prediction by weighted voting among its members, and the groups' predictions are then combined by voting to give the ensemble's final result, so the ensemble has two levels of combination. Experiments were carried out on 30 benchmark datasets randomly selected from the UCI Machine Learning Repository. The results indicate that the method not only significantly improves the classification performance of Bayes-based classifiers but also achieves higher classification accuracy than other ensemble methods such as Rotation Forest and AdaBoost on most of the datasets.
Authors: Chen Songfeng (陈松峰), Fan Ming (范明)
Source: Computer Science (《计算机科学》; indexed in CSCD and the Peking University Core Journals list), 2010, No. 8, pp. 236-239, 256 (5 pages)
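Funding: Supported by the National Natural Science Foundation of China (Grant No. 60773048) and the National Science and Technology Support Program of the 11th Five-Year Plan (Grant No. 2006BAF01A00)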
Keywords: Classifier ensemble, Principal component analysis, AdaBoost, Bayes
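
The two-level procedure described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not specify: the base learner is taken to be a Gaussian naive Bayes classifier, boosting follows the multi-class SAMME variant of AdaBoost, every principal component of each feature subset is kept, and the groups are combined by an unweighted majority vote. All function names and default parameter values are hypothetical.

```python
import numpy as np

def fit_rotation(X, k, rng):
    """Randomly split the features into k subsets and run PCA on each one."""
    subsets = np.array_split(rng.permutation(X.shape[1]), k)
    mean = X.mean(axis=0)
    blocks = []
    for idx in subsets:
        # Rows of vt are the principal axes of the centered feature subset.
        _, _, vt = np.linalg.svd(X[:, idx] - mean[idx], full_matrices=False)
        blocks.append((idx, vt.T))
    return mean, blocks

def transform(X, rotation):
    """Map data into the concatenated principal-component space."""
    mean, blocks = rotation
    return np.hstack([(X[:, idx] - mean[idx]) @ comps for idx, comps in blocks])

def fit_nb(X, y, w, n_classes):
    """Weighted Gaussian naive Bayes (assumed base learner); y holds class indices."""
    params = []
    for c in range(n_classes):
        Xc, wc = X[y == c], w[y == c]
        total = wc.sum()
        mu = (wc[:, None] * Xc).sum(axis=0) / total
        var = (wc[:, None] * (Xc - mu) ** 2).sum(axis=0) / total + 1e-6
        params.append((np.log(total / w.sum()), mu, var))
    return params

def predict_nb(params, X):
    scores = [lp - 0.5 * (np.log(2 * np.pi * var) + (X - mu) ** 2 / var).sum(axis=1)
              for lp, mu, var in params]
    return np.argmax(np.stack(scores, axis=1), axis=1)

def fit_adaboost(X, y, n_classes, rounds):
    """Build one classifier group with SAMME-style boosting (an assumption)."""
    w = np.full(len(y), 1.0 / len(y))
    group = []
    for _ in range(rounds):
        model = fit_nb(X, y, w, n_classes)
        miss = predict_nb(model, X) != y
        err = w[miss].sum()
        if err >= 1 - 1 / n_classes:      # no better than chance: stop boosting
            if not group:
                group.append((1.0, model))
            break
        alpha = np.log((1 - err) / max(err, 1e-10)) + np.log(n_classes - 1)
        group.append((alpha, model))
        if err == 0:                      # perfect fit: nothing left to reweight
            break
        w *= np.exp(alpha * miss)         # raise the weight of misclassified samples
        w /= w.sum()
    return group

def predict_group(group, X, n_classes):
    """First level: weighted vote inside one classifier group."""
    votes = np.zeros((len(X), n_classes))
    for alpha, model in group:
        votes[np.arange(len(X)), predict_nb(model, X)] += alpha
    return votes.argmax(axis=1)

def fit_pcaboost(X, y, k=2, n_groups=5, rounds=10, seed=0):
    rng = np.random.default_rng(seed)
    classes, y_idx = np.unique(y, return_inverse=True)
    groups = []
    for _ in range(n_groups):
        rotation = fit_rotation(X, k, rng)   # a different random split per group
        group = fit_adaboost(transform(X, rotation), y_idx, len(classes), rounds)
        groups.append((rotation, group))
    return classes, groups

def predict_pcaboost(model, X):
    """Second level: unweighted vote over the groups' predictions."""
    classes, groups = model
    votes = np.zeros((len(X), len(classes)))
    for rotation, group in groups:
        pred = predict_group(group, transform(X, rotation), len(classes))
        votes[np.arange(len(X)), pred] += 1
    return classes[votes.argmax(axis=1)]
```

Calling `fit_pcaboost(X, y)` on a numeric feature matrix and a label vector returns a model that `predict_pcaboost(model, X_new)` can apply. Because naive Bayes is not invariant to rotations of the feature space, each random split gives the boosted group a different view of the data, which is the source of diversity the abstract describes.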

