
An SVM Classifier for Imbalanced Datasets Based on SMOTEBoost (cited by: 14)
Abstract: Most real-world data mining problems involve imbalanced datasets, in which the classes are unevenly distributed and the class of interest is usually the minority class. A model trained on a dataset that contains only a few minority-class examples typically predicts the majority class with high accuracy but predicts the minority class poorly. This paper proposes an ensemble method, SMOTEBoostSVM, which uses SMOTE to generate synthetic minority-class samples, uses SVM, chosen for its strong classification and generalization performance, as the weak classifier, and builds the ensemble classifier with the AdaBoost algorithm. Experimental results show that the SMOTEBoostSVM ensemble classifier classifies imbalanced datasets better than classifiers that use SMOTE, AdaBoost, or SVM alone.
Source: Systems Engineering (《系统工程》), CSCD, Peking University Core Journal, 2008, No. 5, pp. 116-119 (4 pages)
Keywords: SMOTE; AdaBoost; support vector machine (SVM); imbalanced datasets
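
The SMOTE step mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: it covers only the oversampling idea (interpolating between a minority sample and one of its nearest minority neighbours) and leaves out the AdaBoost and SVM stages. The function name `smote`, its parameters, and the toy data are assumptions made for illustration.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (Euclidean distance).

    Hypothetical helper for illustration only, not the paper's code.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by distance to the base sample.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random position on the segment from base to neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy minority class with four 2-D samples (made-up values).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_synthetic=6, k=2)
```

Each synthetic point lies on a line segment between two existing minority samples, so the new samples stay inside the region the minority class already occupies. In a boosting setup of the kind the abstract describes, such samples would be added to the training set before the weak classifiers are fitted.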

References (9)

  • 1 Kubat M, Holte R C, Matwin S. Machine learning for the detection of oil spills in satellite radar images[J]. Machine Learning, 1998, 30(2): 195-215.
  • 2 Wilson D R, Martinez T R. Reduction techniques for instance-based learning algorithms[J]. Machine Learning, 2000, 38(3): 257-286.
  • 3 Guo H Y, Viktor H L. Learning from imbalanced data sets with boosting and data generation: the DataBoost-IM approach[J]. SIGKDD Explorations, 2004, 6(1): 30-39.
  • 4 Daskalaki S, et al. Evaluation of classifiers for an uneven class distribution problem[J]. Applied Artificial Intelligence, 2006, 20(5): 381-417.
  • 5 Freund Y, et al. A short introduction to boosting[J]. Journal of Japanese Society for Artificial Intelligence, 1999, 14(5): 771-780.
  • 6 Chawla N, et al. SMOTE: synthetic minority over-sampling technique[J]. Journal of Artificial Intelligence Research, 2002, 16: 321-357.
  • 7 Wu G, Chang E Y. Class-boundary alignment for imbalanced dataset learning[A]. ICML 2003 Workshop on Learning from Imbalanced Data Sets II[C]. Washington, D.C., 2003.
  • 8 Cristianini N, Shawe-Taylor J. An introduction to support vector machines and other kernel-based learning methods[M]. Cambridge, UK: Cambridge University Press, 2000.
  • 9 Valentini G, Dietterich T G. Bias-variance analysis of support vector machines for the development of SVM-based ensemble methods[J]. Journal of Machine Learning Research, 2004, 5: 725-775.

Co-cited documents: 120

Citing documents: 14

Second-level citing documents: 98
