A Classification Method for Imbalanced Data Combining Pruning and Undersampling (cited by: 4)

Published English title: Pruning and undersampling combination of imbalanced data classification method
Abstract: This paper combines pruning with undersampling to select representative training data, with the aim of improving classification accuracy on the minority class, and studies the effect of undersampling on imbalanced data sets. The experimental results show that, compared with direct undersampling, the proposed algorithm not only improves accuracy but, more importantly, substantially improves the g-means value. The improvement is larger on data sets with a higher imbalance ratio.
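The abstract reports both accuracy and g-means but this record does not define either. The sketch below assumes the usual definition of g-means as the geometric mean of the true-positive rate and the true-negative rate, and uses made-up toy labels (95 majority, 5 minority) that are not from the paper. It only illustrates why g-means is the more telling number on imbalanced data; it is not the authors' algorithm, and the helper names `accuracy` and `g_mean` are hypothetical.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Toy data (illustrative only): 95 majority-class (0) and 5 minority-class (1) samples.
y_true = [0] * 95 + [1] * 5

# Classifier A always predicts the majority class.
pred_majority = [0] * 100

# Classifier B gets 85 of 95 majority and 4 of 5 minority samples right.
pred_balanced = [0] * 85 + [1] * 10 + [1] * 4 + [0] * 1

for name, pred in [("always-majority", pred_majority), ("balanced", pred_balanced)]:
    print(f"{name}: accuracy={accuracy(y_true, pred):.3f}, g-means={g_mean(y_true, pred):.3f}")
```

On this toy data the always-majority classifier has the higher accuracy, yet its g-means is zero because it never detects the minority class. That gap is why a method aimed at the minority class is judged mainly by g-means.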
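Authors: Zhang Jian (张健), Fang Hongbin (方宏彬)
Source: Application Research of Computers (《计算机应用研究》), CSCD, Peking University Core Journal, 2012, No. 3, pp. 847-848 (2 pages)
Funding: National Natural Science Foundation of China (71071002); Natural Science Foundation of the Anhui Provincial Department of Education (05010428); Anhui University Talent Team Building Project; Anhui University Academic Innovation Team Project (KJTD001B)
Keywords: machine learning; imbalanced data sets; pruning techniques; under-sampling; cross-validation; AdaBoost algorithm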