
Improved XGBoost algorithm based on mean decrease impurity (cited by: 2)
Abstract: To address the drop in classification accuracy and efficiency when the XGBoost algorithm is applied to high-dimensional datasets, this paper proposes an improved XGBoost algorithm based on the mean decrease impurity algorithm, and designs a frequency algorithm to resolve the randomness in the feature-importance ranking that mean decrease impurity produces. Experimental results show that the prediction efficiency and accuracy of the method are better than those of the unmodified XGBoost algorithm, and that mean decrease impurity also outperforms the variance-based feature selection method. The proposed classification method therefore achieves higher accuracy and efficiency.
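The abstract does not spell out the frequency algorithm, so the following is only a minimal sketch under stated assumptions: the mean decrease impurity (MDI) ranking is recomputed over several randomized runs, and the features that land in the top k most often are kept. To stay dependency-free, the forest here is built from depth-1 trees (stumps) with Gini impurity, which is a simplification rather than the authors' setup. All function names and parameters (`frequency_select`, `n_runs`, `max_features`, and so on) are hypothetical, and the final step of training XGBoost on the selected columns is not shown.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split_decrease(X, y, feature):
    """Largest Gini decrease from a single threshold split on `feature`."""
    parent, n, best = gini(y), len(y), 0.0
    values = sorted(set(row[feature] for row in X))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [lab for row, lab in zip(X, y) if row[feature] <= t]
        right = [lab for row, lab in zip(X, y) if row[feature] > t]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


def mdi_importance(X, y, n_trees, max_features, rng):
    """MDI over a forest of stumps: average impurity decrease credited to each feature."""
    n, d = len(X), len(X[0])
    totals = [0.0] * d
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]      # bootstrap sample
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        candidates = rng.sample(range(d), max_features)  # random feature subset
        decreases = {f: best_split_decrease(Xb, yb, f) for f in candidates}
        chosen = max(decreases, key=decreases.get)       # the stump's split feature
        totals[chosen] += decreases[chosen]
    return [t / n_trees for t in totals]


def frequency_select(X, y, k, n_runs=10, n_trees=20, max_features=2, seed=0):
    """Assumed frequency vote: count how often each feature reaches the top k."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_runs):
        imp = mdi_importance(X, y, n_trees, max_features, rng)
        top_k = sorted(range(len(imp)), key=lambda f: imp[f], reverse=True)[:k]
        votes.update(top_k)
    ranked = sorted(votes, key=lambda f: (-votes[f], f))  # ties go to the lower index
    return ranked[:k], votes


def make_toy_data(n=40, n_noise=4, seed=1):
    """Toy data: feature 0 separates the two classes, the rest are pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [label + rng.uniform(-0.3, 0.3)]
        row += [rng.random() for _ in range(n_noise)]
        X.append(row)
        y.append(label)
    return X, y


X, y = make_toy_data()
selected, votes = frequency_select(X, y, k=2)
print("selected feature indices:", selected)
print("top-k vote counts:", dict(votes))
```

A single MDI run can rank weak features differently from one random seed to the next; counting top-k appearances across runs is one simple way to damp that variation before the reduced feature set is handed to the classifier.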
Authors: DU Jun-jie; ZHU Yong-zhong; DING Gen-hong (School of Science, Hohai University, Nanjing 211100, China)
Affiliation: School of Science, Hohai University
Source: Information Technology (《信息技术》), 2019, No. 9, pp. 1-4 (4 pages)
Funding: Fundamental Research Funds for the Central Universities (JGLX19_030, 2019B80014)
Keywords: XGBoost; high-dimensional datasets; mean decrease impurity; ensemble learning
