Abstract
To address the drop in classification accuracy and efficiency when the XGBoost algorithm processes high-dimensional datasets, this paper proposes an improved XGBoost algorithm based on the mean decrease impurity algorithm, and designs a frequency algorithm to resolve the randomness in the feature-importance ranking produced by mean decrease impurity. Experimental results show that the method outperforms the unmodified XGBoost algorithm in both prediction efficiency and accuracy, and that the mean decrease impurity algorithm also outperforms the variance-based feature selection method. The proposed classification method therefore achieves higher accuracy and efficiency.
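The abstract does not spell out how the frequency algorithm works. As a rough illustration only, one plausible reading is a vote count: compute mean decrease impurity scores several times with different random seeds, take the top-k features from each run, and keep the features that appear most often. The function names `top_k` and `frequency_select`, the value of `k`, and the scores below are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Dict, List, Sequence


def top_k(importances: Dict[str, float], k: int) -> List[str]:
    """Return the k feature names with the largest importance scores."""
    # Ties are broken alphabetically so the result is deterministic.
    ranked = sorted(importances, key=lambda name: (-importances[name], name))
    return ranked[:k]


def frequency_select(runs: Sequence[Dict[str, float]], k: int) -> List[str]:
    """Keep the k features that appear most often in the per-run top-k lists.

    Assumption: this vote-counting rule is an illustration; the paper's
    actual frequency algorithm may differ.
    """
    counts: Counter = Counter()
    for importances in runs:
        counts.update(top_k(importances, k))
    # Sort by descending frequency, then alphabetically for stable output.
    ranked = sorted(counts, key=lambda name: (-counts[name], name))
    return ranked[:k]


# Hypothetical importance scores from three runs with different seeds.
runs = [
    {"f1": 0.40, "f2": 0.35, "f3": 0.15, "f4": 0.10},
    {"f1": 0.38, "f2": 0.12, "f3": 0.30, "f4": 0.20},
    {"f1": 0.42, "f2": 0.33, "f3": 0.05, "f4": 0.20},
]
selected = frequency_select(runs, k=2)
print(selected)
```

In practice the per-run scores could come from the impurity-based `feature_importances_` attribute of a tree ensemble such as scikit-learn's `RandomForestClassifier`, and the selected columns would then be passed to an XGBoost classifier; both choices are assumptions here rather than details confirmed by the abstract.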
Authors
杜俊杰
朱永忠
丁根宏
DU Jun-jie; ZHU Yong-zhong; DING Gen-hong (School of Science, Hohai University, Nanjing 211100, China)
Source
《信息技术》
2019, No. 9, pp. 1-4 (4 pages)
Information Technology
Funding
Supported by the Fundamental Research Funds for the Central Universities (JGLX19_030, 2019B80014)