Abstract
Taking data from Rossmann stores in Germany as an example, this paper uses exploratory data analysis and visualization, grounded in the relevant business knowledge, to extract features implicit in the data, and applies the well-performing XGBoost method for rule mining, with good results. To further improve the prediction accuracy and generalization performance of XGBoost, the paper combines feature engineering with an ensemble learning approach: GLMNET and XGBoost models are used to fit the residuals, and the strengths of LM and TSLM in trend and seasonality prediction are exploited. On this basis, it proposes an optimized XGBoost-based combination model for forecasting industry data. Experiments show that the combined model achieves good accuracy and generalization ability.
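The abstract does not spell out how the component models are combined, so the following is only a minimal sketch of the general residual-fitting idea, under stated assumptions: a linear trend plus per-season offsets stands in for the LM/TSLM stage, and a small hand-written boosted regression stump stands in for XGBoost (GLMNET is omitted). The data are synthetic, the weekly period and the promotion flag are assumptions, and all function names are hypothetical rather than taken from the paper.

```python
from statistics import mean

PERIOD = 7  # assumed weekly seasonality (not stated in the abstract)

def fit_trend_seasonal(y, period):
    """Stage 1: OLS line on the time index, then mean offset per season slot."""
    t = list(range(len(y)))
    t_bar, y_bar = mean(t), mean(y)
    slope = (sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
             / sum((ti - t_bar) ** 2 for ti in t))
    intercept = y_bar - slope * t_bar
    detrended = [yi - (intercept + slope * ti) for ti, yi in zip(t, y)]
    season = [mean(detrended[s::period]) for s in range(period)]
    return intercept, slope, season

def predict_trend_seasonal(model, t, period):
    intercept, slope, season = model
    return intercept + slope * t + season[t % period]

def fit_stump(x, r):
    """Best single-threshold split on one feature, by squared error."""
    best = None
    for thr in sorted(set(x))[:-1]:
        left = [ri for xi, ri in zip(x, r) if xi <= thr]
        right = [ri for xi, ri in zip(x, r) if xi > thr]
        lm, rm = mean(left), mean(right)
        sse = (sum((ri - lm) ** 2 for ri in left)
               + sum((ri - rm) ** 2 for ri in right))
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    return best[1:]

def fit_boosted_residuals(x, r, rounds=20, lr=0.5):
    """Stage 2: boosted stumps on the residuals (a stand-in for XGBoost)."""
    r = list(r)
    stumps = []
    for _ in range(rounds):
        thr, lm, rm = fit_stump(x, r)
        stumps.append((thr, lr * lm, lr * rm))
        # Subtract this round's shrunken prediction from the working residuals.
        r = [ri - (lr * lm if xi <= thr else lr * rm) for xi, ri in zip(x, r)]
    return stumps

def predict_boost(stumps, xi):
    return sum(left if xi <= thr else right for thr, left, right in stumps)

def rmse(actual, predicted):
    errs = [a - p for a, p in zip(actual, predicted)]
    return (sum(e * e for e in errs) / len(errs)) ** 0.5

# Synthetic daily series: trend + weekly pattern + a promotion effect.
n = 70
weekly = [0, 5, 3, -2, -4, 8, -10]
promo = [1 if t % 5 == 0 else 0 for t in range(n)]
sales = [100 + 2 * t + weekly[t % PERIOD] + 30 * promo[t] for t in range(n)]

model = fit_trend_seasonal(sales, PERIOD)
baseline = [predict_trend_seasonal(model, t, PERIOD) for t in range(n)]
residuals = [s - b for s, b in zip(sales, baseline)]
stumps = fit_boosted_residuals(promo, residuals)
combined = [b + predict_boost(stumps, p) for b, p in zip(baseline, promo)]

rmse_baseline = rmse(sales, baseline)
rmse_combined = rmse(sales, combined)
print("baseline RMSE:", rmse_baseline)
print("combined RMSE:", rmse_combined)
```

The comparison here is in-sample only. Assessing generalization, as the abstract claims for the proposed model, would require a held-out period.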
Source
Journal of Nanchang University (Natural Science) (《南昌大学学报(理科版)》)
CAS
PKU Core Journal (北大核心)
2017, No. 3, pp. 275-281 (7 pages)
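Funding
National Natural Science Foundation of China (61262047)
Science and Technology Project of the Jiangxi Provincial Department of Education (GJJ14141)
Jiangxi Provincial Key Research and Development Program (2017BBE50063)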