Abstract
With the rapid development of China's economy, the credit card business has grown swiftly, and the risk of credit card delinquency faced by financial institutions is increasing. This paper uses a Stacking ensemble as the final early-warning model to predict the probability that a credit card user defaults in the following month. First, data preprocessing and feature engineering are applied to clean the dataset and select features, and the data are then balanced. Several machine learning algorithms are trained and optimized, with five-fold cross-validation and grid search used to determine the best parameter combination for each model. Four evaluation metrics are introduced to compare the performance of the classification models. Comparing the metric values shows that the Random Forest, AdaBoost, XGBoost, and LightGBM algorithms give the best predictions. AdaBoost, XGBoost, and LightGBM are therefore used as the base models, and Random Forest as the meta-model, of a two-layer Stacking ensemble. The results show that the Stacking ensemble classifies better than any single classification model.
Authors
何道江
母远缘
HE Daojiang; MU Yuanyuan (School of Mathematics and Statistics, Anhui Normal University, Wuhu, Anhui 241002, China)
Source
《数学建模及其应用》
2025, No. 4, pp. 49-56 (8 pages)
Mathematical Modeling and Its Applications
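Funding
Major Teaching Research Project of the Anhui Provincial Quality Engineering Program for Higher Education Institutions (2023jyxm0151).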