Abstract
Taking steel tube confined concrete (STCC) short columns as the research background, this paper focuses on the selection and preprocessing of data and features, the visual application of models, and feature importance analysis, in order to explore the prediction process behind the machine learning "black box". A dataset of 154 circular STCC short columns was used to train models and predict the ultimate bearing capacity N_u. The correlation and redundancy of 9 features common in STCC short column structures were discussed. From 13 machine learning models, the four best performers, namely the gradient boosting decision tree (GBDT), Random Forest, extreme gradient boosting (XGBoost), and Extra Trees, were selected to predict the ultimate axial bearing capacity N_u of STCC short columns, and the SHAP interpretability method was used to compare the four models visually. The results show that the variance of the cross-sectional steel ratio α tends to zero in the statistical analysis and that α is perfectly negatively correlated with the diameter-to-thickness ratio B/t. The confinement effect coefficient ζ has a significance level below 5% with respect to N_u in the F-test, while the Spearman, Pearson, and mutual information analyses all indicate that it is only weakly correlated with N_u. Visualizing the four models with SHAP shows that XGBoost performs particularly well on the test set: its coefficient of determination R^2 (0.9626), root mean square error (287.40 kN), mean absolute error (139.13 kN), and mean absolute percentage error (5.1%) are the best values among the four models. In addition, XGBoost also performs well in generalization and in avoiding overfitting, so it is better suited to predicting the axial compression capacity of STCC short columns.
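The four test-set metrics quoted in the abstract (R^2, RMSE, MAE, MAPE) follow standard definitions. The sketch below shows how they are typically computed. It is not the authors' code: the function name `regression_metrics` and the four sample capacities are made up for illustration and are not data from the paper.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE, MAE and MAPE (%) for paired observations and predictions."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,                     # higher is better
        "RMSE": math.sqrt(ss_res / n),                   # lower is better, same unit as y
        "MAE": sum(abs(r) for r in residuals) / n,       # lower is better, same unit as y
        "MAPE": 100.0 * sum(abs(r / t) for r, t in zip(residuals, y_true)) / n,
    }

# Hypothetical bearing capacities in kN (illustrative values only).
y_true = [1000.0, 2000.0, 3000.0, 4000.0]
y_pred = [1100.0, 1900.0, 3000.0, 4200.0]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

Note that R^2 improves as it rises toward 1, whereas the three error measures improve as they fall, so a model that is best on all four has the highest R^2 and the lowest errors.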
Authors
WEI Jiangang (韦建刚)
WU Xunzhen (吴洵桢)
ZHENG Yi (郑裔)
YANG Yan (杨艳)
Affiliations: College of Civil Engineering, Fuzhou University, Fuzhou 350116, China; College of Civil Engineering, Fujian University of Technology, Fuzhou 350118, China; Zhicheng College, Fuzhou University, Fuzhou 350002, China
Source
Journal of Southeast University (Natural Science Edition) (《东南大学学报(自然科学版)》)
Peking University Core Journal (北大核心)
2025, No. 5, pp. 1328-1336 (9 pages)
Funding
National Natural Science Foundation of China (52278158)
University-Industry-Research Joint Innovation Project of Fujian Province (2022H6009)
Keywords
machine learning
feature engineering
SHAP interpretation method
circular steel tube confined concrete
axial compression capacity
feature importance analysis