Development and validation of a cervical cancer prognostic model based on explainable machine learning
Abstract
Objective: To construct a prognostic model of overall survival in patients with cervical cancer using machine learning.
Methods: A total of 336 patients with cervical cancer who underwent inpatient surgery in the Department of Obstetrics and Gynecology, Yangming Hospital Affiliated to Ningbo University, between January 2009 and December 2014 were enrolled and divided into a training set (n=268) and a test set (n=68) at an 8:2 ratio. Six machine learning algorithms were used to develop prognostic models: least absolute shrinkage and selection operator (LASSO), decision tree (DT), random forest (RF), extreme gradient boosting (XGBoost), support vector machine (SVM), and neural network (NN). The models were evaluated and compared using the concordance index (C-index), receiver operating characteristic (ROC) curves, calibration curves, and decision curve analysis (DCA), followed by external validation. SHAP (Shapley Additive Explanations) plots were used to show the contribution of each factor to the model's predictions.
Results: The Boruta algorithm selected eight of 15 candidate factors: age, tumor size, stage, differentiation, depth of cervical stromal invasion, lymph node metastasis, radiotherapy, and chemotherapy. Among the six machine learning models, XGBoost showed better predictive performance and clinical utility than the others. In the training set, the C-index was 0.936 (95% CI: 0.887-0.980), and the areas under the ROC curve (AUCs) at 3, 5, and 10 years were 0.935, 0.963, and 0.980, respectively. In the test set, the C-index was 0.918 (95% CI: 0.893-0.940), with 3-, 5-, and 10-year AUCs of 0.937, 0.946, and 0.936, respectively. In the external validation set, the C-index was 0.912 (95% CI: 0.881-0.938), with 3-, 5-, and 10-year AUCs of 0.948, 0.947, and 0.936, respectively.
Conclusion: The XGBoost model showed good predictive ability and may provide better guidance for prognostic follow-up and treatment decision-making in patients with cervical cancer.
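The C-index reported above measures how often a model ranks patients correctly: among pairs of patients whose survival order can be determined, it is the fraction in which the patient who died earlier was given the higher predicted risk. The following is a minimal Python sketch of Harrell's C-index, not the authors' code; the abstract does not state which software was used. The function name `harrell_c_index` and the toy data are hypothetical, and pairs with tied follow-up times are simply skipped, which is a simplification.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times       -- observed follow-up time for each patient
    events      -- 1 if death was observed, 0 if the patient was censored
    risk_scores -- model output; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter time is an observed
        # death; tied times are skipped in this simplified sketch.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier death had the higher risk score
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical, not taken from the study).
toy_times = [12, 30, 45, 60, 80]          # months of follow-up
toy_events = [1, 1, 0, 1, 0]              # 1 = death, 0 = censored
toy_risk = [0.9, 0.3, 0.2, 0.4, 0.1]      # predicted risk scores

toy_c = harrell_c_index(toy_times, toy_events, toy_risk)
print(toy_c)  # 7 of 8 comparable pairs are concordant -> 0.875
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so C-index values above 0.9, as reported for XGBoost in all three sets, indicate that nearly all comparable patient pairs were ordered correctly.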
Authors SHI Yao (施姚); WANG Lanying (王兰英); WANG Qiqin (汪棋秦); XU Jia'nan (徐佳楠); WANG Siyuan (王思远) (Department of Obstetrics and Gynecology, Yangming Hospital Affiliated to Ningbo University, Yuyao, Zhejiang 315400, China)
Source Chinese Journal of Birth Health & Heredity (《中国优生与遗传杂志》), 2025, No. 10, pp. 2183-2191 (9 pages)
Funding Key Project of the Yuyao Science and Technology Bureau (2023YZD01).
Keywords cervical cancer; receiver operating characteristic (ROC) curve; calibration curve; decision curve analysis (DCA); SHAP plots