Development and validation of a machine learning-based explainable prediction model for the outcome of patients with spontaneous intracerebral hemorrhage
Abstract Objectives: To evaluate the predictive value of the Tabular Prior-data Fitted Network (TabPFN) for short-term outcome in patients with spontaneous intracerebral hemorrhage (sICH), and to compare it with the eXtreme Gradient Boosting (XGBoost) model and the traditional logistic regression (LR) model.
Methods: Patients with sICH admitted to the Department of Neurology, Hefei Second People's Hospital, from January 2018 to March 2024 were included retrospectively, and their demographic and baseline data were collected. Outcome was assessed at 3 months after onset using the modified Rankin Scale (mRS): a score of 0-2 was defined as good outcome and a score >2 as poor outcome. All enrolled patients were randomly divided into a training set and a testing set at a ratio of 7:3. Features were selected using recursive feature elimination (RFE), and the selected feature variables were then entered into the TabPFN, XGBoost, and LR models for training and testing. The area under the receiver operating characteristic (ROC) curve (AUC) was used to evaluate the predictive ability of the models. The Shapley additive explanations (SHAP) method was used for model interpretation.
Results: A total of 547 patients with sICH were enrolled, including 367 males (67.1%), with a median age of 65 years (interquartile range, 54-76 years); 226 patients (41.3%) had a poor outcome. RFE selected the following feature variables: age, baseline blood pressure (systolic and diastolic), baseline laboratory tests (white blood cell count, red blood cell count, platelet count, neutrophil count, hemoglobin, fasting blood glucose, creatinine, uric acid, urea nitrogen, alanine aminotransferase, aspartate aminotransferase), intraventricular extension of the hematoma, island sign, baseline hematoma volume, and baseline National Institutes of Health Stroke Scale (NIHSS) score. ROC curve analysis showed that the AUCs of the TabPFN, XGBoost, and LR models for predicting poor short-term outcome in the testing set were 0.918 (95% confidence interval [CI] 0.870-0.966), 0.883 (95% CI 0.826-0.940), and 0.905 (95% CI 0.854-0.957), respectively. SHAP analysis showed that the four most important variables in the TabPFN model were baseline NIHSS score, baseline hematoma volume, baseline aspartate aminotransferase, and age.
Conclusions: The TabPFN model is superior to the LR and XGBoost models in predicting poor outcome in patients with sICH. In the TabPFN model, baseline NIHSS score, baseline hematoma volume, aspartate aminotransferase, and age are the most important predictors of poor outcome in patients with sICH.
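
The evaluation steps described in the abstract (dichotomizing the mRS score, a 7:3 random split, and an AUC with a 95% CI) can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names, the random seed, the toy data, and the use of a percentile bootstrap for the interval are all assumptions made for illustration, since the abstract does not say how the confidence intervals were computed.

```python
import random


def poor_outcome(mrs):
    """Dichotomize a 3-month mRS score: 0-2 is good (0), >2 is poor (1)."""
    return 1 if mrs > 2 else 0


def split_70_30(rows, seed=0):
    """Shuffle a copy of rows and split it roughly 7:3 (hypothetical helper)."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method, see lead-in)."""
    if len(set(labels)) < 2:
        raise ValueError("AUC needs at least one case of each class")
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacked one class; draw again
        aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_boot)]
    upper = aucs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    # Toy data only: made-up mRS scores and made-up predicted risks.
    mrs_scores = [0, 1, 4, 2, 5, 3, 1, 6, 2, 4]
    risks = [0.10, 0.20, 0.70, 0.40, 0.90, 0.35, 0.15, 0.95, 0.30, 0.60]
    labels = [poor_outcome(m) for m in mrs_scores]
    print("AUC:", roc_auc(labels, risks))
    print("95% CI:", bootstrap_ci(labels, risks))
```

In a real analysis the predicted risks would come from a model fitted on the training portion only, and the AUC and its interval would be computed on the held-out testing portion.
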
Authors Yue Hong; Geng Zhi; Yu Zhaoping; Zhang Chi; Liu Xuechun; Wu Juncang; Wu Aimei (Department of Neurology, the Affiliated Hefei Hospital of Anhui Medical University (Hefei Second People's Hospital), Hefei 230011, China; Department of Neurology, the First Affiliated Hospital of Anhui Medical University, Hefei 230022, China)
Source International Journal of Cerebrovascular Diseases, 2025, Issue 6, pp. 420-428 (9 pages)
Funding 2024 Bengbu Medical University Natural Science Key Project (2024byzd396); 2022 Hefei Second People's Hospital hospital-level research project (2022yyb005).
Keywords Cerebral hemorrhage; Treatment outcome; Machine learning; Tomography, X-ray computed; Risk factors