Study on predicting the risk of bronchial asthma onset based on a Bayesian network model (cited by: 1)

Bayesian network model analysis based on risk prediction of bronchial asthma
Abstract Objective: To identify key variables affecting the onset of bronchial asthma (hereafter, asthma) by jointly applying categorical boosting (CatBoost) and the random forest algorithm, and to construct a Bayesian network model from the variables selected by both methods, providing a reference for asthma risk prediction and causal inference. Methods: Data from 72105 eligible participants in the UK Biobank (UKB) prospective cohort were used. Variable importance was ranked using CatBoost with SHapley Additive exPlanations (SHAP) interpretation and using random forest. Variables common to the top 15 of both rankings were selected, and a Bayesian network model of asthma onset was constructed using the max-min hill-climbing (MMHC) algorithm. Model performance was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC) and accuracy. Results: During follow-up, 8532 participants developed new-onset asthma, with a median follow-up time of 7.1 (4.0, 10.3) years. The asthma prevalence in the study population was 11.8%. CatBoost and random forest jointly identified 12 variables: gender, age, education level, BMI, smoking, hay fever, eosinophil count, respiratory infection, chronic obstructive pulmonary disease (COPD), allergic rhinitis, bronchitis, and environmental dust exposure. Bayesian network analysis showed that eosinophil count, hay fever, respiratory infection, COPD, allergic rhinitis, bronchitis, and environmental dust exposure were directly associated with asthma onset, whereas gender, age, education level, BMI, and smoking status affected asthma risk only indirectly. The CatBoost model achieved an AUC of 0.730 and an accuracy of 0.884 on the training set, and an AUC of 0.710 and an accuracy of 0.883 on the test set. The reduced random forest model achieved an AUC of 0.720 and an accuracy of 0.888. Conclusions: Preventing and managing allergic diseases (hay fever, allergic rhinitis) and respiratory diseases (respiratory infection, COPD, bronchitis), controlling eosinophil levels, and avoiding environmental dust exposure may reduce the risk of asthma onset. The Bayesian network model can be used to predict asthma onset risk.
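The variable selection step described in the abstract (keeping only variables that appear in the top-ranked set of both algorithms) can be illustrated with a minimal sketch. This is not the authors' code: the function names `top_k` and `common_top_k` and all importance scores below are hypothetical, chosen only to show the intersection logic. The last lines check that the reported 11.8% is consistent with 8532 cases among 72105 participants.

```python
def top_k(importances, k):
    """Return the k variable names with the largest importance scores."""
    ranked = sorted(importances, key=importances.get, reverse=True)
    return ranked[:k]


def common_top_k(first, second, k=15):
    """Variables in the top k of both rankings, kept in the order of the first."""
    second_top = set(top_k(second, k))
    return [name for name in top_k(first, k) if name in second_top]


# Hypothetical importance scores, for illustration only (not study results).
shap_importance = {
    "eosinophil_count": 0.30,
    "hay_fever": 0.25,
    "age": 0.20,
    "bmi": 0.10,
    "income": 0.05,
}
rf_importance = {
    "hay_fever": 0.28,
    "eosinophil_count": 0.22,
    "bmi": 0.18,
    "sex": 0.12,
    "age": 0.02,
}

# k=3 here only because the toy rankings are short; the abstract uses the top 15.
common = common_top_k(shap_importance, rf_importance, k=3)
print(common)

# Arithmetic check on the reported figures: 8532 cases among 72105 participants.
proportion = 8532 / 72105
print(f"{proportion:.1%}")
```

With real data, the two dictionaries would hold mean absolute SHAP values from the CatBoost model and importance scores from the random forest, and the resulting common variables would be passed on to the Bayesian network structure learning step.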
Authors: YANG Xinyu (杨欣妤); WANG Yanxia (王艳霞); MA Yonghua (马永华); SONG Wangchen (宋旺辰); GUO Guiya (郭贵雅); WANG Aimin (王爱民); KONG Yujia (孔雨佳); WANG Suzhen (王素珍); SHI Fuyan (石福艳) (Department of Health Statistics, School of Public Health, Shandong Second Medical University, Weifang 261053, China)
Source: Chinese Journal of Disease Control & Prevention (《中华疾病控制杂志》), Peking University Core Journal, 2025, Issue 10, pp. 1187-1197, 1205 (12 pages in total)
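Funding: National Natural Science Foundation of China (81803337, 81872719, 82003560); Natural Science Foundation of Shandong Province (ZR2023MH313); Shandong Province Higher Education Young Innovative Talent Introduction and Cultivation Program (No. 2019-6-156, Lu-Jiao); Weifang Science and Technology Development Plan (Medical) Project (2024YX042).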
Keywords: Bronchial asthma; CatBoost algorithm; Random forest; Bayesian network; Risk prediction