Abstract
Objective: To systematically analyze the core factors influencing medical consultation delay among tuberculosis (TB) patients in Huai'an City and to construct a high-precision prediction model for such delay, thereby providing epidemiological evidence for optimizing TB prevention and control strategies in the agricultural areas of northern Jiangsu Province.
Methods: A total of 17169 patients registered in the Tuberculosis Management Information System of Huai'an City from 2015 to 2024 were included. Chi-square tests and t-tests were used to compare baseline and clinical characteristics. Multivariate logistic regression was applied to identify the independent factors influencing medical consultation delay. In addition, six machine learning algorithms, including random forest (RF), adaptive boosting classification (AdaBoost), and gradient boosting decision tree (GBDT), were used to construct prediction models. Model performance was evaluated with indicators such as accuracy and area under the curve (AUC).
Results: The medical consultation delay rate among TB patients in Huai'an City was 67.0% (11496/17169). Multivariate logistic regression showed that actual medication management method (family member management vs. medical staff management: odds ratio (OR)=1.598, 95%CI: 1.159-2.203, P=0.004), treatment mode (outpatient treatment vs. non-outpatient treatment: OR=4.129, 95%CI: 1.227-13.888, P=0.022), and symptom pattern (hemoptysis-related: OR=0.172, P<0.001; asymptomatic: OR=0.142, P<0.001) were independent factors influencing medical consultation delay, whereas demographic characteristics such as age and sex had no statistically significant effect. Among the machine learning models, GBDT showed the best overall performance, with an accuracy of 0.948 and an AUC of 0.993, significantly outperforming the other algorithms.
Conclusion: Actual medication management method, treatment mode, and symptom pattern are key factors influencing the timing of medical consultation among TB patients in Huai'an City. This study constructed six machine learning models; the GBDT-based prediction model in particular enables accurate assessment of the risk of medical consultation delay and provides technical support for precise TB prevention and control in the region.
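The reported odds ratios, confidence intervals, and P values can be cross-checked with a few lines of arithmetic. The sketch below is not the authors' code. It assumes the intervals are Wald intervals computed on the log-odds scale, which the abstract does not state, and the helper name `wald_from_ci` is hypothetical. Under that assumption it recovers the coefficient, standard error, z statistic, and two-sided P value from each published OR and 95%CI, and recomputes the delay rate from the reported counts.

```python
import math
from statistics import NormalDist


def wald_from_ci(odds_ratio, ci_low, ci_high, level=0.95):
    """Recover beta, SE, Wald z, and two-sided p from a reported OR and CI.

    Assumes (not stated in the abstract) a Wald interval on the log-odds
    scale: exp(beta +/- z_crit * SE).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    beta = math.log(odds_ratio)
    # The interval is symmetric on the log scale, so its width gives the SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    p = 2 * (1 - std_normal.cdf(abs(z)))
    return beta, se, z, p


# Values copied from the abstract: (OR, lower 95% limit, upper 95% limit).
reported = {
    "family member vs. medical staff management": (1.598, 1.159, 2.203),
    "outpatient vs. non-outpatient treatment": (4.129, 1.227, 13.888),
}

for name, (or_value, low, high) in reported.items():
    beta, se, z, p = wald_from_ci(or_value, low, high)
    print(f"{name}: beta={beta:.3f}, SE={se:.3f}, z={z:.2f}, p={p:.3f}")

# Delay rate recomputed from the reported counts.
delay_rate = 11496 / 17169
print(f"delay rate = {delay_rate:.1%}")
```

If the recovered P values agree with the published ones to the reported precision, the OR, interval, and P value for that factor are mutually consistent under the Wald assumption.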
Authors
夏文玲
高强
缪巧玉
王玮明
丁守华
刘家松
XIA Wenling; GAO Qiang; MIAO Qiaoyu; WANG Weiming; DING Shouhua; LIU Jiasong (Huai'an Center for Disease Control and Prevention, Huai'an, Jiangsu 223001, China; Huaiyin Normal University, Huai'an, Jiangsu 223001, China)
Source
China Tropical Medicine (《中国热带医学》)
Peking University Core Journal (北大核心)
2026, Issue 3, pp. 336-342, 364 (8 pages in total)
Funding
National Natural Science Foundation of China (Grant No. 12571531).
Keywords
Tuberculosis
medical delay
influencing factors
prediction model
machine learning
logistic regression