Abstract
Objective: To construct risk prediction models for self-injury among left-behind and non-left-behind middle school students, and to provide a scientific basis for targeted interventions.
Methods: Between September 2021 and June 2023, a multistage sampling method was used to select 14,623 middle school students under the age of 18 (8,471 left-behind and 6,152 non-left-behind) from six provinces in China where left-behind children are relatively concentrated. Questionnaires collected data on general characteristics, traumatic events, and self-injury. The occurrence of self-injury was analyzed across student characteristics in the left-behind and non-left-behind groups. Using R software (version 4.3.0), the left-behind and non-left-behind students were each randomly divided into a training set and a test set at a ratio of 7:3, and logistic regression and random forest models were constructed. Model performance was evaluated using the receiver operating characteristic curve, sensitivity, specificity, and other metrics.
Results: The overall prevalence of self-injury was 25.7%, and the rate was significantly higher among left-behind students than among non-left-behind students (χ² = 59.266, P < 0.001). In the logistic regression models, the area under the curve (AUC) in the training set was 0.745 for left-behind students and 0.756 for non-left-behind students, and the test set AUC values were 0.721 and 0.726, respectively. The Hosmer-Lemeshow goodness-of-fit test indicated good model calibration (P > 0.05). In the random forest models, the main predictors of self-injury among left-behind students included exposure to traumatic events, family atmosphere, and the relationship with the father or mother. This model achieved a sensitivity of 0.740, specificity of 0.591, positive predictive value (PPV) of 0.822, negative predictive value (NPV) of 0.470, F1 score of 0.779, and Brier score of 0.212, with AUC values of 0.800 in the training set and 0.729 in the test set. For non-left-behind students, the main predictors were exposure to traumatic events, family atmosphere, and the status of the parents' relationship. This model showed a sensitivity of 0.785, specificity of 0.519, PPV of 0.850, NPV of 0.411, F1 score of 0.816, and Brier score of 0.188, with AUC values of 0.845 in the training set and 0.724 in the test set.
Conclusions: Left-behind middle school students are at higher risk of self-injury than their non-left-behind peers. Although the predictive factors differed somewhat between the two groups, they overlapped considerably, with traumatic experiences and family-related factors identified as key predictors. Both models showed acceptable discriminative ability for self-injury, and the random forest model showed better overall performance. The prediction models developed in this study can provide a scientific basis for the early identification of high-risk individuals.
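The evaluation steps named in the abstract (a random 7:3 split, sensitivity, specificity, PPV, NPV, F1 score, Brier score, and AUC) can be illustrated with a minimal sketch. The study used R 4.3.0; the Python below is a swapped-in illustration, not the authors' code, and every function name in it is hypothetical. The last lines recompute the F1 score from the PPV and sensitivity reported in the abstract, as a simple consistency check.

```python
import random

def split_70_30(records, seed=0):
    """Randomly split records into a 70% training set and a 30% test set.
    The seed and rounding rule are assumptions; the abstract gives only the ratio."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

def confusion_metrics(tp, fp, fn, tn):
    """Standard metrics from the four cells of a 2x2 confusion matrix."""
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    ppv = tp / (tp + fp)           # positive predictive value (precision)
    npv = tn / (tn + fn)           # negative predictive value
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1_from(ppv, sensitivity),
    }

def f1_from(ppv, sensitivity):
    """F1 score: harmonic mean of PPV (precision) and sensitivity (recall)."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def auc(y_true, y_prob):
    """AUC as the share of positive/negative pairs in which the positive case
    receives the higher predicted probability (ties count as one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

# Consistency check on values reported in the abstract (PPV, sensitivity).
f1_left_behind = f1_from(0.822, 0.740)
f1_non_left_behind = f1_from(0.850, 0.785)
print(round(f1_left_behind, 3), round(f1_non_left_behind, 3))
```

Both recomputed values round to the F1 scores given in the abstract (0.779 and 0.816). The pairwise AUC used here is quadratic in sample size and is meant only to make the definition concrete; a rank-based implementation would be preferable for samples of the size reported.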
Authors
CAI Ming (蔡铭)
ZENG Xiaoduo (曾小朵)
WU Xian (吴纤)
XIANG Bing (向兵)
YANG Mei (杨梅)
XIE Xinyan (谢新艳)
ZENG Jing (曾婧)
Affiliations: Research Center for Health Promotion in Women, Youth and Children, Hubei Provincial Key Laboratory of Occupational Hazard Identification and Control, School of Public Health, Wuhan University of Science and Technology, Wuhan 430065, China; Key Research Base of Humanities and Social Sciences in Hubei Colleges and Universities of "Healthy Hubei Construction and Social Development", Wuhan University of Science and Technology, Wuhan 430065, China
Source
Chinese Journal of Disease Control & Prevention (《中华疾病控制杂志》)
Peking University Core Journal (北大核心)
2026, Issue 3, pp. 294-302 (9 pages)
Funding
National Social Science Fund of China (20BSH066).
Keywords
Left-behind middle school students
Self-injury
Prediction model
Logistic regression
Random forest