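Objective: To develop a colorectal cancer prediction model based on routine blood test data, with the aim of increasing participation in colorectal cancer screening, lowering screening costs, and improving screening accuracy. Methods: A total of 4554 patients were enrolled and, according to colonoscopy and pathology findings, divided into a normal control group (1899 cases), a polyp group (2259 cases), and a colorectal cancer group (396 cases). Each group was randomly split into a training set and a validation set at a 7:3 ratio, and a gradient boosting tree machine learning algorithm was used to build four prediction models: cancer vs non-cancer, cancer vs normal control, cancer vs polyp, and polyp vs normal control. The models were evaluated by the area under the curve (AUC). Results: With AUC as the measure of model accuracy, the four gradient boosting tree models achieved validation-set AUC values of 0.891, 0.921, 0.871, and 0.772, respectively; all three colorectal cancer prediction models had an AUC > 0.80. Conclusion: The colorectal cancer prediction models developed in this study can distinguish cancer from non-cancer individuals in the validation dataset; their generalizability still needs to be confirmed on external independent datasets.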
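The per-group 7:3 split and the AUC metric described in this abstract can be illustrated with a minimal standard-library sketch. It is not the authors' code: the function names `stratified_split` and `auc`, the seed, and the placeholder labels are assumptions made for illustration, and the gradient boosting model itself is not reproduced here. Only the group sizes come from the abstract.

```python
import random


def stratified_split(labels, train_frac=0.7, seed=0):
    """Split indices into training/validation sets separately within each group."""
    rng = random.Random(seed)
    by_group = {}
    for idx, label in enumerate(labels):
        by_group.setdefault(label, []).append(idx)
    train, valid = [], []
    for indices in by_group.values():
        rng.shuffle(indices)
        cut = round(len(indices) * train_frac)  # 7:3 cut inside this group
        train.extend(indices[:cut])
        valid.extend(indices[cut:])
    return sorted(train), sorted(valid)


def auc(y_true, scores):
    """AUC as the chance that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Group sizes are taken from the abstract; the labels are placeholders, not patient data.
labels = ["control"] * 1899 + ["polyp"] * 2259 + ["cancer"] * 396
train_idx, valid_idx = stratified_split(labels)
print(len(train_idx), len(valid_idx))
```

Splitting within each group keeps the proportion of controls, polyps, and cancers the same in both sets, which matters here because the cancer group is much smaller than the other two.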
In this paper we analyse the temporal variation of CD4 cell counts in HIV-infected individuals under antiretroviral therapy using statistical methods. The analysis relies on a recursive binary regression tree approach [1]-[2]. This approach highlights the existence of several segments of the population of interest, each described by interactions between the covariates that predict response to the treatment regimen.
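To make the idea of recursive binary partitioning concrete, here is a minimal sketch of a regression tree grown by minimizing the within-node sum of squared errors. It is a generic illustration, not the method of [1]-[2]: the function names, the stopping rule (`max_depth`), and the toy rows (a hypothetical baseline CD4 count and months on therapy, with an invented CD4 change as the response) are all assumptions.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows, targets):
    """Return the (feature, threshold) that most reduces total SSE, or None."""
    best, best_cost = None, sse(targets)
    for j in range(len(rows[0])):
        # Dropping the largest value guarantees both children are non-empty.
        for threshold in sorted({r[j] for r in rows})[:-1]:
            left = [t for r, t in zip(rows, targets) if r[j] <= threshold]
            right = [t for r, t in zip(rows, targets) if r[j] > threshold]
            cost = sse(left) + sse(right)
            if cost < best_cost:
                best, best_cost = (j, threshold), cost
    return best


def grow(rows, targets, depth=0, max_depth=1):
    """Recursively split the data; each leaf is one segment of the population."""
    split = best_split(rows, targets) if depth < max_depth else None
    if split is None:
        return {"mean": sum(targets) / len(targets), "n": len(targets)}
    j, thr = split
    left = [(r, t) for r, t in zip(rows, targets) if r[j] <= thr]
    right = [(r, t) for r, t in zip(rows, targets) if r[j] > thr]
    return {
        "feature": j,
        "threshold": thr,
        "left": grow([r for r, _ in left], [t for _, t in left], depth + 1, max_depth),
        "right": grow([r for r, _ in right], [t for _, t in right], depth + 1, max_depth),
    }


# Invented toy data: (baseline CD4, months on therapy) -> change in CD4 count.
rows = [(200, 6), (220, 12), (210, 24), (450, 6), (470, 12), (480, 24)]
targets = [50, 60, 55, 150, 160, 155]
tree = grow(rows, targets)
print(tree)
```

Each path from the root to a leaf is a conjunction of threshold conditions on the covariates, which is how a tree of this kind describes population segments in terms of covariate interactions.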