Abstract: Automated grading of dandruff severity is a clinically significant but challenging task because severity levels are inherently ordinal and subjective expert annotations introduce a high prevalence of label noise. Standard classification methods fail to address these dual challenges, limiting their real-world performance. In this paper, a novel three-phase training framework is proposed that learns a robust ordinal classifier directly from noisy labels. The approach combines a rank-based ordinal regression backbone with a cooperative, semi-supervised learning strategy to dynamically partition the data into clean and noisy subsets. A hybrid training objective is then employed, applying a supervised ordinal loss to the clean set. The noisy set is simultaneously trained using a dual objective that combines a semi-supervised ordinal loss with a parallel, label-agnostic contrastive loss. This design allows the model to learn from the entire noisy subset while using contrastive learning to mitigate the risk of error propagation from potentially corrupt supervision. Extensive experiments on a new, large-scale, multi-site clinical dataset validate the approach. The method achieves state-of-the-art performance with 80.71% accuracy and a 76.86% F1-score, significantly outperforming existing approaches, including a 2.26% improvement over the strongest baseline method. This work provides not only a robust solution for a practical medical imaging problem but also a generalizable framework for other tasks plagued by noisy ordinal labels.
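The abstract does not spell out how the rank-based ordinal backbone encodes severity levels. As a rough illustration only, one common formulation turns a grade among K levels into K-1 binary "is the grade above threshold k?" targets and decodes a prediction by counting the thresholds judged exceeded. The function names, the four severity levels, and the 0.5 cutoff below are assumptions made for this sketch, not details taken from the paper.

```python
def encode_ordinal(grade, num_levels):
    """Encode an integer grade in 0..num_levels-1 as num_levels-1 binary targets.

    Target k is 1 when the grade is strictly above threshold k, so higher
    grades switch on more targets and the ordering of grades is preserved.
    """
    return [1 if grade > k else 0 for k in range(num_levels - 1)]


def decode_ordinal(probs, cutoff=0.5):
    """Decode per-threshold probabilities into a grade by counting the
    thresholds whose probability exceeds the cutoff."""
    return sum(1 for p in probs if p > cutoff)


NUM_LEVELS = 4  # hypothetical number of severity levels

# Grades 0, 2 and 3 become cumulative binary target vectors.
targets = [encode_ordinal(g, NUM_LEVELS) for g in [0, 2, 3]]

# Two thresholds are judged exceeded, so the decoded grade is 2.
predicted = decode_ordinal([0.9, 0.7, 0.2])
```

With this encoding, mistaking grade 3 for grade 2 flips one binary target while mistaking it for grade 0 flips three, which is the kind of distance-aware penalty a plain softmax classifier does not provide.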
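Abstract: Objective: To build, using machine learning algorithms, a diagnostic model for Talaromyces marneffei (TM) infection in patients with human immunodeficiency virus/acquired immunodeficiency syndrome (HIV/AIDS), in order to support early diagnosis and improve diagnostic sensitivity. Methods: Laboratory data were retrospectively collected from 201 Mp1p antigen-positive HIV/AIDS patients treated at Chongqing Public Health Medical Center between January 2020 and September 2023; screening identified 91 patients with confirmed TM infection (TM group) and 110 without TM infection (non-TM group). Statistical analysis was used to identify indicators that differed between the two groups, and logistic regression, random forest classifier, and decision tree classifier models were built from them. Least absolute shrinkage and selection operator (Lasso) regression was then used to select among the differing variables, and Lasso-logistic regression, random forest, and decision tree classifier models were built. The high-contribution features selected by each model were analyzed, and diagnostic performance was evaluated using accuracy, precision, the receiver operating characteristic (ROC) curve, and the area under the curve (AUC). Results: Comparison of test indicators between the TM and non-TM groups led to the removal of two indicators with no statistically significant difference, white blood cell count and sex (P>0.05). Eleven indicators with statistically significant differences (P<0.05) were used to build the logistic regression, decision tree, and random forest classification models: red blood cell count, platelet count, hemoglobin content, C-reactive protein, interleukin-6, procalcitonin, age, CD4+ T lymphocyte count, HIV-RNA level, (1-3)-β-D-glucan test, and Aspergillus galactomannan antigen test. Lasso regression further removed CD4+ T lymphocyte count and red blood cell count, and the remaining nine variables were used to build Lasso-logistic regression, decision tree, and random forest classification models. The AUC of each of these models was greater than that of the same type of model built with 11 variables. The random forest classification model (n=9) showed the best diagnostic performance, with an accuracy of 0.797, a precision of 0.794, and an AUC of 0.822 (95% CI 0.719 to 0.924). Conclusion: The feature importance of laboratory test indicators differs across diagnostic models. Selecting variables with Lasso regression before building a model improves diagnostic performance, and the random forest classification model performed best among all models built. Diagnostic models based on machine learning algorithms and clinical laboratory data can help clinicians diagnose TM infection in HIV/AIDS patients early.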
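The workflow described in the Methods (Lasso-based variable selection followed by a random forest, scored with accuracy, precision, and AUC) could be sketched with scikit-learn roughly as follows. This is an illustration under stated assumptions, not the authors' code: the data are synthetic stand-ins produced by `make_classification`, and the train/test split, the cross-validated `LassoCV` selector, and the forest size are choices made for the sketch rather than settings reported in the study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LassoCV
from sklearn.metrics import accuracy_score, precision_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the laboratory data: 201 samples, 11 candidate
# indicators. These are NOT the study's data.
X, y = make_classification(
    n_samples=201, n_features=11, n_informative=6, n_redundant=2, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Standardize, keep the features whose Lasso coefficients are non-zero,
# then fit a random forest on the retained features only.
model = make_pipeline(
    StandardScaler(),
    SelectFromModel(LassoCV(cv=5, random_state=0)),
    RandomForestClassifier(n_estimators=200, random_state=0),
)
model.fit(X_train, y_train)

pred = model.predict(X_test)
score = model.predict_proba(X_test)[:, 1]  # probability of the positive class

n_selected = int(model.named_steps["selectfrommodel"].get_support().sum())
accuracy = accuracy_score(y_test, pred)
precision = precision_score(y_test, pred, zero_division=0)
auc = roc_auc_score(y_test, score)

print(f"features kept: {n_selected}/11")
print(f"accuracy={accuracy:.3f} precision={precision:.3f} AUC={auc:.3f}")
```

Placing the selector inside the pipeline means the Lasso step is fitted on the training split only, which keeps the held-out metrics from being inflated by feature selection that has already seen the test data.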