Abstract
Rhinitis is a common chronic inflammation of the upper respiratory tract that presents with a variety of syndrome types and signs. Clinical rhinitis classification involves many sample types and imbalanced classes. It is a multi-output classification problem in which minority-class instances are often poorly recognized and overall classification accuracy suffers. To address this, the paper proposes a classification algorithm with a heterogeneous ensemble structure: the multi-output rhinitis classification task is converted into multi-label and multi-class classification, and a heterogeneous ensemble classifier is built with ensemble learning algorithms. The method automatically adjusts the number and depth of the base learners in the ensemble forest according to the imbalance ratio of each single class label within a sub-dataset. This reduces the influence of imbalanced samples on classification, improves overall accuracy on both majority and minority classes, and in turn improves the generalization of the ensemble model. In cross-validation experiments on 461 clinical rhinitis cases, the model achieves a sensitivity of 74.9%, a specificity of 86.5%, an accuracy of 92.0%, an F1 score of 0.783, and an AUC of 0.953. Compared with six typical baseline models, it shows better evaluation performance and is better suited to the early clinical diagnosis of rhinitis.
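The abstract does not give the rule that maps a label's imbalance to forest size and depth, so the following is only a minimal sketch of the general idea. The names `imbalance_ratio` and `forest_config`, the scaling formula, the default bounds, and the toy label columns are all assumptions made for illustration. None of them comes from the paper.

```python
from collections import Counter
from math import ceil, log2


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count for one label column."""
    counts = Counter(labels)
    if len(counts) < 2:
        raise ValueError("need at least two classes to measure imbalance")
    return max(counts.values()) / min(counts.values())


def forest_config(labels, base_trees=50, base_depth=4, max_trees=500, max_depth=16):
    """Hypothetical rule: use more and deeper trees as a label gets more imbalanced.

    The linear scaling of tree count and the log2 scaling of depth are
    illustrative assumptions, not the rule used in the paper.
    """
    ir = imbalance_ratio(labels)
    n_trees = min(max_trees, ceil(base_trees * ir))
    depth = min(max_depth, base_depth + ceil(log2(ir)))
    return {"imbalance_ratio": ir, "n_estimators": n_trees, "max_depth": depth}


# Toy multi-output targets (made-up data): each column is treated as its own
# sub-problem, either binary (multi-label style) or multi-class.
targets = {
    "label_a": [0] * 50 + [1] * 50,              # balanced binary label
    "label_b": [0] * 80 + [1] * 20,              # imbalanced binary label
    "label_c": [0] * 60 + [1] * 30 + [2] * 10,   # imbalanced multi-class label
}

configs = {name: forest_config(column) for name, column in targets.items()}
for name, cfg in configs.items():
    print(name, cfg)
```

Each resulting dictionary could then parameterize one forest per label, so that heavily imbalanced labels get larger sub-ensembles while balanced labels stay cheap to train.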
Authors
杨晶东
孟一飞
荀镕基
余少卿
YANG Jingdong; MENG Yifei; XUN Rongji; YU Shaoqing (School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China; Department of Otorhinolaryngology, Head and Neck Surgery, Tongji Hospital of Tongji University, Shanghai 200065, China)
Source
《数据采集与处理》
CSCD
Peking University Core Journals (北大核心)
2021, Issue 4, pp. 684-696 (13 pages)
Journal of Data Acquisition and Processing
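Funding
National Natural Science Foundation of China (81973749, 8187040043)
Shanghai Municipal Health Commission, Advanced and Appropriate Technology Promotion Project (2019SY071)
Science and Technology Commission of Shanghai Municipality, Traditional Chinese Medicine Guidance Project (18401903600)
Shanghai Municipal Commission of Health and Family Planning, General Research Project (201740093)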
Keywords
allergic rhinitis
ensemble learning
base learner
multi-label classification
heterogeneous structure