
Logistic Discrimination Based Rare-class Classification Method

Cited by: 23
Abstract: Based on logistic discrimination (LD), this paper proposes a method called LDRC (LD based Rare-class Classification) to improve the generalization performance of LD on rare-class problems. To take full account of the characteristics of the rare class, a new objective function, RPM (Recall and Precision based Metric), is constructed; it considers the recall of both the positive and negative classes together with the precision of the positive class. The recall of the positive and negative classes ensures good generalization on recall and g-mean (the geometric mean of the positive-class and negative-class recall), while the recall and precision of the positive class ensure high accuracy and f-measure (a measure based on positive-class recall and precision). LDRC uses RPM as the objective function to supervise parameter learning, so that the model achieves high overall generalization ability. Experiments on UCI data sets show that, compared with conventional LD and with LD based on over-sampling and under-sampling, LDRC has a clear advantage on recall, g-mean and f-measure.
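The evaluation measures named in the abstract follow from a binary confusion matrix. The sketch below uses their standard textbook definitions, with the rare class treated as positive. The function name `rare_class_metrics` and the example counts are illustrative assumptions. The exact form of RPM is not given in this record, so it is not reproduced here.

```python
import math


def rare_class_metrics(tp, fp, tn, fn):
    """Standard metrics for a binary problem where the positive class is rare.

    Hypothetical helper for illustration; not the paper's RPM objective.
    """
    # Recall of each class: fraction of that class's samples predicted correctly.
    recall_pos = tp / (tp + fn) if (tp + fn) else 0.0
    recall_neg = tn / (tn + fp) if (tn + fp) else 0.0
    # Precision of the positive class: fraction of positive predictions that are right.
    precision_pos = tp / (tp + fp) if (tp + fp) else 0.0
    # g-mean: geometric mean of positive-class and negative-class recall.
    g_mean = math.sqrt(recall_pos * recall_neg)
    # f-measure: harmonic mean of positive-class precision and recall.
    denom = precision_pos + recall_pos
    f_measure = 2 * precision_pos * recall_pos / denom if denom else 0.0
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "recall_pos": recall_pos,
        "recall_neg": recall_neg,
        "precision_pos": precision_pos,
        "g_mean": g_mean,
        "f_measure": f_measure,
        "accuracy": accuracy,
    }


# Made-up counts: 10 rare positives among 100 samples.
metrics = rare_class_metrics(tp=8, fp=4, tn=86, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these made-up counts the accuracy is 0.94 while the f-measure is only 8/11, which illustrates why rare-class work reports recall, g-mean and f-measure alongside accuracy.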
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), indexed in CSCD and the Peking University Core Journals list, 2016, No. 1, pp. 140-145 (6 pages)
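Funding: Supported by the National Natural Science Foundation of China (61202194, 61402393, 61572417) and the Science and Technology Research Project of the Education Department of Henan Province (14A520016, 14B520045, 12A520035)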
Keywords: rare-class; logistic discrimination; recall; precision; classification

