Data Stream Mixture Ensemble Classification Algorithm Based on Semi-Supervised Learning
Abstract: Existing data stream classification models require large numbers of labeled samples for training, but in practical applications labeling that many samples is costly. To address this problem, this paper proposes SMEClass, a data stream mixture ensemble classification algorithm based on semi-supervised learning, which organizes its base classifiers in a mixed mode. K C4.5 decision tree classifiers label the unlabeled data by majority vote, which raises the confidence of the assigned class labels and improves the accuracy of the ensemble classifier. A Naïve Bayes classifier is added to reduce the noisy data produced during labeling. Experimental results show that SMEClass is more accurate than the latest semi-supervised ensemble classification algorithm and has clear advantages in running time and noise resistance.
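The labeling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the Naïve Bayes classifier filters noise, so the rule used here (reject a voted label when the filter classifier disagrees, or when the vote share is not above `min_share`) is an assumption, and the names `majority_vote`, `pseudo_label`, and `min_share` are hypothetical. The classifiers are plain callables standing in for trained C4.5 trees and a trained Naïve Bayes model.

```python
from collections import Counter
from typing import Callable, Hashable, Optional, Sequence, Tuple

# A trained classifier is modeled as a function from a feature vector to a label.
Predictor = Callable[[Sequence[float]], Hashable]


def majority_vote(x: Sequence[float], voters: Sequence[Predictor]) -> Tuple[Hashable, float]:
    """Return the most common predicted label and its share of the votes."""
    counts = Counter(clf(x) for clf in voters)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(voters)


def pseudo_label(
    x: Sequence[float],
    tree_voters: Sequence[Predictor],
    noise_filter: Predictor,
    min_share: float = 0.5,
) -> Optional[Hashable]:
    """Label x by majority vote of the K voters, or return None if unreliable.

    Assumption (not stated in the source): a voted label is discarded when
    its vote share is not above min_share or the filter classifier disagrees.
    """
    label, share = majority_vote(x, tree_voters)
    if share <= min_share:
        return None
    if noise_filter(x) != label:
        return None
    return label


# Toy stand-ins for trained models: threshold rules on a single feature.
trees = [
    lambda x: int(x[0] > 0.4),
    lambda x: int(x[0] > 0.5),
    lambda x: int(x[0] > 0.6),
]
bayes = lambda x: int(x[0] > 0.3)

accepted = pseudo_label([0.9], trees, bayes)   # all voters and the filter agree
rejected = pseudo_label([0.45], trees, bayes)  # filter disagrees with the vote
```

Samples for which `pseudo_label` returns a label could then join the training set for the next ensemble update, while rejected samples stay unlabeled; that usage is likewise an assumption about how such a step would be wired in.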
Authors: 任钊婷 (REN Zhao-ting), 王治和 (WANG Zhi-he), 杨晏 (YANG Yan); School of Computer Science and Engineering, Northwest Normal University, Lanzhou 730070, China
Source: Computer Knowledge and Technology (《电脑知识与技术》), 2013, No. 12, pp. 7770-7775, 7781 (7 pages)
Keywords: data stream; semi-supervised learning; ensemble classification; concept drift; mixture ensemble