Abstract
The main factor determining the performance of ensemble learning is the diversity among the individual learners in the ensemble. Clustering technology is used to speed up AdaBoost. At different noise levels, the new algorithm performs close to AdaBoost. A new approach to AdaBoost's noise-sensitivity problem is also proposed: the technique enables fast noise detection and re-learning after noise elimination, so that when processing noisy data sets the new algorithm clearly outperforms AdaBoost in both overall performance and efficiency.
Since the main factor deciding the performance of ensemble learning is the diversity of component learners, clustering technology is used to speed up AdaBoost in this paper. The performance of the new algorithm is very close to that of AdaBoost on data sets with different noise levels. The new algorithm can detect and eliminate noisy data quickly, and achieves rapid learning on the data sets after noise elimination, which overcomes the noise-sensitive shortcoming of AdaBoost. Its overall performance and efficiency are much better than those of AdaBoost when processing data sets containing noise.
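The abstract does not give the algorithm's details, but the core idea of using clustering to speed up boosting can be sketched as follows. This is a hypothetical illustration, assuming scikit-learn's `KMeans` and `AdaBoostClassifier`: each class is clustered separately and AdaBoost is trained on the cluster centers instead of the full training set, shrinking the data the booster must iterate over.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import AdaBoostClassifier
from sklearn.datasets import make_classification

# Synthetic data standing in for a real training set.
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)

# Cluster each class separately so every representative keeps its label.
reps_X, reps_y = [], []
for label in np.unique(y):
    Xc = X[y == label]
    km = KMeans(n_clusters=50, n_init=10, random_state=0).fit(Xc)
    reps_X.append(km.cluster_centers_)
    reps_y.append(np.full(50, label))
reps_X = np.vstack(reps_X)
reps_y = np.concatenate(reps_y)

# AdaBoost now trains on 100 cluster representatives instead of 2000 points.
clf = AdaBoostClassifier(n_estimators=50, random_state=0).fit(reps_X, reps_y)
print(reps_X.shape[0], round(clf.score(X, y), 2))
```

The number of clusters per class and the choice of clustering algorithm are free parameters here; the paper's actual procedure (including its noise-detection step) may differ.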
Source
《软件学报》
EI
CSCD
Peking University Core Journal (北大核心)
2010, Issue 8, pp. 1889-1897 (9 pages)
Journal of Software
Funding
National Natural Science Foundation of China, No. 60632050