
Spam Filter Based on Multiple Classifier Combinational Model

Cited by: 2
Abstract: Spam filtering has unequal misclassification costs: misjudging a legitimate email (ham) as spam costs far more than misjudging spam as legitimate. To address this, the paper builds a combinational classifier framework with a two-layer structure. Sample emails are preprocessed so that text features and behavioral features are used together. Building on improved single-classifier performance, the framework optimizes the combination of different classifiers and adjusts the model promptly through feedback, which gives the filter an efficient self-learning capability.
Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2010, No. 18, pp. 194-196 (3 pages)
Funding: National "863" Program of China (Grant No. 2007AA01Z197)
Keywords: spam filter; combinational classifier; two-layer structure; bit entropy; false positive rate
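
The two-layer idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record does not say which base classifiers or combination rule the paper uses, so everything here is an assumption made for illustration. The first layer holds two toy scorers (`text_score` and `behavior_score`, both hypothetical), the second layer is an online logistic regression over their outputs, and the `feedback` method stands in for the feedback-driven self-learning step. The unequal cost is modeled with an assumed cost ratio of 9: if blocking a ham costs 9 times as much as passing a spam, expected cost is minimized by flagging a message only when P(spam) exceeds 9 / (1 + 9) = 0.9.

```python
import math


def cost_threshold(cost_ratio):
    """Spam-probability threshold that minimizes expected cost when blocking
    a ham costs `cost_ratio` times as much as letting a spam through."""
    return cost_ratio / (1.0 + cost_ratio)


# Layer 1: hypothetical toy scorers, each returning a rough P(spam) in [0, 1].
def text_score(email):
    words = email["body"].lower().split()
    hits = sum(w in {"free", "winner", "prize"} for w in words)
    return min(1.0, 0.1 + 0.3 * hits)


def behavior_score(email):
    # Assumed behavioral feature: a very large recipient list looks suspicious.
    return 0.9 if email["recipients"] > 50 else 0.1


class TwoLayerFilter:
    """Layer 2: online logistic regression over the layer-1 scores."""

    def __init__(self, base_classifiers, cost_ratio=9.0, learning_rate=0.5):
        self.base = base_classifiers
        self.weights = [0.0] * len(base_classifiers)
        self.bias = 0.0
        self.lr = learning_rate
        self.threshold = cost_threshold(cost_ratio)

    def _scores(self, email):
        # Center each score around zero so an undecided scorer contributes nothing.
        return [clf(email) - 0.5 for clf in self.base]

    def spam_probability(self, email):
        z = self.bias + sum(w * s for w, s in zip(self.weights, self._scores(email)))
        return 1.0 / (1.0 + math.exp(-z))

    def is_spam(self, email):
        return self.spam_probability(email) > self.threshold

    def feedback(self, email, is_spam):
        """One stochastic-gradient step on log loss from a user-corrected label."""
        scores = self._scores(email)
        error = (1.0 if is_spam else 0.0) - self.spam_probability(email)
        self.bias += self.lr * error
        self.weights = [w + self.lr * error * s for w, s in zip(self.weights, scores)]


spam = {"body": "free prize winner", "recipients": 200}
ham = {"body": "meeting notes attached", "recipients": 2}

spam_filter = TwoLayerFilter([text_score, behavior_score])
flagged_before_training = spam_filter.is_spam(spam)

# Simulated feedback loop: the user labels one spam and one ham per round.
for _ in range(300):
    spam_filter.feedback(spam, True)
    spam_filter.feedback(ham, False)

print(spam_filter.is_spam(spam), spam_filter.is_spam(ham))
```

Before any feedback arrives, the combiner outputs 0.5 for every message, which is below the 0.9 threshold, so nothing is blocked. This conservative default follows directly from the assumed cost ratio: the filter only starts rejecting mail once feedback has pushed the combined score well past even odds.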

