
A Multi-class Text Categorization Algorithm Based on Maximal Classification Margin SVDD

Cited by: 2
Abstract: Text categorization is one of the key techniques in information retrieval and text mining. This paper proposes a multi-class text categorization algorithm based on support vector data description (SVDD). In training, SVDD is used to find the minimal hypersphere enclosing the samples of each class while maximizing the classification margin between classes. In testing, a decision criterion based on the average distance to the k nearest neighbors (KNN) in kernel space determines the class of each sample. Experimental results show that the method has good generalization ability and good time performance.
Author: Luo Qi (罗琦)
Source: Telecommunication Engineering (《电讯技术》), Peking University Core Journal (北大核心), 2014, No. 4, pp. 496-499 (4 pages)
Keywords: information retrieval; text mining; text categorization; support vector data description (SVDD); multi-class classifier
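
The test-phase criterion mentioned in the abstract (average distance to the k nearest neighbors in kernel space) could be sketched roughly as below. This is only an illustration under stated assumptions, not the paper's implementation: the abstract does not say which kernel is used, what k is, or whether neighbors are drawn from support vectors or from all training samples. The sketch assumes an RBF kernel and per-class training samples, and omits the SVDD training step entirely. All function names, the gamma value, and the toy data are hypothetical.

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; the kernel choice and gamma are assumptions."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq)

def kernel_distance(x, y, kernel=rbf_kernel):
    """Distance in the kernel-induced feature space:
    d(x, y)^2 = K(x, x) - 2 K(x, y) + K(y, y)."""
    sq = kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)
    return math.sqrt(max(sq, 0.0))  # clamp tiny negative round-off

def knn_average_distance(x, samples, k, kernel=rbf_kernel):
    """Mean kernel-space distance from x to its k nearest points in samples."""
    if not samples:
        raise ValueError("each class needs at least one sample")
    dists = sorted(kernel_distance(x, s, kernel) for s in samples)
    nearest = dists[:k]
    return sum(nearest) / len(nearest)

def classify(x, class_samples, k=3, kernel=rbf_kernel):
    """Assign x to the class with the smallest k-NN average distance."""
    scores = {label: knn_average_distance(x, pts, k, kernel)
              for label, pts in class_samples.items()}
    return min(scores, key=scores.get), scores

# Hypothetical toy data: two classes of 2-D feature vectors.
train = {
    "sports": [(0.9, 0.1), (0.8, 0.2), (1.0, 0.0)],
    "finance": [(0.1, 0.9), (0.2, 0.8), (0.0, 1.0)],
}
label, scores = classify((0.7, 0.3), train, k=2)
print(label, scores)
```

In a fuller version the candidate neighbors would presumably come from the hyperspheres learned during SVDD training, for example restricting the search to each class's support vectors; that choice is left open here because the abstract does not specify it.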



