
Study on Chinese web page classification based on SVM (基于SVM的中文网页分类方法的研究)

Cited by: 22
Abstract: Chinese web page classification is an active research area in data mining. The support vector machine (SVM) is an efficient classification method that shows distinctive advantages on high-dimensional pattern recognition problems, and it is effective for learning classification knowledge from massive data, especially when labeled examples are costly to obtain. Based on an analysis of the features of Chinese web pages, this paper presents an SVM-based Chinese web page classification method for organizing the rich information on the Internet, and introduces the key techniques involved: web page text preprocessing, feature extraction, and the multi-class classification algorithm. Experiments show that the method greatly reduces the size of the training set and trains efficiently, while also achieving good precision and recall.
Source: Computer Engineering and Design (《计算机工程与设计》), CSCD, Peking University Core Journal, 2007, No. 8, pp. 1893-1895 (3 pages)
Funding: China University of Mining and Technology Youth Research Fund (OD4490)
Keywords: support vector machine; feature selection; kernel function; web page; text classification
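
The abstract reports precision and recall as its evaluation measures but this record does not define them. As a minimal sketch, assuming the standard per-class definitions (precision = correctly assigned pages / pages assigned to the class; recall = correctly assigned pages / pages truly in the class), they could be computed as follows. The function name `precision_recall`, the category labels, and the six sample pages are hypothetical and are not the paper's data or results.

```python
from collections import Counter


def precision_recall(y_true, y_pred):
    """Return {label: (precision, recall)} using the standard per-class definitions."""
    true_pos = Counter()   # pages predicted as a class that truly belong to it
    predicted = Counter()  # all pages predicted as each class
    actual = Counter()     # all pages truly belonging to each class
    for t, p in zip(y_true, y_pred):
        actual[t] += 1
        predicted[p] += 1
        if t == p:
            true_pos[t] += 1

    scores = {}
    for label in set(actual) | set(predicted):
        # Guard against division by zero for classes never predicted or never present.
        prec = true_pos[label] / predicted[label] if predicted[label] else 0.0
        rec = true_pos[label] / actual[label] if actual[label] else 0.0
        scores[label] = (prec, rec)
    return scores


# Hypothetical labels for six web pages (illustrative only).
y_true = ["sports", "sports", "finance", "finance", "tech", "tech"]
y_pred = ["sports", "finance", "finance", "finance", "tech", "sports"]
scores = precision_recall(y_true, y_pred)
for label in sorted(scores):
    print(label, scores[label])
```

In a multi-class setting such as the one the abstract describes, these per-class values are typically averaged across categories to give a single figure; which averaging the paper uses is not stated in this record.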
