
Research on a Document Similarity Algorithm Based on Improved VSM Weights (cited by: 9)

Original English title: Research of Documents Similarity Influence Based on the Improved VSM Weight
Abstract: The vector space model (VSM) is built around index-term weights, which play a decisive role in the effectiveness of text classification and text retrieval. This paper proposes a keyword-based weight that improves on the traditional VSM weight formula: in addition to the original index-term weight, the improved model also takes into account the weight of the keywords in each document. Retrieval tests on a domain-specific FAQ collection show that the improved algorithm greatly raises retrieval precision and recall.
Source: Software (《软件》), 2012, No. 10, pp. 103-105 (3 pages)
Keywords: vector space model (VSM); keyword weight; precision; recall
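The abstract does not give the improved weight formula, so the following is only a minimal sketch of the general idea it describes: start from a conventional index-term weight and raise the weight of terms that are designated keywords, then rank FAQ entries by cosine similarity. Everything specific here is an assumption made for illustration: the TF-IDF variant (`log(n / df) + 1`), the multiplicative boost factor `alpha`, the function names, and the toy FAQ data. None of it is taken from the paper.

```python
import math
from collections import Counter


def build_idf(docs):
    """Inverse document frequency per term; the +1 smoothing is an assumption."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n / count) + 1.0 for term, count in df.items()}


def weight_vector(tokens, idf, keywords=frozenset(), alpha=0.0):
    """TF-IDF weights, with keyword terms scaled by (1 + alpha).

    `alpha` is a hypothetical boost parameter, not the paper's formula.
    Terms that never occur in the collection are ignored.
    """
    tf = Counter(tokens)
    vec = {}
    for term, count in tf.items():
        if term not in idf:
            continue
        weight = (count / len(tokens)) * idf[term]
        if term in keywords:
            weight *= 1.0 + alpha
        vec[term] = weight
    return vec


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs, idf, keywords=frozenset(), alpha=0.0):
    """Return (doc_index, similarity) pairs, most similar first."""
    q = weight_vector(query, idf, keywords, alpha)
    scores = [
        (i, cosine(q, weight_vector(doc, idf, keywords, alpha)))
        for i, doc in enumerate(docs)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy, already-tokenized FAQ entries (illustrative data only).
faqs = [
    ["reset", "password", "account"],
    ["change", "account", "email"],
    ["delete", "account", "data"],
]
keywords = frozenset({"password", "email", "delete"})
idf = build_idf(faqs)
query = ["password", "account"]

plain = rank(query, faqs, idf)
boosted = rank(query, faqs, idf, keywords, alpha=1.0)
print(plain[0], boosted[0])
```

In this toy setup the boost increases the share of the similarity score carried by the keyword `password` relative to the common term `account`. Whether such a boost improves precision and recall on real data depends on how the keywords are chosen and weighted, which the abstract does not specify.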
