Abstract
The vector space model (VSM) is built around index term weights, which play a decisive role in the effectiveness of text classification and retrieval. This paper proposes a keyword-based weight to improve the term weighting algorithm of the traditional VSM. In addition to the original index term weight, the improved model takes the weight of keywords in a document into account. Retrieval tests on a domain-specific FAQ collection show that the improved algorithm substantially raises both retrieval precision and recall.
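The abstract does not give the weighting formula, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes a TF-IDF base weight and a hypothetical multiplicative boost `alpha` applied to terms in a hand-supplied `keywords` set; the function names, the boost form, and the smoothed IDF are all illustrative assumptions.

```python
import math
from collections import Counter


def idf_table(docs):
    """Inverse document frequency for every term in a tokenized corpus."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # "+ 1.0" keeps terms that occur in every document from getting weight 0.
    return {term: math.log(n / count) + 1.0 for term, count in df.items()}


def weigh(tokens, idf, keywords, alpha=1.0):
    """TF-IDF weights, with keyword terms scaled by (1 + alpha).

    The boost is an assumed stand-in for the paper's keyword weight.
    Terms missing from the IDF table get weight 0.
    """
    tf = Counter(tokens)
    return {
        term: (count / len(tokens))
        * idf.get(term, 0.0)
        * ((1.0 + alpha) if term in keywords else 1.0)
        for term, count in tf.items()
    }


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


if __name__ == "__main__":
    # Toy FAQ entries (invented for illustration only).
    faq = [["password", "help"], ["account", "help"]]
    query = ["password", "account"]
    idf = idf_table(faq)
    for kw in (set(), {"password"}):
        q = weigh(query, idf, kw)
        scores = [cosine(q, weigh(doc, idf, kw)) for doc in faq]
        print(sorted(kw), [round(s, 3) for s in scores])
```

With an empty keyword set the two toy entries tie for this query; marking `password` as a keyword ranks the first entry higher, which is the kind of effect a keyword-aware weight is meant to produce.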
Source
Software (《软件》), 2012, No. 10, pp. 103-105 (3 pages)