Abstract
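This paper introduces an automatic document indexing and retrieval technique, called the "semantic vector space model," that is based on syntactic parsing and semantic case structures. In this model, natural-language documents and search queries are both represented as semantic matrices. By computing the similarity between semantic matrices, the retrieval system predicts how relevant a document is to a given query and thereby retrieves the relevant documents. Preliminary experimental results indicate that when documents and queries are long, especially when original documents are used as query samples, this retrieval technique, compared with Cornell University's SMART system, in terms of recall …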
This paper presents the semantic vector space model, a text representation and searching technique based on heuristic syntax parsing and distributed representation of semantic case structures. In this model, documents and queries are represented as semantic matrices, and relevance prediction is achieved by computing the similarity of such matrices. A prototype system was built to implement this model and used in an experimental study. The preliminary results of the study showed that with longer documents and queries, especially when original documents were used as queries, the system based on our technique performed significantly better than the VSM-based SMART system in terms of recall, precision, and effectiveness of relevance ranking.
Source
《情报学报》
CSSCI
Peking University Core Journals (北大核心)
1996, No. 6, pp. 402-413 (12 pages)
Journal of the China Society for Scientific and Technical Information