Abstract
To improve retrieval performance, a shallow text analysis (shallow parsing) technique is introduced into Chinese web information retrieval. First, the predicate of each sentence is extracted, together with the preceding and following nominals directly associated with it. Then, after the predicates are converted into concept-level representations, a semantic vector expressing the meaning of the text is obtained. An algorithm for computing the similarity of semantic vectors is proposed, and this similarity is used to measure the semantic similarity between documents. Compared with mainstream web search engines, the system achieves considerably higher precision.
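The abstract does not spell out the similarity algorithm, so the following is only a minimal sketch of the general idea, under stated assumptions: each sentence is assumed to be already reduced to a (preceding nominal, predicate, following nominal) triple, the predicate-to-concept table `CONCEPTS` is hypothetical, and plain cosine similarity stands in for the paper's own measure.

```python
import math
from collections import Counter

# Hypothetical predicate-to-concept table (assumption): the abstract does not
# name the concept resource used to conceptualize predicates.
CONCEPTS = {"buy": "ACQUIRE", "purchase": "ACQUIRE", "sell": "TRANSFER"}


def semantic_vector(triples):
    """Count (preceding nominal, predicate concept, following nominal) features."""
    vec = Counter()
    for pre, predicate, post in triples:
        # Fall back to the raw predicate when no concept is known.
        concept = CONCEPTS.get(predicate, predicate)
        vec[(pre, concept, post)] += 1
    return vec


def cosine_similarity(a, b):
    """Cosine similarity of two sparse count vectors (a stand-in measure)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Two documents whose predicates differ on the surface but share a concept.
doc1 = semantic_vector([("user", "buy", "book")])
doc2 = semantic_vector([("user", "purchase", "book")])
doc3 = semantic_vector([("user", "sell", "book")])
```

In this sketch, `doc1` and `doc2` map to the same concept-level feature and so compare as similar, while `doc3` does not. Matching on concepts rather than surface words is what the abstract credits for the improved precision.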
Source
《杭州电子科技大学学报(自然科学版)》
2008, No. 1, pp. 48-51 (4 pages)
Journal of Hangzhou Dianzi University:Natural Sciences
Funding
Supported by the Natural Science Foundation of Zhejiang Province (M603025)
Keywords
Chinese information processing
shallow text analysis (shallow parsing)
information retrieval
semantic retrieval
similarity calculation