
Chinese Web Information Retrieval System Based on Shallow Parsing (基于浅层文本分析的中文Web信息检索)

Cited by: 1
Abstract: To improve retrieval performance, shallow text analysis (shallow parsing) is introduced into Chinese Web information retrieval. First, the predicate of each sentence is extracted, together with the preceding and succeeding nominals directly associated with it. Then, after the predicates and nominals are converted into concept representations, a semantic vector expressing the meaning of the text is obtained. A similarity algorithm for semantic vectors is proposed, and the similarity of semantic vectors is used to measure the semantic similarity between documents. Compared with mainstream Web search engines, the system achieves considerably higher precision.
Source: Journal of Hangzhou Dianzi University: Natural Sciences (《杭州电子科技大学学报(自然科学版)》), 2008, No. 1, pp. 48-51 (4 pages)
Funding: Natural Science Foundation of Zhejiang Province (M603025)
Keywords: Chinese information processing; shallow parsing for text; information retrieval; semantic-based retrieval; similarity calculation
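
The pipeline outlined in the abstract can be illustrated with a small sketch. It is not the paper's algorithm, which this record does not give. It assumes that shallow parsing has already produced (preceding nominal, predicate, succeeding nominal) triples, uses a hypothetical toy lexicon `CONCEPTS` in place of a real thesaurus, and substitutes plain cosine similarity over concept triples for the authors' similarity measure. All names below are illustrative.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy lexicon mapping words to concept codes.
# A real system would use a thesaurus; these entries are made up.
CONCEPTS = {
    "buy": "TRADE_IN", "purchase": "TRADE_IN",
    "sell": "TRADE_OUT",
    "company": "ORG", "firm": "ORG",
    "software": "PRODUCT", "program": "PRODUCT",
}

def to_concept(word):
    """Map a word to its concept code; unknown words stay as they are."""
    return CONCEPTS.get(word, word)

def semantic_vector(triples):
    """Count concept-level triples.

    `triples` is an iterable of (pre_nominal, predicate, post_nominal)
    tuples, assumed to come from an earlier shallow-parsing step.
    """
    return Counter(tuple(to_concept(w) for w in t) for t in triples)

def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

doc1 = semantic_vector([("company", "buy", "software")])
doc2 = semantic_vector([("firm", "purchase", "program")])
doc3 = semantic_vector([("company", "sell", "software")])
```

Here `doc1` and `doc2` share no surface words but map to the same concept triple, so they compare as similar, while `doc3` differs in its predicate concept and does not match. This all-or-nothing triple matching is a simplification chosen for brevity.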

References (7)

  • [1] Salton G, Fox E A, Wu H. Extended Boolean Information Retrieval[J]. Communications of the ACM, 1983, 26(12): 1022-1036.
  • [2] Salton G. Introduction to Modern Information Retrieval[M]. Boston: McGraw-Hill, 1983: 36-89.
  • [3] Crestani F, Rijsbergen C J. A study of probability kinematics in information retrieval[J]. ACM Transactions on Information Systems, 1998, 16(3): 225-255.
  • [4] Deerwester S, Dumais S T A. Indexing by Latent Semantic Analysis[J]. Journal of the Society for Information Science, 1990, 41(6): 391-407.
  • [5] Fung R, Favero B Del. Applying Bayesian Networks to Information Retrieval[J]. Communications of the ACM, 1995, 38(3): 42-57.
  • [6] Sun Honglin, Yu Shiwen. An overview of shallow parsing methods (浅层句法分析方法概述)[J]. Contemporary Linguistics (当代语言学), 2000, 2(2): 74-83. Cited by: 39
  • [7] Mei Jiaju. Tongyici Cilin (同义词词林)[M]. Shanghai: Shanghai Lexicographical Publishing House, 1989.

Second-level references (29)

  • 1. Zhou Qiang. An automatic delimitation model for Chinese phrases (一个汉语短语自动界定模型)[J]. Journal of Software (软件学报), 1996, 7(A00): 315-322. Cited by: 9
  • 2. Abney, S. 1996b. Partial parsing via finite-state cascades. In Proceedings of the ESSLLI '96 Robust Parsing Workshop.
  • 3. Argamon, S., I. Dagan and Y. Krymolowski. 1998. A memory-based approach to learning shallow natural language patterns. In Proceedings of COLING-ACL '98. Pp. 67-73.
  • 4. Brill, Eric. 1995. Unsupervised learning of disambiguation rules for part of speech tagging. In Proceedings of the 3rd Workshop on Very Large Corpora. Pp. 1-13.
  • 5. Cardie, Claire and David Pierce. 1998. Error-driven pruning of treebank grammars for base noun phrase identification. In Proceedings of COLING-ACL '98. Pp. 218-224.
  • 6. Chen, Kuang-hua and Chen, Hsin-Hsi. 1994. Extracting noun phrases from large-scale texts: a hybrid approach and its automatic evaluation. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics. Pp. 234-241.
  • 7. Chen, Hsin-Hsi and Lee, Yue-Shi. 1995. Development of a partially bracketed corpus with part-of-speech information only. In Proceedings of the 3rd Workshop on Very Large Corpora. Pp. 162-172.
  • 8. Church, K. 1988. A stochastic parts program and noun phrase parser for unrestricted text. In Proceedings of the Second Conference on Applied Natural Language Processing. Pp. 136-143.
  • 9. Collins, M. 1996. A new statistical parser based on bigram lexical dependencies. In Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics. Pp. 184-191.
  • 10. Fano, R. M. 1961. Transmission of Information: A Statistical Theory of Communication. MIT Press.
