Co-occurrence Word Retrieval Based on the Lexical Attraction and Repulsion Model (基于词汇吸引与排斥模型的共现词提取)

Cited by: 8
Abstract: Co-occurrence word extraction plays an important role in information mining and natural language processing. Traditional extraction methods rely on a single statistic, so their results are imprecise and require substantial manual cleanup. This paper presents a co-occurrence word extraction algorithm based on the lexical attraction and repulsion model and improves its results by combining several commonly used statistics. In an open test, the user interest rate of the extracted co-occurrence words is 60.87%. When applied to a Web-based co-occurrence word retrieval system, the algorithm performs well in both speed and precision.
Source: Journal of Chinese Information Processing (《中文信息学报》), CSCD, PKU Core (北大核心), 2004, No. 6, pp. 16-22 (7 pages)
Funding: Natural Science Foundation of Fujian Province (A0310009); Key Science and Technology Project of Fujian Province (2001J005)
Keywords: computer application; Chinese information processing; co-occurrence words; lexical attraction and repulsion model; co-occurrence distance
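The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea behind the keywords "co-occurrence words" and "co-occurrence distance": count word pairs that appear within a fixed token distance and rank them with one statistic, here pointwise mutual information (PMI). It illustrates the single-statistic baseline the abstract criticizes, not the paper's lexical attraction and repulsion model. The function name `cooccurrence_pmi`, the parameter `max_distance`, and the toy sentences are assumptions made for illustration.

```python
import math
from collections import Counter


def cooccurrence_pmi(sentences, max_distance=3):
    """Score word pairs co-occurring within max_distance tokens using PMI.

    sentences: list of token lists (already segmented; an assumption here).
    Returns a dict mapping an alphabetically sorted (word_a, word_b) tuple
    to its PMI score in bits.
    """
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in sentences:
        word_counts.update(tokens)
        for i, left in enumerate(tokens):
            # Look only at the next max_distance tokens to the right.
            for right in tokens[i + 1 : i + 1 + max_distance]:
                if left != right:
                    pair_counts[tuple(sorted((left, right)))] += 1

    total_words = sum(word_counts.values())
    total_pairs = sum(pair_counts.values())
    scores = {}
    for (a, b), n in pair_counts.items():
        p_pair = n / total_pairs
        p_a = word_counts[a] / total_words
        p_b = word_counts[b] / total_words
        scores[(a, b)] = math.log2(p_pair / (p_a * p_b))
    return scores


sentences = [
    ["stock", "market", "falls"],
    ["stock", "market", "rises"],
    ["rain", "falls"],
]
scores = cooccurrence_pmi(sentences, max_distance=3)
# Print pairs from the strongest association to the weakest.
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 2))
```

On tiny corpora PMI overrates rare pairs, which is one reason a single statistic tends to give noisy results and why combining several statistics, as the abstract proposes, is a natural refinement.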

References (5)

  • 1 Ying Ding. IR and AI: Using Co-occurrence Theory to Generate Lightweight Ontologies[A]. Proceedings of the 12th International Workshop on Database and Expert Systems Applications[C], Pages: 961-965, Sept. 2001.
  • 2 El-Sayed Atlam. A New Method for Construction Field Association Terms Using Co-occurrence Words and Declinable Words Information[A]. Proceedings of the 2002 IEEE International Conference on Systems, Man and Cybernetics[C], Volume 4, Pages: 5, Oct. 2002.
  • 3 Yuen-Hsien Tseng. Fast Co-occurrence Thesaurus Construction for Chinese News[A]. Proceedings of the 2001 IEEE International Conference on Systems, Man, and Cybernetics[C], Volume 2, Pages: 853-858, Oct. 2001.
  • 4 Doug Beeferman, Adam Berger, John Lafferty. A Model of Lexical Attraction and Repulsion[A]. Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics[C], Pages: 373-380, 1997.
  • 5 Ido Dagan, Shaul Marcus. Contextual Word Similarity and Estimation from Sparse Data[J]. Computer Speech and Language, Vol. 9, Pages: 123-152, 1995.

Co-cited documents: 115

Citing documents: 8

Secondary citing documents: 20
