Abstract
Word sense disambiguation has long been regarded as one of the difficult problems in natural language processing. This paper takes the collocation examples provided by the machine-readable dictionary 《现代汉语辞海》 (XianDaiHanYuCiHai) as the initial collocation knowledge for polysemous words, and uses statistical and self-organizing methods to expand the collocation set automatically. To ensure the quality of learning, the length of the context window is increased gradually during the learning process. A word sense disambiguation algorithm is proposed that applies a multivariate maximum log-likelihood ratio to a collocation statistics table. Finally, the method was tested experimentally, and the results show that the algorithm achieves relatively high accuracy.
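
The abstract does not give the scoring formula or the learning procedure in detail, so the following is only a minimal sketch of how such a scheme might look. It assumes that each sense is scored by summing smoothed log-likelihood ratios over the words in the context window, and that a context is added to the collocation table only when the best sense wins by a margin. The function names, the add-alpha smoothing, the `margin` threshold, and the seed collocations for 打 are all hypothetical and are not taken from the paper or the dictionary.

```python
import math
from collections import defaultdict


def build_table(seed):
    """Build counts[sense][collocate] from seed collocations (hypothetical data)."""
    counts = {sense: defaultdict(int) for sense in seed}
    for sense, words in seed.items():
        for w in words:
            counts[sense][w] += 1
    return counts


def context_of(tokens, position, window):
    """Words within `window` positions of the target, excluding the target."""
    lo = max(0, position - window)
    return tokens[lo:position] + tokens[position + 1:position + 1 + window]


def score(counts, sense, context, alpha=1.0):
    """Sum of log( P(w | sense) / P(w | other senses) ) with add-alpha smoothing.

    This smoothing scheme is an assumption made for the sketch.
    """
    vocab = {w for c in counts.values() for w in c}
    v = len(vocab) + 1  # one extra slot for unseen words
    own_total = sum(counts[sense].values())
    other_total = sum(sum(c.values()) for s, c in counts.items() if s != sense)
    total = 0.0
    for w in context:
        own = counts[sense].get(w, 0)
        other = sum(c.get(w, 0) for s, c in counts.items() if s != sense)
        p_own = (own + alpha) / (own_total + alpha * v)
        p_other = (other + alpha) / (other_total + alpha * v)
        total += math.log(p_own / p_other)
    return total


def disambiguate(counts, tokens, position, window):
    """Return the sense with the highest summed log-likelihood ratio."""
    context = context_of(tokens, position, window)
    return max(counts, key=lambda s: score(counts, s, context))


def expand(counts, corpus, target, windows=(1, 2, 3), margin=1.0):
    """Grow the table, enlarging the window pass by pass.

    A context is added to the winning sense only when that sense beats the
    runner-up by at least `margin` (a hypothetical confidence threshold).
    """
    for window in windows:
        for tokens in corpus:
            for i, tok in enumerate(tokens):
                if tok != target:
                    continue
                context = context_of(tokens, i, window)
                ranked = sorted(
                    ((score(counts, s, context), s) for s in counts),
                    reverse=True,
                )
                if len(ranked) > 1 and ranked[0][0] - ranked[1][0] >= margin:
                    best = ranked[0][1]
                    for w in context:
                        counts[best][w] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical seed collocations for two senses of 打.
    table = build_table({"play": ["篮球", "球"], "call": ["电话"]})
    print(disambiguate(table, ["他", "打", "电话"], 1, 1))
    expand(table, [["他", "打", "电话", "给", "妈妈"]], "打", windows=(1, 2))
    print(dict(table["call"]))
```

Starting with a narrow window and widening it only after the table has grown is one way to keep early, low-evidence decisions from polluting the collocation set. A real system would presumably also filter function words before adding a context, which this sketch does not do.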
Source
《中文信息学报》 (Journal of Chinese Information Processing)
CSCD
Peking University Core Journal (北大核心)
1999, No. 3, pp. 1-8 (8 pages)
Funding
National Natural Science Foundation of China
Keywords
natural language processing
word sense disambiguation
self-organizing (adaptive) method
Chinese language
collocation