Abstract
Recognizing the maximal-length noun phrases in a Chinese sentence is a difficult problem of considerable practical value for tasks such as machine translation. Traditional methods handle long-distance dependencies between words poorly and suffer from the label bias problem. To overcome these weaknesses, this paper builds a statistical model based on conditional random fields, with a focus on recognizing complex maximal-length noun phrases. It also presents a decoding algorithm with confidence estimation, which improves the practical usability of the work.
Source
《小型微型计算机系统》
CSCD
Peking University Core Journal (北大核心)
2006, Issue 6, pp. 1134-1139 (6 pages)
Journal of Chinese Computer Systems
Funding
Supported by the National Natural Science Foundation of China (60272088)
Supported by the National "863" Program of China (2002AA11401)
Keywords
maximal-length noun phrase (最长名词短语)
conditional random fields (条件随机域)
machine translation (机器翻译)