Recognition of Complex Maximal Length Noun Phrase Using Conditional Random Fields
Abstract: Recognizing the maximal-length noun phrases in a Chinese sentence is a difficult problem of considerable practical value for tasks such as machine translation. Traditional methods handle long-distance relationships between words poorly and suffer from problems such as label bias. To overcome these shortcomings, this paper builds a statistical model based on conditional random fields, focusing specifically on the recognition of complex maximal-length noun phrases. It also presents a decoding algorithm with confidence estimation, which improves the practical usability of the approach.
Source: Journal of Chinese Computer Systems (小型微型计算机系统), indexed in CSCD and the Peking University Core Journals list, 2006, No. 6, pp. 1134-1139 (6 pages)
Funding: Supported by the National Natural Science Foundation of China (60272088) and the National "863" Program of China (2002AA11401).
Keywords: maximal-length noun phrase; conditional random fields; machine translation
