Retrieval Approach of Candidate Translation Examples in Interactive Hybrid Strategies Machine Translation System IHSMTS

Abstract: Example-based machine translation (EBMT) systems require a very large corpus of translation examples, usually on the order of millions of sentence pairs or more. A central difficulty for any EBMT system is therefore how to quickly select from this corpus a set of candidate examples that are similar to the input sentence, so that they can be passed to later modules such as sentence similarity computation and analogical translation construction. This paper proposes a multi-layer retrieval algorithm for candidate translation examples based on word surface features and information entropy. Tests on the hybrid-strategy machine translation system IHSMTS indicate that the algorithm addresses this retrieval problem effectively.
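The abstract does not spell out the algorithm, so the following is only an illustrative sketch, under stated assumptions, of what a layered candidate retrieval could look like. A first layer uses an inverted index over surface words to collect every example that shares a word with the input; a second layer ranks those examples by an information weight. The weight used here, the self-information -log2(df/N) of each shared word, is a stand-in for the paper's entropy measure, which the abstract does not define. All function names and the toy corpus are hypothetical.

```python
import math
from collections import Counter, defaultdict


def build_index(examples):
    """Inverted index: word -> set of ids of the examples containing it."""
    index = defaultdict(set)
    for example_id, sentence in enumerate(examples):
        for word in set(sentence.lower().split()):
            index[word].add(example_id)
    return index


def word_information(index, n_examples):
    """Self-information of each word; rarer words get larger weights.

    Assumption: this replaces the paper's (unspecified) entropy measure.
    """
    return {w: -math.log2(len(ids) / n_examples) for w, ids in index.items()}


def retrieve_candidates(query, index, info, top_k=2):
    """Return up to top_k example ids, best match first."""
    scores = Counter()
    # Layer 1: coarse filter, only examples sharing a surface word are scored.
    for word in set(query.lower().split()):
        for example_id in index.get(word, ()):
            # Layer 2: accumulate the information weight of each shared word.
            scores[example_id] += info[word]
    # Sort by descending score; break ties by example id for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [example_id for example_id, _ in ranked[:top_k]]


# Toy corpus standing in for the source side of an example base.
examples = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "a cat chased the dog",
    "stock prices fell sharply",
]
index = build_index(examples)
info = word_information(index, len(examples))
candidates = retrieve_candidates("the cat sat", index, info)
```

Because only examples reachable through the index are ever scored, the cost of a query depends on the posting lists of its words rather than on the full size of the example base, which is the property a million-pair corpus would need.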

Authors: Zhang Xiaofei (张孝飞), Chen Zhaoxiong (陈肇雄), Huang Heyan (黄河燕), Dai Liuling (代六玲)

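Affiliations: Department of Computer Science, University of Science and Technology of China; Research Center of Computer and Language Information Engineering, Chinese Academy of Sciences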

Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2005, No. 3, pp. 330-334 (5 pages). Indexed in CSCD and the Peking University Core Journals list.

Funding: Supported by the National Natural Science Foundation of China (grant 60272088).

Keywords: example-based machine translation (EBMT); corpus of translation examples; candidate translation examples; word surface features; information entropy

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]
