Writing style is an essential issue that beginners learning to read and write must confront, even at an early stage. From the "Notes on Reading and Writing" section that precedes the exercises of each lesson in English Book V-VIII, we can see that the editors attempt to combine the content (ideas) with the corresponding techniques. This is ...
Preserving formal style in neural machine translation (NMT) is essential, yet it is often overlooked as an optimization objective during training. This oversight can lead to translations that, though accurate, lack formality. In this paper, we propose a method for improving NMT formality with large language models (LLMs) that combines the style transfer and evaluation capabilities of an LLM with the high-quality translation generation ability of NMT models. The proposed method, INMTF, encompasses two approaches. The first is a revision approach that uses an LLM to revise the NMT-generated translation, ensuring a formal translation style. The second employs an LLM as a reward model that scores translation formality, and then uses reinforcement learning algorithms to fine-tune the NMT model to maximize the reward score, thereby enhancing the formality of the generated translations. Considering the substantial parameter size of LLMs, we also explore methods to reduce the computational cost of INMTF. Experimental results demonstrate that INMTF significantly outperforms baselines in both translation formality and translation quality, with improvements of +9.19 style accuracy points on the German-to-English task and +2.16 COMET points on the Russian-to-English task. Furthermore, our work demonstrates the potential of integrating LLMs within NMT frameworks to bridge the gap between NMT outputs and the formality required in various real-world translation scenarios.
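The growing capacity of large language models (LLMs) to store knowledge demonstrates their potential utility as knowledge bases. However, any given prompt provides only a lower-bound estimate of the knowledge an LLM covers. In the language models as knowledge bases (LMs-as-KBs) setting, previous prompt-learning methods have overlooked the influence of query style on model performance. This work reveals that LLMs do have learnable preferences related to query style, and exploits this property to introduce adaptive query style transfer (ARES), a method that improves knowledge-query performance by adapting to the LLM's preferences. ARES begins by constructing a set of query candidates, using paraphrasing to incorporate multiple expression styles. It then trains an evaluator to learn and recognize the LLM's query-style preferences, score the query candidates, and select the potentially optimal query. Experiments on multiple datasets demonstrate the method's effectiveness in improving query accuracy for LLM-as-knowledge-base services: applied incrementally, it achieves average improvements of 2.26, 1.68, 1.19, and 1.17 percentage points over the original model and three baseline methods, respectively. This indicates that ARES can be combined effectively with other methods to enhance query performance from multiple angles.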
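The two ARES stages described above (building a candidate set by rewriting, then letting an evaluator pick one query) can be sketched roughly as follows. This is only an illustration under stated assumptions: `build_candidates`, `select_query`, the template rewriters, and `toy_evaluator` are hypothetical names introduced here, and the toy evaluator is a hand-written stand-in for the trained evaluator that the abstract mentions but does not specify.

```python
from typing import Callable, List


def build_candidates(query: str, rewriters: List[Callable[[str], str]]) -> List[str]:
    """Return the original query plus one rewritten variant per rewriter, without duplicates."""
    candidates = [query]
    for rewrite in rewriters:
        variant = rewrite(query)
        if variant not in candidates:
            candidates.append(variant)
    return candidates


def select_query(candidates: List[str], evaluator: Callable[[str], float]) -> str:
    """Return the candidate the evaluator scores highest (the first one wins ties)."""
    return max(candidates, key=evaluator)


# Hypothetical stand-ins: template rewriters instead of a real paraphraser,
# and a hand-written scoring rule instead of a trained evaluator.
rewriters = [
    lambda q: f"Q: {q} A:",
    lambda q: f"Please answer briefly: {q}",
]


def toy_evaluator(q: str) -> float:
    # Pretend the target model prefers the "Q: ... A:" style.
    return 1.0 if q.startswith("Q:") else 0.0


candidates = build_candidates("Where was Marie Curie born?", rewriters)
best = select_query(candidates, toy_evaluator)
```

Keeping the rewriters and the evaluator as plain callables means either stage can be swapped for a learned component without changing the selection logic.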
This paper develops and tests an optimized content extraction algorithm for learning documents in a distance learning system database. The algorithm uses natural language processing (NLP) methods: TF-IDF for term weighting, the vector space model (VSM) for information retrieval, and cosine similarity for measuring document similarity. The tests covered the following: 1) parsing the word structure of documents in the distance learning system database and of Cyrillic Mongolian documents in the section, and forming new documents with a word-stem identification algorithm; 2) testing optimized content extraction from text material based on e-test results (keyword, correct answer, base form with affix, and new form produced from the word stem without affix) in the distance learning system, and searching for keywords selected automatically by a word extraction algorithm; 3) testing Boolean and probabilistic retrieval methods through an extended vector space retrieval method. This chapter covers the document content extraction and retrieval algorithm and proposes query recommendations based on word stems, independent of word position, reflecting the distinctive features of Cyrillic Mongolian documents.
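To make the combination of TF-IDF weighting, the vector space model, and cosine similarity concrete, here is a minimal sketch in plain Python. It assumes documents are already tokenized (and, for Mongolian, already stemmed), and it uses the common `log(N / df)` form of inverse document frequency; the paper's exact weighting variant is not given in the abstract, so the formula and the function names `idf_table`, `tfidf`, `cosine`, and `rank` are assumptions made for illustration.

```python
import math
from collections import Counter


def idf_table(docs):
    """Map each term to log(N / df), where df counts the documents containing it."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n / count) for term, count in df.items()}


def tfidf(tokens, idf):
    """Sparse TF-IDF vector (term -> weight); terms unseen in the collection are ignored."""
    tf = Counter(tokens)
    return {t: (c / len(tokens)) * idf[t] for t, c in tf.items() if t in idf}


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either has zero length."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Return (doc_index, score) pairs sorted by descending cosine similarity to the query."""
    idf = idf_table(docs)
    q_vec = tfidf(query, idf)
    scored = [(i, cosine(q_vec, tfidf(doc, idf))) for i, doc in enumerate(docs)]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Toy collection of pre-tokenized documents (illustrative data only).
docs = [
    ["distance", "learning", "system"],
    ["word", "stem", "algorithm"],
    ["learning", "document", "retrieval"],
]
results = rank(["word", "stem"], docs)
```

Because both the query and the documents are reduced to bags of weighted terms, the ranking does not depend on word position, which matches the position-independent, stem-based querying the abstract describes.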
This paper aims to resolve two major stylistic issues that a translation team faces in carrying out a translation project. First, what style should be established for the target text (TT)? Second, what should be done to achieve a uniform translation style across the team?