This paper develops and tests an optimized content extraction algorithm for learning documents stored in a distance learning system database. The algorithm uses natural language processing (NLP) methods, TF-IDF for term weighting, the vector space model (VSM) for information retrieval, and cosine similarity for similarity calculation. The tests covered three things: 1) parsing word structure in the distance learning system database documents and the Cyrillic Mongolian language documents in the section, and forming new documents with an algorithm that identifies word stems; 2) testing optimized content extraction from text material based on e-test results in the distance learning system (keyword, correct answer, base form with affix, and new form built from the word stem without affix), and searching for keywords selected automatically by a word extraction algorithm; 3) testing Boolean and probabilistic retrieval methods through an extended vector space retrieval method. This chapter covers the document content extraction and retrieval algorithm and proposes query recommendations based on word stems, independent of word position, that reflect the distinctive features of Cyrillic Mongolian language documents.
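The TF-IDF weighting, vector space representation, and cosine scoring named in the abstract above can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the whitespace tokenizer, the smoothed IDF formula, and the function names `tokenize`, `tf_idf_vectors`, `cosine`, and `rank` are all assumptions, and the word-stem identification step for Cyrillic Mongolian is not reproduced.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokens stand in for the paper's stemmed words.
    return text.lower().split()


def tf_idf_vectors(docs):
    """Return (idf table, one sparse TF-IDF vector per document)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    # Assumption: smoothed IDF, log((1 + N) / (1 + df)) + 1, so no weight is zero.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        # Term frequency (relative count) times inverse document frequency.
        vectors.append({t: (c / total) * idf[t] for t, c in counts.items()})
    return idf, vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Score every document against the query; best match first."""
    idf, doc_vecs = tf_idf_vectors(docs)
    # Query terms unseen in the collection carry no weight and are dropped.
    counts = Counter(t for t in tokenize(query) if t in idf)
    total = sum(counts.values()) or 1
    q_vec = {t: (c / total) * idf[t] for t, c in counts.items()}
    scores = [(cosine(q_vec, dv), i) for i, dv in enumerate(doc_vecs)]
    return sorted(scores, key=lambda s: (-s[0], s[1]))
```

Calling `rank("word stem", docs)` returns `(score, document index)` pairs, so the first pair identifies the document closest to the query in the vector space. Because the vectors are bags of words, the score does not depend on word position, which matches the position-independent querying the abstract describes.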
Writing style is an essential issue that beginners learning to read and write must confront even at an early stage. From the Notes on Reading and Writing section that precedes the exercises of each lesson in English Book V-VIII, we can see that the editors attempt to combine the content (ideas) with the corresponding techniques. This is
Preserving formal style in neural machine translation (NMT) is essential, yet it is often overlooked as an optimization objective of the training process. This oversight can lead to translations that, though accurate, lack formality. In this paper, we propose a method for improving NMT formality with large language models (LLMs) that combines the style transfer and evaluation capabilities of an LLM with the high-quality translation generation ability of NMT models. The proposed method, INMTF, encompasses two approaches. The first is a revision approach that uses an LLM to revise the NMT-generated translation, ensuring a formal translation style. The second employs an LLM as a reward model that scores translation formality, and then uses reinforcement learning algorithms to fine-tune the NMT model to maximize the reward score, thereby enhancing the formality of the generated translations. Considering the substantial parameter size of LLMs, we also explore methods to reduce the computational cost of INMTF. Experimental results demonstrate that INMTF significantly outperforms baselines in terms of translation formality and translation quality, with an improvement of +9.19 style accuracy points in the German-to-English task and +2.16 COMET score in the Russian-to-English task. Furthermore, our work demonstrates the potential of integrating LLMs within NMT frameworks to bridge the gap between NMT outputs and the formality required in various real-world translation scenarios.
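The second approach in the abstract above, maximizing a formality reward with reinforcement learning, can be illustrated with a toy policy-gradient sketch. It is not INMTF: the keyword-based `formality_reward` is a hypothetical stand-in for the LLM reward model, the policy is a softmax over a fixed list of candidate translations rather than an NMT model, and the update uses the exact gradient of the expected reward instead of sampled estimates. All function names and parameters are assumptions made for illustration.

```python
import math


def formality_reward(sentence):
    # HYPOTHETICAL scorer: penalizes apostrophes (contractions) and a few
    # informal words. The paper uses an LLM as the reward model instead.
    informal_markers = ("'", "gonna", "wanna", "hey")
    hits = sum(sentence.lower().count(m) for m in informal_markers)
    return 1.0 / (1.0 + hits)


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def train_policy(candidates, steps=200, lr=1.0):
    """Gradient ascent on the expected reward of a softmax policy."""
    rewards = [formality_reward(c) for c in candidates]
    logits = [0.0] * len(candidates)  # start from a uniform policy
    for _ in range(steps):
        probs = softmax(logits)
        # Expected reward under the current policy; it also acts as a baseline.
        expected = sum(p * r for p, r in zip(probs, rewards))
        # Exact policy gradient for softmax logits: dJ/dtheta_k = p_k * (r_k - J).
        logits = [
            theta + lr * p * (r - expected)
            for theta, p, r in zip(logits, probs, rewards)
        ]
    return softmax(logits), rewards
```

Given candidates such as "Hey, I'm gonna send it tomorrow.", "I'll send it tomorrow.", and "I will send it tomorrow.", the policy shifts probability mass toward the candidate with the highest reward. In a real system the sampled translations come from the NMT model itself and the gradient is estimated from those samples.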
We propose a novel unsupervised image captioning method. Image captioning involves two fields of deep learning: natural language processing and computer vision. The excessive pursuit of model evaluation results makes the captions generated by the model too monotonous in style, so they fail to meet people's demands for vivid and stylized image captions. Therefore, we propose an image captioning model that combines text style transfer and image emotion recognition methods, with which the model can better understand images and generate controllable stylized captions. The proposed method can automatically judge the emotion contained in the image through the image emotion recognition module, better understand the image content, and control the description through the text style transfer method, thereby generating captions that meet people's expectations. To our knowledge, this is the first work to use both image emotion recognition and text style control.
Funding: Supported by the National Key Research & Development Program (Grant No. 2018YFC0831700) and the National Natural Science Foundation of China (Grant Nos. 61671064 and 61732005).