Fine Tuning Language Models: A Tale of Two Low-Resource Languages (Cited: 1)
Authors: Rosel Oida-Onesa, Melvin A. Ballera. Data Intelligence, 2024, Issue 4, pp. 946-967 (22 pages).
Abstract: Creating a parallel corpus for machine translation is a challenging and time-consuming task, especially in a linguistically diverse country like the Philippines, with 185 languages. Although a wealth of text is available, annotated data is scarce, particularly for languages like Bikol. Bikol is one of the major languages in the Philippines; however, its underrepresentation in the digital sphere is attributed to the absence of annotated data. This study outlines the development process of BFParCo, a proposed gold-standard dataset for the Bikol and Filipino parallel corpus. The corpus was refined through manual phrase alignment, translation, and evaluation. Subsequently, T5 and mT5 transformer models were fine-tuned on the parallel corpus and evaluated using the Bilingual Evaluation Understudy (BLEU) metric. The results showed a notable improvement in BLEU score after fine-tuning, with an increase of 60.68 for BIK→FIL and 58.93 for FIL→BIK translations. Additionally, human evaluators comprehensively assessed the fine-tuned models' outputs using the Multidimensional Quality Metrics and Scalar Quality Metrics error taxonomies. The fine-tuned models were then made publicly accessible through Hugging Face. This study represents a significant stride in advancing machine translation tools for the Bikol and Filipino languages.
Keywords: natural language processing, language models, transfer learning, fine-tuning, low-resource language, Bikol, Filipino
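The pipeline the abstract describes (fine-tuning a pretrained mT5 checkpoint on a parallel corpus, then scoring translations with BLEU) can be illustrated with the Hugging Face transformers API. The following is a minimal sketch, not the authors' released code: the base checkpoint google/mt5-small, the toy sentence pairs, and all hyperparameters are illustrative assumptions, and BFParCo itself is not reproduced here.

```python
import torch
from torch.utils.data import DataLoader
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Toy stand-in for the BIK->FIL side of a parallel corpus; these pairs are
# illustrative only and are not drawn from BFParCo.
pairs = [
    {"src": "Marhay na aga.", "tgt": "Magandang umaga."},
    {"src": "Dios mabalos saimo.", "tgt": "Maraming salamat sa iyo."},
]

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-small")

def collate(batch):
    # Encode source sentences; target token ids become the training labels.
    enc = tokenizer([b["src"] for b in batch], padding=True, return_tensors="pt")
    labels = tokenizer([b["tgt"] for b in batch], padding=True,
                       return_tensors="pt").input_ids
    labels[labels == tokenizer.pad_token_id] = -100  # mask padding in the loss
    enc["labels"] = labels
    return enc

loader = DataLoader(pairs, batch_size=2, shuffle=True, collate_fn=collate)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)  # assumed learning rate

model.train()
for epoch in range(3):  # epoch count is arbitrary for this sketch
    for batch in loader:
        loss = model(**batch).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

# Score a decoded hypothesis with corpus-level BLEU, the metric the abstract reports.
import sacrebleu

model.eval()
inputs = tokenizer("Marhay na aga.", return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=32)
hypothesis = tokenizer.decode(out[0], skip_special_tokens=True)
print(sacrebleu.corpus_bleu([hypothesis], [["Magandang umaga."]]).score)
```

With only two pairs and three epochs the output will be poor; the point is the shape of the loop: sequence-to-sequence loss on tokenized source/target pairs, then corpus-level BLEU on decoded hypotheses, mirroring the BIK→FIL evaluation direction reported above.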