TibetanQA2.0: Dataset with Unanswerable Questions for Tibetan Machine Reading Comprehension (cited by: 1)
Authors: Zhengcuo Dan, Yuan Sun. Data Intelligence, 2024, No. 4, pp. 1158-1167 (10 pages)
How to improve the metacognition abilities of models is a hot topic in Machine Reading Comprehension (MRC), and a dataset with unanswerable questions is an effective way to test these abilities. There are many datasets with unanswerable questions for MRC in Chinese and English, but related research in low-resource languages such as Tibetan is at an early stage. TibetanQA is an extractive dataset with answerable questions for MRC in Tibetan, containing 20,000 question-and-answer pairs and 1,513 articles. However, because it focuses mainly on answerable questions, models trained on it may produce less accurate predictions when they encounter questions that do not have a precise answer in the text. To address this weakness, this paper constructs the Dataset with Unanswerable Questions for Tibetan Machine Reading Comprehension (TibetanQA2.0). The dataset was constructed by crowd workers and contains 505 passages and 1,347 unanswerable question-and-answer pairs. The passages cover six topics: geography, biology, history, nature, culture, and medicine. The dataset's quality was ensured by unifying the use of interrogative particles, correcting spelling and grammar errors, and verifying its compatibility with the Tibetan context.
Keywords: machine reading comprehension; unanswerable; Tibetan; dataset; low-resource language