Abstract
Data management for Mongolian corpora still has problems: much of the metadata is missing or incomplete, and formats differ from one corpus to another. This hinders the management and upgrading of the corpora, and it will become an especially serious obstacle when Mongolian corpus resources are shared over the network. To improve the management and use of Mongolian corpora, the author proposes organizing and managing them with XML and designs a set of markup elements intended to fit the features of the Mongolian language.
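As a rough illustration of what organizing corpus text with XML can look like, the sketch below wraps a few sentences in a record that carries its own metadata header. The element and attribute names (`text`, `header`, `title`, `source`, `body`, `s`, `id`, `n`) and the helper `build_corpus_text` are hypothetical choices made for this example; they are not the markup set designed in the article, and the sample sentences are placeholders.

```python
import xml.etree.ElementTree as ET


def build_corpus_text(text_id, title, source, sentences):
    """Wrap plain sentences in a small XML record with a metadata header.

    All element names here are hypothetical, chosen only for illustration.
    """
    # Root element carries an identifier and the ISO 639-1 language code.
    root = ET.Element("text", attrib={"id": text_id, "lang": "mn"})

    # Metadata lives in its own header, so every record is self-describing.
    header = ET.SubElement(root, "header")
    ET.SubElement(header, "title").text = title
    ET.SubElement(header, "source").text = source

    # Each sentence becomes a numbered <s> element inside the body.
    body = ET.SubElement(root, "body")
    for n, sentence in enumerate(sentences, start=1):
        s = ET.SubElement(body, "s", attrib={"n": str(n)})
        s.text = sentence
    return root


# Placeholder sample sentences (Cyrillic Mongolian), not taken from any corpus.
record = build_corpus_text(
    "mn0001",
    "Sample text",
    "hypothetical source",
    ["Сайн байна уу?", "Баярлалаа."],
)
xml_string = ET.tostring(record, encoding="unicode")
print(xml_string)
```

Keeping the metadata inside each record in a fixed place is what lets differently sourced corpora be validated and merged with the same tools, which is the kind of uniformity the abstract argues for.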
Source
《内蒙古大学学报(哲学社会科学版)》
CSSCI
Peking University Core Journals
2006, No. 1, pp. 13-16 (4 pages)
Journal of Inner Mongolia University(Philosophy and Social Sciences)
Funding
National 863 Program (Grant No. 2003AA115510)
National Natural Science Foundation of China (Grant No. 36963005)