Abstract
Named entity recognition (NER) for Mongolian medicine prescriptions plays an important role in building knowledge graphs for Mongolian medicine. NER in this domain faces challenges such as the difficulty of modeling both long- and short-range dependencies and the inaccurate recognition of entity boundaries. To address these challenges, this work constructs MBTC, an NER model for Mongolian medicine prescriptions based on multi-level feature fusion and sequence dependency modeling. MacBERT captures the deep semantics of Mongolian medical texts. A multi-scale convolution and dilation-aware feature module then brings long- and short-range dependencies into the same feature space, addressing the dependency modeling problem. Finally, BiLSTM-CRF joint decoding strengthens label dependencies and corrects entity boundaries, improving recognition accuracy and label consistency. On a Mongolian medicine prescription dataset compiled from classic Mongolian medicine books and authoritative websites and reviewed by Mongolian medicine experts, comparative experiments against seven mainstream baseline models show that MBTC achieves the highest F1 score, 87.7%. The model's generalization ability is also verified on the public People's Daily dataset.
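The idea of bringing short- and long-range context into one feature space can be illustrated with a minimal pure-Python sketch. It is not the MBTC implementation: the abstract does not give kernel sizes, dilation rates, or how the branches are combined, so the parallel-branch layout, the rates (1, 2, 4), and the names `dilated_conv1d`, `branch_span`, and `multi_scale_features` are all assumptions made for illustration.

```python
def dilated_conv1d(xs, kernel, dilation=1):
    """Same-length 1D convolution (cross-correlation) with zero padding.

    Kernel taps are spaced `dilation` positions apart, so a larger
    dilation sees a wider context with the same number of weights.
    """
    center = len(kernel) // 2
    out = []
    for i in range(len(xs)):
        total = 0.0
        for j, w in enumerate(kernel):
            pos = i + (j - center) * dilation
            if 0 <= pos < len(xs):  # positions outside the sequence count as zero
                total += w * xs[pos]
        out.append(total)
    return out


def branch_span(kernel_size, dilation):
    """Number of input positions covered by one dilated convolution."""
    return 1 + (kernel_size - 1) * dilation


def multi_scale_features(xs, kernel, dilations=(1, 2, 4)):
    """Run one branch per dilation rate (rates are assumed, not from the paper)
    and concatenate the branch outputs position by position."""
    branches = [dilated_conv1d(xs, kernel, d) for d in dilations]
    return [tuple(b[i] for b in branches) for i in range(len(xs))]


if __name__ == "__main__":
    signal = [0, 0, 0, 0, 1, 0, 0, 0, 0]  # a single active token at index 4
    for position, feats in enumerate(multi_scale_features(signal, [1, 1, 1])):
        print(position, feats)
```

With a kernel of size 3, the three assumed branches cover 3, 5, and 9 positions respectively, so each position's concatenated feature vector mixes local and more distant context. In a real model these would be learned multi-channel convolutions over MacBERT token embeddings rather than fixed scalar kernels.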
Authors
YANG Yifan
LIU Zhongbo
BAI Qinghai
ZHANG Jun
DIAO Yufeng
ZHOU Yuxin
College of Computer Science and Technology, Inner Mongolia Minzu University, Tongliao 028043, China
Source
Journal of Inner Mongolia Minzu University: Natural Sciences Edition
2025, Issue 5, pp. 51-60 (10 pages)
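Funding
Natural Science Foundation of Inner Mongolia Autonomous Region, General Program (2022MS06028)
Inner Mongolia Autonomous Region Graduate Research Innovation Project (KC2024074S)
Inner Mongolia Minzu University Smart Agriculture and Animal Husbandry Innovation Team Project
Inner Mongolia Minzu University Doctoral Research Start-up Fund (BS438)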
Keywords
entity recognition
Mongolian medicine prescriptions
dilated convolution