Abstract
Aiming at the problems of the classic BiLSTM-CRF named entity recognition model, namely long training time, inability to resolve polysemy, and insufficient learning of contextual semantic information, a Chinese named entity recognition model based on BERT-BiGRU-Attention-CRF is proposed. First, the BERT language model is used to pre-train word vectors, compensating for the inability of traditional word-vector models to handle polysemy. Second, a bi-directional gated recurrent unit (BiGRU) neural network layer extracts deep features from the text and computes a prediction score for each label, yielding the hidden state sequence of the sentence. Third, an attention layer weights the word representations and mines the associations between words, producing new prediction scores and a new state sequence. Finally, a conditional random field (CRF) computes the globally optimal solution over the new prediction scores, giving the model's final entity-label predictions. Experiments on the MSRA corpus demonstrate the effectiveness of the proposed model.
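The pipeline described above runs contextual embeddings through a BiGRU, reweights the resulting states with attention, and decodes labels with a CRF. The sketch below illustrates the last two stages in miniature with NumPy, using random toy tensors in place of the BERT/BiGRU outputs: dot-product self-attention produces a new state sequence and new per-label scores, and Viterbi decoding finds the globally optimal label path under a transition matrix. All names, dimensions, and the random parameters are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
T, H, L = 6, 8, 5  # toy sizes: sequence length, hidden dim, number of labels

# Stand-ins for upstream outputs (in the paper: BERT embeddings fed to a BiGRU).
hidden = rng.normal(size=(T, H))   # BiGRU hidden state sequence
W_emit = rng.normal(size=(H, L))   # emission projection: hidden state -> label scores

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(h):
    """Dot-product self-attention: reweight each hidden state by its
    similarity to every other position, modeling word-word associations."""
    scores = h @ h.T / np.sqrt(h.shape[1])  # (T, T) pairwise similarities
    weights = softmax(scores, axis=-1)
    return weights @ h                      # attention-weighted state sequence

def viterbi(emissions, transitions):
    """CRF-style Viterbi decoding: best label sequence under per-token
    emission scores plus pairwise label-transition scores."""
    steps, n_labels = emissions.shape
    dp = emissions[0].copy()                     # best score ending in each label
    back = np.zeros((steps, n_labels), dtype=int)
    for t in range(1, steps):
        cand = dp[:, None] + transitions         # (prev_label, curr_label)
        back[t] = cand.argmax(axis=0)
        dp = cand.max(axis=0) + emissions[t]
    path = [int(dp.argmax())]
    for t in range(steps - 1, 0, -1):            # backtrack the optimal path
        path.append(int(back[t, path[-1]]))
    return path[::-1]

attended = self_attention(hidden)        # new state sequence
emissions = attended @ W_emit            # new per-label prediction scores
transitions = rng.normal(size=(L, L))    # transition scores (learned in a real CRF)
best_path = viterbi(emissions, transitions)
print(best_path)  # one label id per token
```

In a real model the emission and transition scores are learned jointly, and the CRF replaces independent per-token argmax decisions with the globally optimal sequence, which is what lets it enforce valid label orderings.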
Authors
WANG Xuemei (王雪梅); TAO Hongcai (陶宏才)
College of Information Science & Technology, Southwest Jiaotong University, Chengdu 611756, China
Source
Journal of Chengdu University of Information Technology (《成都信息工程大学学报》), 2020, No. 3, pp. 264-270 (7 pages)
Funding
Supported by the National Natural Science Foundation of China (Grant No. 61806170).