Abstract
To address the heavy reliance of traditional named entity recognition methods on large numbers of hand-crafted features, which leads to insufficient text feature representation, this paper proposes a named entity recognition method based on the Seq2Seq model. First, the BERT pre-trained model dynamically generates semantic vectors for characters. Second, the encoder of the Seq2Seq model encodes the character vectors, and an attention mechanism assigns weights to words so as to capture both local and global features of the text. Finally, the resulting features are fed into the decoder, and a softmax layer predicts the sequence labels. Experimental results show that the method improves accuracy, recall, and F1 score and has good applicability.
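The attention and softmax steps described in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the abstract does not specify the attention scoring function or the tag set, so dot-product attention and the BIO-style labels below are assumptions, and the toy two-dimensional vectors merely stand in for encoder outputs computed from BERT character vectors. The helper names (`attend`, `predict_label`) are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attend(query, encoder_states):
    """Dot-product attention (an assumed scoring function).

    Returns the attention weights over the encoder states and the
    weighted-sum context vector.
    """
    weights = softmax([dot(query, h) for h in encoder_states])
    dim = len(encoder_states[0])
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


def predict_label(context, label_weights, labels):
    """Linear projection of the context vector, then softmax over labels."""
    probs = softmax([dot(context, w) for w in label_weights])
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs


# Toy stand-ins for encoder outputs (one vector per input character).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Toy decoder state used as the attention query.
query = [1.0, 0.0]
# Assumed BIO-style tag set and a toy projection matrix (one row per label).
labels = ["B-PER", "I-PER", "O"]
label_weights = [[2.0, 0.0], [0.0, 2.0], [0.5, 0.5]]

weights, context = attend(query, encoder_states)
label, probs = predict_label(context, label_weights, labels)
print(weights, label)
```

In a full model the decoder would repeat this step for every output position, with the query taken from the decoder's current hidden state and the projection weights learned during training.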
Authors
王卫红
冯倩
吕红燕
曹玉辉
WANG Weihong; FENG Qian; LV Hongyan; CAO Yuhui (School of Information Technology, Hebei University of Economics and Business, Shijiazhuang 050061, China)
Source
《智能计算机与应用》
2020, Issue 7, pp. 141-146 (6 pages)
Intelligent Computer and Applications
Funding
Merit-Based Funding Program for Returned Overseas Scholars (C2015003042)
Natural Science Foundation of Hebei Province, Youth Program (F2015207009)