Abstract
To further improve entity relation extraction, the traditional annotation scheme is revised and a joint extraction model is built that combines a pre-trained language model with neural networks. The input text is encoded with RoBERTa (robustly optimized BERT approach) and Bi-LSTM (bi-directional long short-term memory) to model contextual information; entities are then identified with a CRF (conditional random field), and relations are classified with an LSTM (long short-term memory) network. Ablation and comparison experiments on the Chinese dataset DuIE show that the model reaches an F1 score of 77.1% and a precision of 78.3%, which are 1.3% and 2.6% higher, respectively, than those of the current mainstream model FETI. The experimental results confirm the advantages of the model.
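The abstract reports F1 and precision but not recall. As a minimal sketch, assuming the reported F1 is the usual harmonic mean of precision and recall computed on the same evaluation set (the abstract does not state this), the implied recall can be estimated at about 75.9%. The function names below are illustrative and are not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def implied_recall(precision: float, f1: float) -> float:
    """Solve F1 = 2PR / (P + R) for R, given P and F1."""
    denom = 2 * precision - f1
    if denom <= 0:
        raise ValueError("F1 must be smaller than 2 * precision")
    return f1 * precision / denom


# Figures reported in the abstract, written as fractions.
P, F1 = 0.783, 0.771

# Assumption-based estimate, not a result reported by the authors.
R = implied_recall(P, F1)
print(f"implied recall = {R:.3f}")
```

This is only a consistency check on the two published figures; the recall actually measured by the authors may differ if the metrics were averaged differently.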
Authors
DENG Cheng-ru (邓成汝)
LING Jie (凌捷)
School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China
Source
Computer Engineering and Design (《计算机工程与设计》)
Peking University Core Journal (北大核心)
2023, Issue 7, pp. 2023-2029 (7 pages)
Funding
Key-Area Research and Development Program of Guangdong Province (2019B010139002)
Key-Area Research and Development Program of Guangzhou (202007010004)