Funding: Funded by the National Key Technology R&D Program of China under Grant No. 2021YFD2100605, the National Natural Science Foundation of China under Grant No. 62433002, the Project of Construction and Support for High-Level Innovative Teams of Beijing Municipal Institutions under Grant No. BPHR20220104, and the Beijing Scholars Program under Grant No. 099.
Abstract: Entity relation extraction, a fundamental and essential task in natural language processing (NLP), has long attracted significant attention. It aims to extract the core of semantic knowledge from unstructured text, i.e., entities and the relations between them. At present, the main difficulties in Chinese entity relation extraction research are nested entities, relation overlap, and a lack of interaction between entities and relations. These difficulties are particularly prominent in complex knowledge extraction tasks with high knowledge density, imprecise syntactic structure, and missing semantic roles. To address these challenges, this paper presents a “character-level” Chinese part-of-speech (CN-POS) tagging approach and incorporates part-of-speech (POS) information into the pre-trained model, aiming to improve its semantic understanding and its ability to process syntactic information. Additionally, a relation reference filling (RF) mechanism is proposed to enhance the semantic interaction between relations and entities, use relations to guide entity modeling, improve the entity model's boundary prediction for nested entities, and increase the cascading accuracy of entity-relation triples. Meanwhile, a “Queue” sub-task connection strategy is adopted to alleviate triplet cascading errors caused by overlapping relations, and a syntax-enhanced entity relation extraction model (SE-RE) is constructed. The model showed excellent performance on the E-commerce Product Information (EPI) dataset constructed for this work. The results demonstrate that integrating POS enhancement into the pre-trained encoding model significantly improves the performance of entity relation extraction models compared with baseline methods. Specifically, the F1-score fluctuation in subtasks caused by error accumulation was reduced by 3.21%, while the F1-score for entity-relation triplet extraction improved by 1.91%.
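The abstract does not describe how the “character-level” CN-POS tags are produced. One plausible reading, offered here only as an illustrative sketch, is that word-level POS tags from a segmenter are spread over the characters of each word so that they align with a character-based pre-trained encoder. The function name `char_level_pos`, the `B-`/`I-` prefix scheme, and the sample sentence below are assumptions for illustration, not the authors' method.

```python
from typing import List, Tuple


def char_level_pos(words: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Expand word-level POS tags to one tag per character.

    Assumed scheme: the first character of a word gets "B-<pos>",
    every following character gets "I-<pos>".
    """
    tagged = []
    for word, pos in words:
        for i, ch in enumerate(word):
            prefix = "B" if i == 0 else "I"
            tagged.append((ch, f"{prefix}-{pos}"))
    return tagged


# Hypothetical segmented and POS-tagged input (n = noun, v = verb, a = adjective).
sentence = [("华为", "n"), ("发布", "v"), ("新", "a"), ("手机", "n")]
chars = char_level_pos(sentence)

for ch, tag in chars:
    print(ch, tag)
```

Each character then carries a tag that could be embedded and combined with the encoder's character representation. How SE-RE actually fuses POS information is not stated in the abstract.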