Funding: Supported in part by the National Natural Science Foundation of China (No. 62221005), the National Key Research and Development Program of China (No. 2021YFF0704101, No. 2020YFC2003502), the National Natural Science Foundation of China (No. 61876201), the Natural Science Foundation of Chongqing (No. cstc2019jcyj-cxtt X0002, No. cstc2021ycjh-bgzxm0013), and the Key Cooperation Project of Chongqing Municipal Education Commission (HZ2021008).
Abstract: Causality extraction has become a crucial task in natural language processing and knowledge graphs. However, most existing methods divide causality extraction into two subtasks: extraction of candidate causal pairs and classification of causality. This two-stage design leads to cascading errors and the loss of associated contextual information. Therefore, in this study, an End-to-end Multi-Granulation Causality Extraction model (EMGCE), based on graph theory, is proposed to extract explicit causality and directly mine implicit causality. First, the sentences are represented on different granulation layers: the character, word, and contextual string layers. The word layer is further refined into three layers: the word-index, word-embedding, and word-position-embedding layers. Then, a granular causality tree of the dataset is built based on the word-index layer. Next, an improved tagREtriplet algorithm is designed to obtain the labeled causality from the granular causality tree, which transforms the task into a sequence labeling task. Subsequently, the multi-granulation semantic representation is fed into a neural network model to extract causality. Finally, experimental results on the extended public SemEval 2010 Task 8 dataset demonstrate that EMGCE is effective.
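To illustrate what casting causality extraction as sequence labeling can look like, the following is a minimal sketch only. It is not the paper's tagREtriplet algorithm, whose details are not given in the abstract. The function name `tag_causality`, the BIO-style tag set (`B-C`/`I-C` for cause, `B-E`/`I-E` for effect, `O` for other), and the example sentence are all assumptions introduced here for illustration.

```python
from typing import List, Tuple

# (start, end) token indices, end exclusive. Hypothetical representation.
Span = Tuple[int, int]


def tag_causality(tokens: List[str], cause: Span, effect: Span) -> List[str]:
    """Label each token with an assumed BIO-style tag for cause (C) or effect (E)."""
    tags = ["O"] * len(tokens)
    for label, (start, end) in (("C", cause), ("E", effect)):
        # Reject spans that fall outside the sentence or are empty.
        if not 0 <= start < end <= len(tokens):
            raise ValueError(f"span {(start, end)} is out of range")
        for i in range(start, end):
            # A token may belong to at most one span in this simplified scheme.
            if tags[i] != "O":
                raise ValueError("cause and effect spans overlap")
            tags[i] = ("B-" if i == start else "I-") + label
    return tags


# Illustrative sentence, not taken from the SemEval 2010 Task 8 dataset.
tokens = "The heavy rain caused severe flooding".split()
tags = tag_causality(tokens, cause=(1, 3), effect=(4, 6))
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

Once every sentence is paired with a tag sequence of this kind, a single tagging model can predict cause and effect spans jointly, which is the general motivation for avoiding a separate candidate-pair extraction stage followed by classification.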