Knowledge Graphs (KGs) are pivotal for organizing and managing structured information across a wide range of applications. Financial KGs have been used successfully in applications such as audit, anti-fraud, and anti-money laundering. Despite this success, the construction of Chinese financial KGs has received limited research attention because of the complex semantics involved. A significant challenge is the overlapping-triples problem, in which an entity takes part in multiple relations within a single sentence, which hampers extraction accuracy; more than 39% of the triples in Chinese datasets are overlapping. To address this, we propose the Entity-type-Enriched Cascaded Neural Network (E²CNN), which uses special tokens to mark entity boundaries and types. E²CNN ensures consistency in entity types and excludes specific relations, mitigating the overlapping-triples problem and improving relation extraction. In addition, we introduce and make available FINCORPUS.CN, a Chinese financial dataset annotated from the annual reports of 2,000 companies, containing 48,389 entities and 23,368 triples. Experimental results on the DUIE dataset and FINCORPUS.CN show that E²CNN outperforms state-of-the-art models.
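The two ideas named in the abstract, overlapping triples and special tokens for entity boundaries and types, can be illustrated with a minimal Python sketch. It is not the E²CNN implementation: the example sentence, the triples, the `<ORG>`/`</ORG>` marker format, and the helper names `has_overlap` and `mark_entity` are all assumptions made for illustration only.

```python
from collections import Counter
from typing import List, Tuple

# (subject, relation, object); this layout is an assumption for illustration.
Triple = Tuple[str, str, str]


def has_overlap(triples: List[Triple]) -> bool:
    """Return True if any entity takes part in more than one triple."""
    counts: Counter = Counter()
    for subj, _rel, obj in triples:
        # Count each entity once per triple, even if subj == obj.
        counts.update({subj, obj})
    return any(n > 1 for n in counts.values())


def mark_entity(sentence: str, start: int, end: int, etype: str) -> str:
    """Wrap sentence[start:end] in hypothetical typed boundary tokens."""
    return (
        f"{sentence[:start]}<{etype}>{sentence[start:end]}</{etype}>"
        f"{sentence[end:]}"
    )


# Made-up example sentence and triples (not taken from FINCORPUS.CN or DUIE).
sentence = "Acme Corp acquired Beta Ltd and appointed Jane Doe as CEO."
triples = [
    ("Acme Corp", "acquired", "Beta Ltd"),
    ("Acme Corp", "appointed", "Jane Doe"),
]

# "Acme Corp" appears in both triples, so the sentence has overlapping triples.
overlapping = has_overlap(triples)

# Mark the shared subject with its (assumed) entity type before relation extraction.
marked = mark_entity(sentence, 0, len("Acme Corp"), "ORG")
```

Marking an entity with its type in this way gives a downstream relation classifier the information it would need to discard relations whose argument types do not match, which is the kind of filtering the abstract describes at a high level.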
Funding: Supported in part by the National Key R&D Program of China (Grant No. 2020AAA0108501).