Entity Linking (EL) aims to automatically link the mentions in unstructured documents to corresponding entities in a knowledge base (KB), and has recently been dominated by global models. Although many global EL methods attempt to model the topical coherence among all linked entities, most of them fail to exploit the correlations among the manifold knowledge helpful for linking, such as the semantics of mentions and their candidates, the neighborhood information of candidate entities in the KB, and the fine-grained type information of entities. As we will show in the paper, interactions among these types of information are very useful for better characterizing the topic features of entities and more accurately estimating the topical coherence among all the referred entities within the same document. In this paper, we present a novel HEterogeneous Graph-based Entity Linker (HEGEL) for global entity linking, which builds an informative heterogeneous graph for every document to collect various linking clues. HEGEL then utilizes a novel heterogeneous graph neural network (HGNN) to integrate the different types of manifold information and model the interactions among them. Experiments on the standard benchmark datasets demonstrate that HEGEL captures the global coherence well and outperforms the prior state-of-the-art EL methods.
In the tobacco industry, insider employee attacks are a thorny problem that is difficult to detect. To solve this issue, this paper proposes an insider threat detection method based on heterogeneous graph embedding. First, the interrelationships between logs are fully considered, and log entries are converted into heterogeneous graphs based on these relationships. Second, heterogeneous graph embedding is adopted, and each log entry is represented as a low-dimensional feature vector. Then, a clustering algorithm separates normal logs and malicious logs into different clusters to identify the malicious logs. Finally, the effectiveness and superiority of the method are verified through experiments on the CERT dataset. The experimental results show that this method performs better than some baseline methods.
Heterogeneous graphs generally refer to graphs with different types of nodes and edges. A common approach for extracting useful information from heterogeneous graphs is to use meta-graphs, which can be seen as a special kind of directed acyclic graph (DAG) with the same node and edge types as the heterogeneous graph. However, designing proper meta-graphs is challenging. Recently, there have been many works on learning suitable meta-graphs from a heterogeneous graph. Existing methods generally introduce continuous weights for edges that are independent of each other, which ignores the topological structures of meta-graphs and can be ineffective. To address this issue, the authors propose a new tensor-based viewpoint on learning meta-graphs. Such a viewpoint not only helps interpret the limitation of existing works through CANDECOMP/PARAFAC (CP) decomposition, but also inspires a topology-aware tensor decomposition, called TENSUS, that reflects the structure of DAGs. The proposed topology-aware tensor decomposition is easy to use and simple to implement, and it can be taken as a plug-in part to upgrade many existing works, including node classification and recommendation on heterogeneous graphs. Experimental results on different tasks demonstrate that the proposed method can significantly improve the state of the art for all these tasks.
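To make the definition above concrete, the following is a minimal sketch of a heterogeneous graph as a typed data structure. It is an illustration only and not code from the paper: the `HeteroGraph` class, the node and edge type names, and the example data are all hypothetical. The traversal at the end follows a meta-path, which is the simplest chain-shaped case of a meta-graph.

```python
from collections import defaultdict


class HeteroGraph:
    """Toy heterogeneous graph: every node and every edge carries a type.

    Hypothetical illustration, not the data structure used by TENSUS.
    """

    def __init__(self):
        self.node_type = {}             # node id -> node type
        self.edges = defaultdict(list)  # (src type, edge type, dst type) -> [(src, dst)]

    def add_node(self, node, ntype):
        self.node_type[node] = ntype

    def add_edge(self, src, etype, dst):
        key = (self.node_type[src], etype, self.node_type[dst])
        self.edges[key].append((src, dst))

    def neighbors(self, node, etype):
        """Destinations reachable from `node` over edges of type `etype`."""
        return [d
                for (_, et, _), pairs in self.edges.items() if et == etype
                for s, d in pairs if s == node]


g = HeteroGraph()
g.add_node("alice", "author")
g.add_node("p1", "paper")
g.add_node("kdd", "venue")
g.add_edge("alice", "writes", "p1")
g.add_edge("p1", "published_in", "kdd")

# Follow the meta-path author -writes-> paper -published_in-> venue.
venues = [v
          for p in g.neighbors("alice", "writes")
          for v in g.neighbors(p, "published_in")]
print(venues)
```

Keying the edge lists by the full (source type, edge type, destination type) triple keeps each relation separate, which is what lets type-specific processing treat relations differently.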
Smart contracts suffer significant losses due to various types of vulnerabilities. However, traditional vulnerability detection methods rely extensively on expert rules, resulting in low detection accuracy and poor adaptability to novel attacks. To address these problems, in this paper, deep learning methods are combined with smart contract vulnerability code detection approaches. Abstract syntax trees (ASTs), which are special isomorphic graph structures, are an important bridge between source code and graph neural networks. By learning the AST, the model can understand the semantics of the source code. Moreover, graph neural networks have an increasing ability to address complex heterogeneous graphs. Therefore, control flow graphs are fused with data flow graphs on the basis of the ASTs to build heterogeneous graphs with richer code semantics. Furthermore, multigranularity analysis of the vulnerability detection results is performed, including coarse-grained contract-level vulnerability detection and fine-grained line-level vulnerability detection. Through this multigranularity detection approach, vulnerabilities in contracts can be identified and analysed more comprehensively, providing a richer perspective and more solutions for vulnerability detection. The experimental results show that the proposed multigranularity vulnerability detection method based on heterogeneous graphs (MVD-HG) improves both the accuracy and range of the detected vulnerability types in contract-level vulnerability detection tasks; moreover, in the line-level vulnerability detection task, the MVD-HG model achieves significant results and addresses the shortcomings of existing methods. In addition, based on code generation methods used in related fields, a data enhancement method based on the source code is developed, which effectively expands the experimental dataset to address the reduced credibility of the results due to insufficient amounts of data.
Automatic text summarization (ATS) plays a significant role in Natural Language Processing (NLP). Abstractive summarization produces summaries by identifying and compressing the most important information in a document. However, relatively few comprehensively evaluated abstractive summarization models work well for specific types of reports, owing to the unstructured, spoken-language character of their text. In particular, Chinese complaint reports, generated by urban complainers and collected by government employees, describe problems that residents face in daily life. The reported problems also require a speedy response. Therefore, automatic summarization tasks for these reports have been developed. However, as with traditional summarization models, the generated summaries still have problems of informativeness and conciseness. To address these issues and generate suitably informative and less redundant summaries, a topic-based abstractive summarization method is proposed to obtain global and local features. Additionally, a heterogeneous graph of the original document is constructed using word-level and topic-level features. Experiments and analyses on public review datasets (Yelp and Amazon) and our constructed dataset (Chinese complaint reports) show that the proposed framework effectively improves the performance of the abstractive summarization model for Chinese complaint reports.
Objective: To construct a dataset of the Treatise on Febrile Diseases (Shang Han Lun, 《伤寒论》) structured as symptom-formula-herb heterogeneous graphs, and to explore an optimal method for learning node attribute representations based on the graph convolutional network (GCN). Methods: Clauses that contain symptoms, formulas, and herbs were abstracted from the Treatise on Febrile Diseases to construct symptom-formula-herb heterogeneous graphs, which were used to propose a node representation learning method based on GCN, the Traditional Chinese Medicine Graph Convolution Network (TCM-GCN). The symptom-formula, symptom-herb, and formula-herb heterogeneous graphs were processed with the TCM-GCN to realize high-order message passing and neighbor aggregation and obtain new node representation attributes, thus acquiring the nodes' sum-aggregations of symptoms, formulas, and herbs to lay a foundation for the downstream tasks of the prediction models. Results: Comparisons among the node representations with multi-hot encoding, non-fusion encoding, and fusion encoding showed that the Precision@10, Recall@10, and F1-score@10 of the fusion encoding were 9.77%, 6.65%, and 8.30% higher, respectively, than those of the non-fusion encoding in the prediction studies of the model. Conclusion: Node representations by fusion encoding achieved comparatively ideal results, indicating that the TCM-GCN is effective in realizing node-level representations of the heterogeneous-graph-structured Treatise on Febrile Diseases dataset and is able to improve the performance of the downstream tasks of the diagnosis model.
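The Precision@10, Recall@10, and F1-score@10 figures reported above are top-k ranking metrics. The sketch below shows one common way to compute them; it is not the authors' evaluation code, and the function name and the placeholder item names (`herb_0`, `herb_1`, ...) are hypothetical.

```python
def precision_recall_f1_at_k(ranked, relevant, k=10):
    """Top-k ranking metrics for a single query.

    ranked:   items ordered from most to least likely (e.g. predicted herbs).
    relevant: set of ground-truth items.
    Precision@k divides hits by k; Recall@k divides hits by the number of
    relevant items; F1@k is the harmonic mean of the two.
    """
    top_k = ranked[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# Hypothetical prediction: 12 ranked candidates, 4 ground-truth items,
# 3 of which appear in the top 10.
ranked = [f"herb_{i}" for i in range(12)]
relevant = {"herb_1", "herb_4", "herb_9", "herb_11"}
p, r, f1 = precision_recall_f1_at_k(ranked, relevant, k=10)
print(p, r, round(f1, 4))
```

In practice these per-query values are averaged over all test queries, although the abstract does not state which averaging scheme was used.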
Distinguishing genuine news from false information is crucial in today’s digital era. Most existing methods are based on either the traditional neural network sequence model or the graph neural network model, which has become more popular in recent years. Of these two types of models, the latter solves the former’s problem of neglecting the correlation among news sentences. However, one layer of a graph neural network only considers the information of nodes directly connected to the current nodes and omits the important information carried by distant nodes. This study therefore proposes the Extendable-to-Global Heterogeneous Graph Attention network (EGHGAT) to manage heterogeneous graphs by extending local attention to global attention, addressing the drawback that local attention can only collect information from directly connected nodes. The shortest distance matrix is computed among all nodes on the graph. Specifically, the shortest distance information is used to enable the current nodes to aggregate information from more distant nodes while considering the influence of different node types on the current nodes in the current network layer. This mechanism highlights the importance of directly or indirectly connected nodes and the effect of different node types on the current nodes, which can substantially enhance the performance of the model. Information from an external knowledge base is used to compare the contextual entity representation with the entity representation of the corresponding knowledge base to capture its consistency with news content. Experimental results on the benchmark dataset reveal that the proposed model significantly outperforms the state-of-the-art approach. Our code is publicly available at https://github.com/gyhhk/EGHGAT_FakeNewsDetection.
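The shortest distance matrix mentioned above can be illustrated with a breadth-first search from every node. This is a generic sketch under the assumption that the graph is unweighted and given as adjacency lists; it is not taken from the EGHGAT repository, and the function name and the toy graph (one news node, two sentence nodes, one entity node) are hypothetical.

```python
from collections import deque


def shortest_distance_matrix(adj):
    """All-pairs hop distances for an unweighted graph.

    adj: dict mapping each node to an iterable of its neighbours
         (every neighbour must also be a key of adj).
    Returns (nodes, dist), where dist[i][j] is the hop count from
    nodes[i] to nodes[j], or None if nodes[j] is unreachable.
    """
    nodes = sorted(adj)
    index = {n: i for i, n in enumerate(nodes)}
    dist = [[None] * len(nodes) for _ in nodes]
    for src in nodes:
        row = dist[index[src]]
        row[index[src]] = 0
        queue = deque([src])
        while queue:  # breadth-first search from src
            u = queue.popleft()
            for v in adj[u]:
                if row[index[v]] is None:
                    row[index[v]] = row[index[u]] + 1
                    queue.append(v)
    return nodes, dist


# Hypothetical toy graph: a news node linked to two sentences,
# one of which mentions an entity.
adj = {
    "news": ["s1", "s2"],
    "s1": ["news", "e1"],
    "s2": ["news"],
    "e1": ["s1"],
}
nodes, dist = shortest_distance_matrix(adj)
for name, row in zip(nodes, dist):
    print(name, row)
```

A matrix like this lets an attention layer look up how far apart any two nodes are, so that nodes several hops away can still contribute, for example with a weight that depends on distance and node type.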
Heterogeneous graphs contain multiple types of entities and relations and are capable of modeling complex interactions. Embedding on heterogeneous graphs has become an essential tool for analyzing and understanding such graphs. Although existing, meticulously designed embedding methods have made progress, they are limited by model design and computational resources, which makes it difficult to scale them to large heterogeneous graph data and hinders their application and adoption. In this paper, we propose Restage, a relation structure-aware hierarchical heterogeneous graph embedding framework. Under this framework, embedding only a smaller-scale graph with existing graph representation learning methods is sufficient to obtain node representations on the original heterogeneous graph. We consider two types of relation structures in heterogeneous graphs: interaction relations and affiliation relations. First, we design a relation structure-aware coarsening method to successively coarsen the original graph to the top-level layer, resulting in a smaller-scale graph. Second, we allow any unsupervised representation learning method to obtain node embeddings on the top-level graph. Finally, we design a relation structure-aware refinement method to successively refine the node embeddings from the top-level graph back to the original graph, obtaining node embeddings on the original graph. Experimental results on three public heterogeneous graph datasets demonstrate that the proposed Restage enhances the scalability of representation learning methods. On another large-scale graph, existing representation learning methods run up to eighteen times faster.
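The coarsen, embed, refine pipeline described above can be sketched in a heavily simplified form. The code below is not Restage: it ignores node and edge types, uses a single coarsening level, takes the node-to-super-node assignment as given, and refines by simply copying each super-node embedding to its members. All names and values are hypothetical stand-ins.

```python
def coarsen(edges, membership):
    """Collapse each node into its super-node; drop self-loops and duplicates."""
    return sorted({(membership[u], membership[v])
                   for u, v in edges
                   if membership[u] != membership[v]})


def refine(super_embedding, membership):
    """Give every original node a copy of its super-node's embedding."""
    return {node: list(super_embedding[s]) for node, s in membership.items()}


# Original graph: a path a - b - c - d.
edges = [("a", "b"), ("b", "c"), ("c", "d")]
# Assumed assignment of nodes to super-nodes (a real method would learn or
# derive this from the relation structure).
membership = {"a": "A", "b": "A", "c": "B", "d": "B"}

coarse_edges = coarsen(edges, membership)

# Stand-in for "any unsupervised representation learning method" run on the
# small graph; the vectors here are made up.
super_embedding = {"A": [1.0, 0.0], "B": [0.0, 1.0]}

node_embedding = refine(super_embedding, membership)
print(coarse_edges)
print(node_embedding["b"], node_embedding["c"])
```

The scalability benefit comes from running the expensive embedding step only on the coarse graph; the quality of the final embeddings then depends on how informative the coarsening and refinement steps are.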
Evidential Document-level Event Factuality Identification (EvDEFI) aims to predict the factual nature of an event and precisely extract evidential sentences from the document. Previous work was usually limited to predicting the factuality of an event with respect to a document and neglected the interpretability of the task. As a more fine-grained and interpretable task, EvDEFI is still at an early stage. The existing model only used shallow similarity calculation to extract evidence and employed simple attention without lexical features, which is quite coarse-grained. Therefore, we propose a novel EvDEFI model named Heterogeneous and Extractive Graph Attention Network (HEGAT), which updates the representations of events and sentences through multi-view graph attention based on tokens and various lexical features at both local and global levels. Experiments on the EB-DEF-v2 corpus demonstrate that the HEGAT model is superior to several competitive baselines and can validate the interpretability of the task.
In the multi-view image localization task, the features of images captured from different views should be fused properly. This paper considers the classification-based image localization problem. We propose the relational graph location network (RGLN) to perform this task. In this network, we propose a heterogeneous graph construction approach for graph classification tasks, which aims to describe the location in a more appropriate way, thereby improving the expression ability of the location representation module. Experiments show that the expression ability of the proposed graph construction approach outperforms the compared methods by a large margin. In addition, the proposed localization method outperforms the compared localization methods by around 1.7% in terms of meter-level accuracy.
Efficient spectrum resource allocation in wireless heterogeneous networks is important for improving the system throughput and guaranteeing the user's Quality-of-Service (QoS). In this paper, we propose an enhanced algorithm for spectrum resource allocation in heterogeneous networks. First, the bandwidth of each user is determined by the user's rate demand and the channel state. Second, graph theory is enhanced and used to improve the spectrum efficiency. Third, the spectrum resource is dynamically split between the macrocell and femtocells as users' conditions change. Our simulation results show that the proposed algorithm improves the system throughput significantly and also guarantees fairness among the users.
To help investors understand the credit status of target corporations and reduce investment risks, the corporate credit rating model has become an important evaluation tool in the financial market. These models are based on statistical learning, machine learning, and deep learning, especially graph neural networks (GNNs). However, we found that only a few models take the hierarchy, heterogeneity, or unlabeled data into account in the actual corporate credit rating process. Therefore, we propose a novel framework named hierarchical heterogeneous graph neural networks (HHGNN), which can fully model the hierarchy of corporate features and the heterogeneity of relationships between corporations. In addition, we design an adversarial learning block to make full use of the rich unlabeled samples in the financial data. Extensive experiments conducted on the public-listed corporate rating dataset show that HHGNN achieves state-of-the-art (SOTA) performance compared with the baseline methods.
Event detection (ED) seeks to recognize event triggers and classify them into predefined event types. Chinese ED is formulated as a character-level task owing to uncertain word boundaries. Prior methods try to incorporate word-level information into characters to enhance their semantics. However, they experience two problems. First, they fail to incorporate word-level information into each character the word encompasses, causing the insufficient word-character interaction problem. Second, they struggle to distinguish events of similar types with limited annotated instances, which is called the event confusing problem. This paper proposes a novel model named Label-Aware Heterogeneous Graph Attention Network (L-HGAT) to address these two problems. Specifically, we first build a heterogeneous graph of two node types and three edge types to maximally preserve word-character interactions, and then deploy a heterogeneous graph attention network to enhance the semantic propagation between characters and words. Furthermore, we design a pushing-away game to enlarge the predicting gap between the ground-truth event type and its confusing counterpart for each character. Experimental results show that our L-HGAT model consistently achieves superior performance over prior competitive methods.
Deep learning methods based on the syntactic dependency tree have achieved great success in Aspect-based Sentiment Analysis (ABSA). However, the accuracy of the dependency parser cannot be guaranteed, which may keep aspect words away from their related opinion words in a dependency tree. Moreover, few models incorporate external affective knowledge for ABSA. We therefore propose a novel architecture that tackles these two limitations while filling the gap in applying heterogeneous graph convolution networks to ABSA. Specifically, we employ affective knowledge as a sentiment node to augment the representation of words. Then, sentiment nodes, which have different attributes, are linked with word nodes through a specific edge to form a heterogeneous graph based on the dependency tree. Finally, we design a multi-level semantic heterogeneous graph convolution network (Semantic-HGCN) to encode the heterogeneous graph for sentiment prediction. Extensive experiments are conducted on the datasets SemEval 2014 Task 4, SemEval 2015 Task 12, SemEval 2016 Task 5, and ACL 14 Twitter. The experimental results show that our method achieves state-of-the-art performance.
Computational prediction of in-hospital mortality in the setting of an intensive care unit can help clinical practitioners to guide care and make early decisions for interventions. As clinical data are complex and varied in their structure and components, continued innovation of modelling strategies is required to identify architectures that can best model outcomes. In this work, we trained a Heterogeneous Graph Model (HGM) on electronic health record (EHR) data and used the resulting embedding vector as additional information added to a Convolutional Neural Network (CNN) model for predicting in-hospital mortality. We show that the additional information provided by including time as a vector in the embedding captured the relationships between medical concepts, lab tests, and diagnoses, which enhanced predictive performance. We found that adding HGM to a CNN model increased the mortality prediction accuracy by up to 4%. This framework serves as a foundation for future experiments involving different EHR data types on important healthcare prediction tasks.
Funding (HEGEL): Supported in part by the National Key R&D Program of China (No. 2020AAA0106600) and the Key Laboratory of Science, Technology and Standard in Press Industry (Key Laboratory of Intelligent Press Media Technology).
Funding (insider threat detection): Supported by the National Natural Science Foundation of China (No. 62203390) and the Science and Technology Project of China Tobacco Zhejiang Industrial Co., Ltd (No. ZJZY2022E004).
Funding (TENSUS): National Key Research and Development Program of China, Grant/Award Number: 2023YFB2903904.
Funding (MVD-HG): Supported by the Major Program of Natural Science Foundation of Zhejiang Province (No. LD22F020002), the National Natural Science Foundation of China (Nos. 62372410, U22B2028), the Zhejiang Provincial Natural Science Foundation of China (No. LZ23F020011), the Fundamental Research Funds for the Provincial Universities of Zhejiang (No. RF-A2023009), and the Key R&D Projects in Zhejiang Province (No. 2021C01117).
Funding (complaint report summarization): Supported by the National Natural Science Foundation of China (52274205) and the Project of Education Department of Liaoning Province (LJKZ0338).
Funding (TCM-GCN): New-Generation Artificial Intelligence-Major Program in the Sci-Tech Innovation 2030 Agenda from the Ministry of Science and Technology of China (2018AAA0102100); Hunan Provincial Department of Education key project (21A0250); The First Class Discipline Open Fund of Hunan University of Traditional Chinese Medicine (2022ZYX08).
Funding (EGHGAT): Supported by the National Natural Science Foundation of Xinjiang Province (Nos. 2022TSYCTD0019 and 2022D01D32).
Abstract: Distinguishing genuine news from false information is crucial in today’s digital era. Most existing methods are based on either the traditional neural network sequence model or the graph neural network model, which has become more popular in recent years. Of these two types of models, the latter solves the former’s problem of neglecting the correlation among news sentences. However, one layer of a graph neural network considers only the information of nodes directly connected to the current node and omits the important information carried by distant nodes. As such, this study proposes the Extendable-to-Global Heterogeneous Graph Attention network (EGHGAT) to handle heterogeneous graphs by extending local attention to global attention, addressing the drawback that local attention can only collect information from directly connected nodes. The shortest distance matrix is computed among all nodes on the graph. Specifically, the shortest distance information is used to enable the current node to aggregate information from more distant nodes while considering the influence of different node types on the current node in the current network layer. This mechanism highlights the importance of directly or indirectly connected nodes and the effect of different node types on the current node, which can substantially enhance the performance of the model. Information from an external knowledge base is used to compare the contextual entity representation with the corresponding knowledge-base entity representation to capture its consistency with the news content. Experimental results on the benchmark dataset reveal that the proposed model significantly outperforms the state-of-the-art approach. Our code is publicly available at https://github.com/gyhhk/EGHGAT_FakeNewsDetection.
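The shortest distance matrix mentioned above can be illustrated with a breadth-first search from every node of an unweighted graph. This is a minimal sketch under that assumption. The function name and the toy news/sentence/entity graph are hypothetical and are not taken from the EGHGAT repository.

```python
from collections import deque


def shortest_distance_matrix(adjacency):
    """All-pairs hop distances for an unweighted graph given as {node: [neighbors]}."""
    nodes = sorted(adjacency)
    index = {node: i for i, node in enumerate(nodes)}
    inf = float("inf")
    dist = [[inf] * len(nodes) for _ in nodes]
    for source in nodes:
        s = index[source]
        dist[s][s] = 0
        queue = deque([source])
        # Standard BFS: the first visit to a node gives its hop distance.
        while queue:
            current = queue.popleft()
            for neighbor in adjacency[current]:
                n = index[neighbor]
                if dist[s][n] == inf:
                    dist[s][n] = dist[s][index[current]] + 1
                    queue.append(neighbor)
    return nodes, dist


# Hypothetical graph: one news node, two sentence nodes, one entity node.
graph = {
    "news": ["sent1", "sent2"],
    "sent1": ["news", "entity1"],
    "sent2": ["news"],
    "entity1": ["sent1"],
}
nodes, dist = shortest_distance_matrix(graph)
```

A model could then use an entry such as `dist[i][j]` to bias or mask attention between nodes that are not directly connected. How EGHGAT does this exactly is not specified in the abstract.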
Funding: Supported by the National Natural Science Foundation of China (Nos. 1876001, 61602003, and 61673020), the National High Technology Research and Development Program (No. 2017YFB1401903), and the Provincial Natural Science Foundation of Anhui Province (No. 1708085QF156).
Abstract: Heterogeneous graphs contain multiple types of entities and relations and are capable of modeling complex interactions. Embedding on heterogeneous graphs has become an essential tool for analyzing and understanding such graphs. Although existing, meticulously designed methods have made progress, they are limited by model design and computational resources, which makes it difficult to scale them to large-scale heterogeneous graph data and hinders their application and adoption. In this paper, we propose Restage, a relation structure-aware hierarchical heterogeneous graph embedding framework. Under this framework, embedding only a smaller-scale graph with existing graph representation learning methods is sufficient to obtain node representations on the original heterogeneous graph. We consider two types of relation structures in heterogeneous graphs: interaction relations and affiliation relations. First, we design a relation structure-aware coarsening method that successively coarsens the original graph up to the top-level layer, resulting in a smaller-scale graph. Second, we allow any unsupervised representation learning method to obtain node embeddings on the top-level graph. Finally, we design a relation structure-aware refinement method that successively refines the node embeddings from the top-level graph back to the original graph, obtaining node embeddings on the original graph. Experimental results on three public heterogeneous graph datasets demonstrate that the proposed Restage enhances the scalability of representation learning methods. On another large-scale graph, the speed of existing representation learning methods is increased by up to eighteen times.
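The three stages described above (coarsen, embed the small graph, refine back) follow a generic multilevel embedding pattern. The sketch below illustrates only that pattern on a toy graph. It is not Restage's actual coarsening or refinement rule: the affiliation mapping, the hand-written coarse embeddings, and the copy-from-parent refinement are all simplifying assumptions.

```python
def coarsen(edges, parent_of):
    """Merge each node into its super node; keep edges between distinct super nodes."""
    coarse_edges = set()
    for u, v in edges:
        cu, cv = parent_of[u], parent_of[v]
        if cu != cv:
            # Store each undirected edge once, in a canonical order.
            coarse_edges.add((min(cu, cv), max(cu, cv)))
    return sorted(coarse_edges)


def refine(coarse_embeddings, parent_of):
    """Project embeddings back: every node inherits a copy of its super node's vector."""
    return {node: list(coarse_embeddings[parent]) for node, parent in parent_of.items()}


# Hypothetical toy data: papers (interaction edges) affiliated with venues.
edges = [("p1", "p2"), ("p2", "p3"), ("p3", "p4")]
parent_of = {"p1": "venueA", "p2": "venueA", "p3": "venueB", "p4": "venueB"}

coarse_edges = coarsen(edges, parent_of)

# Stand-in for any unsupervised embedding method run on the coarse graph.
coarse_embeddings = {"venueA": [1.0, 0.0], "venueB": [0.0, 1.0]}

node_embeddings = refine(coarse_embeddings, parent_of)
```

A real refinement step would typically adjust the inherited vectors using the finer graph's structure rather than copying them unchanged. The copy is used here only to keep the example short.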
Funding: Supported by the National Natural Science Foundation of China (NSFC) (Grant Nos. 62006167 and 62276177) and the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD).
Abstract: Evidential Document-level Event Factuality Identification (EvDEFI) aims to predict the factual nature of an event and to precisely extract evidential sentences from the document. Previous work was usually limited to predicting the factuality of an event with respect to a document and neglected the interpretability of the task. As a more fine-grained and interpretable task, EvDEFI is still at an early stage. The existing model used only shallow similarity calculation to extract evidence and employed simple attention without lexical features, which is quite coarse-grained. Therefore, we propose a novel EvDEFI model named Heterogeneous and Extractive Graph Attention Network (HEGAT), which updates the representations of events and sentences through multi-view graph attention based on tokens and various lexical features at both the local and global levels. Experiments on the EB-DEF-v2 corpus demonstrate that the HEGAT model is superior to several competitive baselines and can validate the interpretability of the task.
Abstract: In the multi-view image localization task, the features of images captured from different views should be fused properly. This paper considers the classification-based image localization problem. We propose the relational graph location network (RGLN) to perform this task. In this network, we propose a heterogeneous graph construction approach for graph classification tasks, which aims to describe the location in a more appropriate way, thereby improving the expression ability of the location representation module. Experiments show that the expression ability of the proposed graph construction approach outperforms the compared methods by a large margin. In addition, the proposed localization method outperforms the compared localization methods by around 1.7% in terms of meter-level accuracy.
Funding: Supported in part by the National Natural Science Foundation (61231008), the Natural Science Foundation of Shannxi Province (2015JQ6248), the National S&T Major Project (2012ZX03003005-005), and the 111 Project (B08038).
Abstract: Efficient spectrum resource allocation in wireless heterogeneous networks is important for improving the system throughput and guaranteeing the user's Quality-of-Service (QoS). In this paper, we propose an enhanced algorithm for spectrum resource allocation in heterogeneous networks. First, the bandwidth of each user is determined by the user's rate demand and the channel state. Second, graph theory is enhanced and used to improve the spectrum efficiency. Third, the spectrum resource is dynamically split between the macrocell and the femtocells as users' conditions change. Our simulation results show that the proposed algorithm improves the system throughput significantly and also guarantees fairness among the users.
Abstract: In order to help investors understand the credit status of target corporations and reduce investment risks, the corporate credit rating model has become an important evaluation tool in the financial market. These models are based on statistical learning, machine learning, and deep learning, especially graph neural networks (GNNs). However, we found that only a few models take the hierarchy, heterogeneity, or unlabeled data into account in the actual corporate credit rating process. Therefore, we propose a novel framework named hierarchical heterogeneous graph neural networks (HHGNN), which can fully model the hierarchy of corporate features and the heterogeneity of relationships between corporations. In addition, we design an adversarial learning block to make full use of the rich unlabeled samples in the financial data. Extensive experiments conducted on the public-listed corporate rating dataset show that HHGNN achieves state-of-the-art (SOTA) performance compared with the baseline methods.
Funding: This work was supported by the National Key Research and Development Program of China under Grant No. 2021YFB3100600, the Youth Innovation Promotion Association of Chinese Academy of Sciences under Grant No. 2021153, and the State Key Program of National Natural Science Foundation of China under Grant No. U2336202.
Abstract: Event detection (ED) seeks to recognize event triggers and classify them into predefined event types. Chinese ED is formulated as a character-level task owing to uncertain word boundaries. Prior methods try to incorporate word-level information into characters to enhance their semantics. However, they suffer from two problems. First, they fail to incorporate word-level information into each character that the word encompasses, causing the insufficient word-character interaction problem. Second, they struggle to distinguish events of similar types with limited annotated instances, which is called the event confusing problem. This paper proposes a novel model named Label-Aware Heterogeneous Graph Attention Network (L-HGAT) to address these two problems. Specifically, we first build a heterogeneous graph of two node types and three edge types to maximally preserve word-character interactions, and then deploy a heterogeneous graph attention network to enhance the semantic propagation between characters and words. Furthermore, we design a pushing-away game to enlarge the predicting gap between the ground-truth event type and its confusing counterpart for each character. Experimental results show that our L-HGAT model consistently achieves superior performance over prior competitive methods.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62276073, 61966004); Guangxi Natural Science Foundation (No. 2019GXNSFDA245018); Innovation Project of Guangxi Graduate Education (No. YCSW2022155); Guangxi “Bagui Scholar” Teams for Innovation and Research Project; Guangxi Collaborative Innovation Center of Multi-source Information Integration and Intelligent Processing.
Abstract: Deep learning methods based on the syntactic dependency tree have achieved great success in Aspect-based Sentiment Analysis (ABSA). However, the accuracy of the dependency parser cannot be guaranteed, which may keep aspect words away from their related opinion words in a dependency tree. Moreover, few models incorporate external affective knowledge for ABSA. We therefore propose a novel architecture that tackles these two limitations while filling the gap in applying heterogeneous graph convolution networks to ABSA. Specifically, we employ affective knowledge as a sentiment node to augment the representation of words. Then, we link sentiment nodes, which have different attributes, with word nodes through a specific edge to form a heterogeneous graph based on the dependency tree. Finally, we design a multi-level semantic heterogeneous graph convolution network (Semantic-HGCN) to encode the heterogeneous graph for sentiment prediction. Extensive experiments are conducted on the datasets SemEval 2014 Task 4, SemEval 2015 Task 12, SemEval 2016 Task 5, and ACL 14 Twitter. The experimental results show that our method achieves state-of-the-art performance.
Abstract: Computational prediction of in-hospital mortality in the setting of an intensive care unit can help clinical practitioners guide care and make early decisions about interventions. As clinical data are complex and varied in their structure and components, continued innovation in modelling strategies is required to identify architectures that can best model outcomes. In this work, we trained a Heterogeneous Graph Model (HGM) on electronic health record (EHR) data and used the resulting embedding vector as additional information for a Convolutional Neural Network (CNN) model that predicts in-hospital mortality. We show that the additional information provided by including time as a vector in the embedding captured the relationships between medical concepts, lab tests, and diagnoses, which enhanced predictive performance. We found that adding the HGM to a CNN model increased the mortality prediction accuracy by up to 4%. This framework serves as a foundation for future experiments involving different EHR data types on important healthcare prediction tasks.