Learning discriminative representations with deep neural networks often relies on massive labeled data, which is expensive and difficult to obtain in many real scenarios. As an alternative, self-supervised learning, which uses the input itself as supervision, is strongly preferred for its rapidly improving performance on visual representation learning. This paper introduces a contrastive self-supervised framework for learning generalizable representations from synthetic data, which can be obtained easily and with complete controllability. Specifically, we propose to optimize a contrastive learning task and a physical property prediction task simultaneously. Given a synthetic scene, the first task aims to maximize agreement between a pair of synthetic images generated by our proposed view sampling module, while the second task aims to predict three physical property maps: depth, instance contour, and surface normal maps. In addition, a feature-level domain adaptation technique with adversarial training is applied to reduce the domain gap between real and synthetic data. Experiments demonstrate that our proposed method achieves state-of-the-art performance on several visual recognition datasets.
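The phrase "maximize agreement between a pair of synthetic images" is commonly realized with an InfoNCE-style objective. The sketch below is a minimal NumPy illustration under that assumption; the abstract does not state which loss the paper uses, and the function name `info_nce_loss` and the `temperature` parameter are hypothetical.

```python
import numpy as np

def info_nce_loss(z_a, z_b, temperature=0.1):
    """InfoNCE loss for a batch of paired embeddings (illustrative only).

    Row i of z_a and row i of z_b are assumed to be two views of the same
    scene (a positive pair); every other row of z_b acts as a negative.
    """
    # L2-normalize so that dot products are cosine similarities.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature                    # (N, N) similarities
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positive pairs sit on the diagonal of the similarity matrix.
    return float(-np.mean(np.diag(log_prob)))
```

The loss is small when each embedding is closest to its own paired view and grows when the pairing is broken, which is the sense in which minimizing it "maximizes agreement".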
Objective To construct a symptom-formula-herb heterogeneous graph dataset from the Treatise on Febrile Diseases (Shang Han Lun,《伤寒论》) and to explore an optimal method for learning node attribute representations based on a graph convolutional network (GCN). Methods Clauses that contain symptoms, formulas, and herbs were extracted from the Treatise on Febrile Diseases to construct symptom-formula-herb heterogeneous graphs, which were used to propose a GCN-based node representation learning method, the Traditional Chinese Medicine Graph Convolution Network (TCM-GCN). The symptom-formula, symptom-herb, and formula-herb heterogeneous graphs were processed with the TCM-GCN to perform high-order message passing and neighbor aggregation, yielding new node representation attributes and thus the sum-aggregations of symptom, formula, and herb nodes, which lay a foundation for the downstream tasks of the prediction models. Results Comparisons among node representations with multi-hot encoding, non-fusion encoding, and fusion encoding showed that the Precision@10, Recall@10, and F1-score@10 of the fusion encoding were 9.77%, 6.65%, and 8.30% higher, respectively, than those of the non-fusion encoding in the prediction studies of the model. Conclusion Node representations by fusion encoding achieved comparatively ideal results, indicating that the TCM-GCN is effective in realizing node-level representations of the heterogeneous graph structured Treatise on Febrile Diseases dataset and is able to improve the performance of the downstream tasks of the diagnosis model.
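To make "message passing and neighbor aggregation" concrete, the following is a minimal sketch of one standard graph convolution layer with symmetric normalization. It is not the TCM-GCN itself, whose exact aggregation rule is not given in the abstract; the function name `gcn_layer` and the toy inputs are assumptions for illustration.

```python
import numpy as np

def gcn_layer(adjacency, features, weights):
    """One standard GCN propagation step (illustrative, not TCM-GCN).

    Computes ReLU(D^-1/2 (A + I) D^-1/2 X W), so that each node's new
    representation mixes its own features with those of its neighbors.
    """
    a_hat = adjacency + np.eye(adjacency.shape[0])   # add self-loops
    deg = a_hat.sum(axis=1)                          # node degrees incl. self-loop
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ features @ weights, 0.0)
```

Stacking several such layers lets information travel over multiple hops, for example from a symptom node through a formula node to an herb node.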
Traditional malware research focuses mainly on recognition and detection, without addressing propagation trends or predicting which nodes will be infected next. The complexity of network structure, the diversity of network nodes, and the sparsity of data all make propagation hard to predict. This paper proposes a malware propagation prediction model based on representation learning and Graph Convolutional Networks (GCN) to address these problems. First, to address the inaccurate calculation of infection intensity caused by sparse node interaction data in the malware propagation network, a tensor-based mechanism for mining the infection intensity among nodes is proposed, which retains the network structure information. The influence of the relationships between nodes on the infection intensity is also analyzed. Second, given the diversity and complexity of the content and structure of infected and normal nodes in the network, and considering the advantages of representation learning in feature extraction, a corresponding representation learning method is adopted for the characteristics of infection intensity among nodes. This makes it possible to efficiently compute the relationships between entities and relations in a low-dimensional space, achieving a low-dimensional, dense, and real-valued representation of the propagation spatial data. We also design a new method, Tensor2vec, to learn the latent structural features of malware propagation. Finally, considering the ability of GCN to convolve over non-Euclidean data, we propose a dynamic prediction model of malware propagation based on representation learning and GCN to address the timeliness problem of the malware propagation carrier. The experimental results show that the proposed model can effectively predict the behavior of nodes in the network and reveal the influence of different node characteristics on malware propagation.
Computed Tomography (CT) reconstruction is essential in medical imaging and other engineering fields. However, blurring of the projections during CT imaging can lead to artifacts in the reconstructed images. Projection blur arises from a combination of factors such as larger ray sources, scattering, and imaging system vibration. To address the problem, we propose DeblurTomo, a novel self-supervised learning-based deblurring and reconstruction algorithm that efficiently reconstructs sharp CT images from blurry input without needing external data or blur measurement. Specifically, we construct a coordinate-based implicit neural representation reconstruction network, which maps coordinates to attenuation coefficients in the reconstructed space for more convenient ray representation. Then, we model the blur as a weighted sum of offset rays and design the Ray Correction Network (RCN) and Weight Proposal Network (WPN) to fit these rays and their weights using multi-view consistency and geometric information, thereby extending 2D deblurring to 3D space. In the training phase, we use the blurry input as the supervision signal to optimize the reconstruction network, the RCN, and the WPN simultaneously. Extensive experiments on a widely used synthetic dataset show that DeblurTomo performs best in limited-angle and sparse-view settings under simulated blur. Further experiments on real datasets demonstrate the superiority of our method in practical scenarios.
Satellite communication provides crucial connectivity for low-altitude aircraft, addressing coverage gaps in ground-based internet and aerial communication blind spots. Achieving this capability requires robust anomaly detection methods. However, existing approaches struggle to capture the complex spatio-temporal relationships in space-air communications due to challenges such as unpredictable network topology from real-time route adjustments, the absence of predefined communication link patterns, and the rapid evolution of spatio-temporal relationships among moving nodes. To address these challenges, this paper proposes a graph representation anomaly detection framework tailored for communication networks between low Earth orbit (LEO) satellite constellations and low-altitude aircraft. The spatio-temporal relationships between satellites and aircraft are modeled using dynamic graph structures, which capture the 3D location of each node and the anomaly characteristics of each link. The proposed framework is compatible with diverse anomaly detection learning algorithms, and several static and dynamic detection algorithms are evaluated and compared in this work. Furthermore, the authors present an improved variant of a transformer-based anomaly detection framework for dynamic graphs (TADDY), which involves temporal periodic encoding, semi-supervised comparative learning, and multiscale graph attention mechanisms. We evaluate the framework through simulated scenarios, comparing Walker and broken-chain constellations with varying air traffic densities in a specific region. Anomaly detection performance is evaluated at different anomaly ratios and air traffic densities. Experimental results demonstrate that the performance of static anomaly detection methods and of the original TADDY algorithm degrades significantly as air traffic density increases. Meanwhile, the proposed improved TADDY achieves an average AUC of 0.95 for the Walker constellation and 0.86 for the broken-chain constellation, outperforming the original TADDY in both accuracy and reliability under high anomaly rates. Finally, sensitivity analysis and ablation studies confirm the framework's high responsiveness to anomalies such as abrupt topological changes, offering an efficient solution for ensuring the reliability of large-scale satellite-aircraft communication systems.
With the popularity of online learning in educational settings, knowledge tracing (KT) plays an increasingly significant role. The task of KT is to help students learn more effectively by predicting their future mastery of knowledge based on their historical exercise sequences. Many related works have emerged in this field, such as Bayesian knowledge tracing and deep knowledge tracing methods. Despite the progress that has been made in KT, existing techniques still have the following limitations: 1) Previous studies address KT by exploring only the sparse observational data distribution, and the counterfactual data distribution has been largely ignored. 2) Current works designed for KT consider either the entity relationships between questions and concepts or the relations between two concepts, and none of them investigates the relations among students, questions, and concepts simultaneously, leading to inaccurate student modeling. To address these limitations, we propose a graph counterfactual augmentation method for knowledge tracing. Concretely, to consider the multiple relationships among different entities, we first unify students, questions, and concepts in graphs, and then leverage a heterogeneous graph convolutional network to conduct representation learning. To model the counterfactual world, we conduct counterfactual transformations on students' learning graphs by changing the corresponding treatments and then exploit the counterfactual outcomes in a contrastive learning framework. We conduct extensive experiments on three real-world datasets, and the experimental results demonstrate the superiority of our proposed Graph CA method compared with several state-of-the-art baselines.
In view of the low interpretability of existing collaborative filtering recommendation algorithms and the difficulty of extracting information in content-based recommendation algorithms, we propose an efficient KGRS model. KGRS first obtains reasoning paths from the knowledge graph and embeds the entities on those paths into vectors using the knowledge representation learning algorithm TransD. It then uses an LSTM and a soft attention mechanism to capture the semantics of each reasoning path, and applies convolution and pooling operations to distinguish the importance of different reasoning paths. Finally, a fully connected layer and a sigmoid function produce the predicted ratings, and the items are sorted by predicted rating to obtain the user's recommendation list. KGRS is tested on the MovieLens-100k dataset. Compared with representative related algorithms, including the state-of-the-art interpretable recommendation models RKGE and RippleNet, the experimental results show that KGRS offers good recommendation interpretability and higher recommendation accuracy.
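As a rough illustration of the TransD embedding step mentioned above, the sketch below scores a (head, relation, tail) triple using TransD's dynamic projection. It assumes entity and relation vectors share the same dimension, which simplifies the projection matrix; the function names are hypothetical and this is not the KGRS implementation.

```python
import numpy as np

def transd_project(entity, entity_proj, relation_proj):
    """Project an entity into the relation space, TransD style.

    With equal entity and relation dimensions, the mapping matrix
    M = r_p e_p^T + I applied to e reduces to e + (e_p . e) * r_p.
    """
    return entity + np.dot(entity_proj, entity) * relation_proj

def transd_distance(h, h_p, r, r_p, t, t_p):
    """Squared distance ||h_perp + r - t_perp||^2; lower means more plausible."""
    h_perp = transd_project(h, h_p, r_p)
    t_perp = transd_project(t, t_p, r_p)
    return float(np.sum((h_perp + r - t_perp) ** 2))
```

During training, embeddings are adjusted so that observed triples receive a smaller distance than corrupted ones; the resulting entity vectors are what a path encoder such as an LSTM would consume.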
Few-shot learning has emerged as a crucial technique for coral species classification, addressing the challenge of limited labeled data in underwater environments. This study introduces an optimized few-shot learning model that enhances classification accuracy while minimizing reliance on extensive data collection. The proposed model integrates a hybrid similarity measure combining Euclidean distance and cosine similarity, effectively capturing both feature magnitude and directional relationships. This approach achieves an accuracy of 71.8% under a 5-way 5-shot evaluation, outperforming state-of-the-art models such as Prototypical Networks, FEAT, and ESPT by up to 10%. Notably, the model demonstrates high precision in classifying Siderastreidae (87.52%) and Fungiidae (88.95%), underscoring its effectiveness in distinguishing subtle morphological differences. To further enhance performance, we incorporate a self-supervised learning mechanism based on contrastive learning, enabling the model to extract robust representations by leveraging local structural patterns in corals. This enhancement significantly improves classification accuracy, particularly for species with high intra-class variation, leading to an overall accuracy of 76.52% under a 5-way 10-shot evaluation. Additionally, the model exploits the repetitive structures inherent in corals, introducing a local feature aggregation strategy that refines classification through spatial information integration. Beyond its technical contributions, this study presents a scalable and efficient approach for automated coral reef monitoring, reducing annotation costs while maintaining high classification accuracy. By improving few-shot learning performance in underwater environments, our model enhances monitoring accuracy by up to 15% compared to traditional methods, offering a practical solution for large-scale coral conservation efforts.
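One way to read the "hybrid similarity measure combining Euclidean distance and cosine similarity" is as a weighted score against class prototypes. The sketch below assumes a simple linear mix controlled by a hypothetical weight `alpha`; the abstract does not specify how the paper combines the two terms, so the formula and all names here are illustrative.

```python
import numpy as np

def class_prototypes(support, labels, n_classes):
    """Mean support embedding per class, as in prototype-based few-shot models."""
    return np.stack([support[labels == c].mean(axis=0) for c in range(n_classes)])

def hybrid_scores(queries, prototypes, alpha=0.5):
    """Score = alpha * cosine similarity - (1 - alpha) * Euclidean distance.

    The cosine term captures direction; the distance term captures magnitude.
    The linear mix and alpha are assumptions made for this sketch.
    """
    diff = queries[:, None, :] - prototypes[None, :, :]
    euclid = np.linalg.norm(diff, axis=2)
    q_unit = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    p_unit = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    cosine = q_unit @ p_unit.T
    return alpha * cosine - (1.0 - alpha) * euclid

def classify(queries, support, labels, n_classes, alpha=0.5):
    """Assign each query to the class whose prototype has the highest score."""
    protos = class_prototypes(support, labels, n_classes)
    return hybrid_scores(queries, protos, alpha).argmax(axis=1)
```

In an N-way K-shot episode, `support` would hold the N*K labeled embeddings and `queries` the embeddings to classify.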
Revealing the latent low-dimensional geometric structure of high-dimensional data is a crucial task in unsupervised representation learning. Traditional manifold learning, as a typical method for discovering latent geometric structures, has provided important nonlinear insight for the theoretical development of unsupervised representation learning. However, because of their shallow learning mechanism, existing methods can only exploit simple geometric structure embedded in the initial data, such as local linear structure. Traditional manifold learning methods are therefore limited in mining higher-order nonlinear geometric information, which is also crucial for the development of unsupervised representation learning. To address these limitations, this paper proposes a novel dynamic geometric structure learning model (DGSL) to explore the true latent nonlinear geometric structure. Specifically, by mathematically analysing the reconstruction loss function of manifold learning, we first derive a universal geometric relational function between the curvature and the non-Euclidean metric of the initial data. Then, we leverage geometric flow to design a deep iterative learning model that optimizes this relational function. Our method can be viewed as a general-purpose algorithm for mining latent geometric structures, which can enhance the performance of geometric representation methods. Experimentally, we perform a set of representation learning tasks on several datasets. The experimental results show that our proposed method is superior to traditional methods.
Unsupervised learning methods such as graph contrastive learning have been used for dynamic graph representation learning to eliminate the dependence on labels. However, existing studies neglect positional information when learning from discrete snapshots, resulting in insufficient learning of network topology. At the same time, due to the lack of appropriate data augmentation methods, it is difficult to capture the evolving patterns of the network effectively. To address these problems, a position-aware and subgraph-enhanced dynamic graph contrastive learning method is proposed for discrete-time dynamic graphs. First, a global snapshot is built from the historical snapshots to express the stable pattern of the dynamic graph, and random walks are used to obtain a position representation by learning the positional information of the nodes. Second, a new data augmentation method is designed from the perspectives of short-term changes and long-term stable structures of dynamic graphs. Specifically, subgraph sampling based on snapshots and global snapshots is used to obtain two structural augmentation views, and node structures and evolving patterns are learned by combining a graph neural network, a gated recurrent unit, and an attention mechanism. Finally, the quality of the node representation is improved by combining contrastive learning between the different structural augmentation views with contrastive learning between the structure and position representations. Experimental results on four real datasets show that the proposed method performs better than existing unsupervised methods and is more competitive than supervised learning methods under a semi-supervised setting.
Contrastive self-supervised representation learning on attributed graph networks with Graph Neural Networks has attracted considerable research interest recently. However, two challenges remain. First, most real-world systems are multi-relational: entities are linked by different types of relations, and each relation is a view of the graph network. Second, the rich multi-scale information (structure-level and feature-level) of the graph network can be seen as self-supervised signals, which are not fully exploited. A novel contrastive self-supervised representation learning framework on attributed multiplex graph networks with multi-scale information (named CoLM²S) is presented in this study. It mainly contains two components: intra-relation contrastive learning and inter-relation contrastive learning. Specifically, a contrastive self-supervised representation learning framework on attributed single-layer graph networks with multi-scale information (CoLMS) is introduced first, which uses a graph convolutional network as the encoder to capture intra-relation information with multi-scale structure-level and feature-level self-supervised signals. The structure-level information includes the edge structure and sub-graph structure, and the feature-level information is the output of the different graph convolutional layers. Second, according to the consensus assumption among inter-relations, the CoLM²S framework is proposed to jointly learn the various graph relations in an attributed multiplex graph network and obtain a global consensus node embedding. The proposed method can fully distil the graph information. Extensive experiments on unsupervised node clustering and graph visualisation tasks demonstrate the effectiveness of our method, which outperforms existing competitive baselines.
Smart contracts have led to more efficient development in finance and healthcare, but vulnerabilities in contracts pose high risks to their future applications. Current vulnerability detection methods for contracts are either based on fixed expert rules, which are inefficient, or rely on simplistic deep learning techniques that do not fully leverage contract semantic information. Therefore, there is ample room for improvement in detection precision. To solve these problems, this paper proposes a vulnerability detector based on deep learning techniques, graph representation, and Transformer, called GRATDet. The method first performs swapping, insertion, and symbolization operations on contract functions, augmenting the small-sample data. Each line of code is then treated as a basic semantic element, and information such as control and data relationships is extracted to construct a new representation in the form of a Line Graph (LG), which exposes more structural features than the serialized presentation of the contract. Finally, the node and edge information of the graph are jointly learned using an improved Transformer-GP model to extract information globally and locally, and the fused features are used for vulnerability detection. The effectiveness of the method in reentrancy vulnerability detection is verified in experiments, where the F1 score reaches 95.16%, exceeding state-of-the-art methods.
Graph representation learning often faces knowledge scarcity in real-world applications, including limited labels and sparse relationships. Although a range of methods have been proposed to address these problems, such as graph few-shot learning, they mainly rely on the inadequate knowledge within the task graph, which limits their effectiveness. Moreover, they fail to consider other potentially useful task-related graphs. To overcome these limitations, domain adaptation for graph representation learning, also known as graph domain adaptation (GDA), has emerged as an effective paradigm for transferring knowledge across graphs. In particular, to enhance model performance on target graphs with specific tasks, GDA introduces a set of task-related graphs as source graphs and adapts the knowledge learnt from the source graphs to the target graphs. Since GDA combines the advantages of graph representation learning and domain adaptation, it has become a promising direction of transfer learning on graphs and has attracted an increasing amount of research interest in recent years. In this paper, we comprehensively overview the studies of GDA and present a detailed survey of recent advances. Specifically, we outline the current research status, analyze key challenges, propose a taxonomy, introduce representative work and practical applications, and discuss future prospects. To the best of our knowledge, this paper is the first survey of graph domain adaptation.
Knowledge graph (KG) representation learning aims to map entities and relations into a low-dimensional representation space, showing significant potential in many tasks. Existing approaches fall into two categories: (1) graph-based approaches encode KG elements into vectors using structural score functions; (2) text-based approaches embed text descriptions of entities and relations via pre-trained language models (PLMs), further fine-tuned with triples. We argue that graph-based approaches struggle with sparse data, while text-based approaches face challenges with complex relations. To address these limitations, we propose a unified Text-Augmented Attention-based Recurrent Network, bridging the gap between graph and natural language. Specifically, we employ a graph attention network based on local influence weights to model local structural information and use PLM-based prompt learning to learn textual information, enhanced by a mask-reconstruction strategy based on global influence weights and by textual contrastive learning for improved robustness and generalizability. In addition, to effectively model multi-hop relations, we propose a novel semantic-depth guided path extraction algorithm and integrate cross-attention layers into recurrent neural networks to facilitate learning long-term relation dependencies and to offer an adaptive attention mechanism for variable-length information. Extensive experiments demonstrate that our model outperforms existing models across KG completion and question-answering tasks.
Advanced Persistent Threats (APTs) penetrate internal networks through multiple methods, making it difficult to detect attack clues solely through boundary defense measures. To address this challenge, some research has proposed threat detection methods based on provenance graphs, which leverage relationships among entities such as processes, files, and sockets found in host audit logs. However, these methods are generally inefficient, especially when faced with massive audit logs and the computational resource-intensive nature of graph algorithms. Effectively and economically extracting APT attack clues from massive system audit logs remains a significant challenge. To tackle this problem, this paper introduces the ProcSAGE method, which detects threats based on abnormal behavior patterns, offering high accuracy, low cost, and independence from expert knowledge. ProcSAGE focuses on processes or threads in host audit logs during the graph construction phase to effectively control the scale of provenance graphs and reduce performance overhead. Additionally, in the feature extraction phase, ProcSAGE considers information about the processes or threads themselves and their neighboring nodes to accurately characterize them and enhance model accuracy. To verify the effectiveness of the ProcSAGE method, this study conducted a comprehensive evaluation on the StreamSpot dataset. The experimental results show that the ProcSAGE method can significantly reduce the time and memory consumption of the threat detection process while improving accuracy, and the optimization effect becomes more significant as the data size expands.
Predicting mortality risk in the Intensive Care Unit (ICU) using Electronic Medical Records (EMR) is crucial for identifying patients in need of immediate attention. However, the incompleteness and the variability of EMR features for each patient make mortality prediction challenging. This study proposes a multimodal representation learning framework based on a novel personalized graph-based fusion approach to address these challenges. The proposed approach involves constructing patient-specific modality aggregation graphs to provide information about the features associated with each patient from incomplete multimodal data, enabling the effective and explainable fusion of the incomplete features. Modality-specific encoders are employed to encode each modality feature separately. To tackle the variability and incompleteness of input features among patients, a novel personalized graph-based fusion method is proposed to fuse patient-specific multimodal feature representations based on the constructed modality aggregation graphs. Furthermore, a MultiModal Gated Contrastive Representation Learning (MMGCRL) method is proposed to facilitate capturing adequate complementary information from multimodal representations and improve model performance. We evaluate the proposed framework using the large-scale ICU dataset, MIMIC-III. Experimental results demonstrate its effectiveness in mortality prediction, outperforming several state-of-the-art methods.
The proliferation of internet traffic encryption has become a double-edged sword. While it significantly enhances user privacy, it also inadvertently shields cyber-attacks from detection, presenting a formidable challenge to cybersecurity. Traditional machine learning and deep learning techniques often fall short in identifying encrypted malicious traffic due to their inability to fully extract and utilize the implicit relational and positional information embedded within data packets. This limitation has led to an unresolved challenge in the cybersecurity community: how to effectively extract valuable insights from the complex patterns of traffic packet transmission. To address this, this paper introduces the TB-Graph model, an encrypted malicious traffic classification model based on a relational graph attention network. The model builds a heterogeneous traffic burst graph that embeds side-channel features, which are unaffected by encryption, into the graph nodes and connects them with three different types of burst edges. Subsequently, we design a relational positional coding that prevents the loss of temporal relationships between the original traffic flows during graph transformation. Ultimately, TB-Graph leverages the graph representation learning capabilities of the Relational Graph Attention Network (RGAT) to extract latent behavioral features from the burst graph nodes and edge relationships. Experimental results show that TB-Graph outperforms various state-of-the-art methods in fine-grained encrypted malicious traffic classification tasks on two public datasets, indicating its enhanced capability for identifying encrypted malicious traffic.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61822204 and 61521002).
Abstract: Learning discriminative representations with deep neural networks often relies on massive labeled data, which is expensive and difficult to obtain in many real scenarios. As an alternative, self-supervised learning, which uses the input itself as supervision, is strongly preferred for its rapidly improving performance on visual representation learning. This paper introduces a contrastive self-supervised framework for learning generalizable representations from synthetic data, which can be obtained easily and with complete controllability. Specifically, we propose to optimize a contrastive learning task and a physical property prediction task simultaneously. Given a synthetic scene, the first task aims to maximize agreement between a pair of synthetic images generated by our proposed view sampling module, while the second task aims to predict three physical property maps: depth, instance contour, and surface normal maps. In addition, a feature-level domain adaptation technique with adversarial training is applied to reduce the domain gap between real and synthetic data. Experiments demonstrate that our proposed method achieves state-of-the-art performance on several visual recognition datasets.
Funding: New-Generation Artificial Intelligence-Major Program in the Sci-Tech Innovation 2030 Agenda from the Ministry of Science and Technology of China (2018AAA0102100); Hunan Provincial Department of Education key project (21A0250); the First Class Discipline Open Fund of Hunan University of Traditional Chinese Medicine (2022ZYX08).
Abstract: Objective: To construct a symptom-formula-herb heterogeneous graph dataset from the Treatise on Febrile Diseases (Shang Han Lun, 《伤寒论》) and to explore an optimal method for learning node attribute representations based on the graph convolutional network (GCN). Methods: Clauses that contain symptoms, formulas, and herbs were abstracted from the Treatise on Febrile Diseases to construct symptom-formula-herb heterogeneous graphs, which were used to propose a GCN-based node representation learning method, the Traditional Chinese Medicine Graph Convolution Network (TCM-GCN). The symptom-formula, symptom-herb, and formula-herb heterogeneous graphs were processed with the TCM-GCN to perform high-order message passing and neighbor aggregation and obtain new node representation attributes, thereby acquiring the sum-aggregations of the symptom, formula, and herb nodes and laying a foundation for the downstream tasks of the prediction models. Results: Comparisons among node representations with multi-hot encoding, non-fusion encoding, and fusion encoding showed that the Precision@10, Recall@10, and F1-score@10 of the fusion encoding were 9.77%, 6.65%, and 8.30% higher, respectively, than those of the non-fusion encoding in the prediction studies of the model. Conclusion: Node representations by fusion encoding achieved comparatively ideal results, indicating that the TCM-GCN is effective in realizing node-level representations of the heterogeneous graph structured Treatise on Febrile Diseases dataset and can improve the performance of the downstream tasks of the diagnosis model.
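The Precision@10, Recall@10, and F1-score@10 figures reported above follow the usual top-k ranking definitions. The sketch below shows one way to compute them; the function name and the toy item labels are hypothetical, and dividing by k even when fewer than k items are ranked is one common convention rather than something the abstract specifies.

```python
def precision_recall_f1_at_k(ranked, relevant, k=10):
    """Top-k ranking metrics for a single query.

    ranked   : list of predicted items, best first
    relevant : set of ground-truth items
    """
    top_k = ranked[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k  # convention: divide by k, not by len(top_k)
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall > 0:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1

# Toy example with k=4: two of the four predictions are correct,
# and two of the three relevant items are recovered.
p, r, f = precision_recall_f1_at_k(["a", "b", "c", "d"], {"a", "c", "e"}, k=4)
```

In a herb-recommendation setting, `ranked` would be the herbs ordered by predicted score for a symptom set and `relevant` the herbs in the recorded formula; the per-query values are then averaged over the test set.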
Funding: This research is partially supported by the National Natural Science Foundation of China (Grant No. 61772098); Chongqing Technology Innovation and Application Development Project (Grant No. cstc2020jscxmsxmX0150); Chongqing Science and Technology Innovation Leading Talent Support Program (CSTCCXLJRC201908); Basic and Advanced Research Projects of CSTC (No. cstc2019jcyj-zdxmX0008); Science and Technology Research Program of Chongqing Municipal Education Commission (Grant No. KJZD-K201900605).
Abstract: Traditional malware research has mainly focused on recognition and detection, without addressing propagation trends or predicting which nodes will be infected next. The complexity of network structure, the diversity of network nodes, and the sparsity of data all make propagation difficult to predict. This paper proposes a malware propagation prediction model based on representation learning and Graph Convolutional Networks (GCN) to address these problems. First, to address the inaccurate calculation of infection intensity caused by the sparsity of node interaction behavior data in the malware propagation network, a tensor-based mechanism for mining the infection intensity among nodes is proposed, which retains the network structure information. The influence of the relationships between nodes on the infection intensity is also analyzed. Second, given the diversity and complexity of the content and structure of infected and normal nodes in the network, and considering the advantages of representation learning in data feature extraction, a corresponding representation learning method is adopted for the characteristics of infection intensity among nodes. This can efficiently calculate the relationships between entities and relations in a low-dimensional space, yielding low-dimensional, dense, real-valued representations of the propagation spatial data. We also design a new method, Tensor2vec, to learn the potential structural features of malware propagation. Finally, considering the convolution ability of GCN for non-Euclidean data, we propose a dynamic prediction model of malware propagation based on representation learning and GCN to solve the time effectiveness problem of the malware propagation carrier. The experimental results show that the proposed model can effectively predict the behaviors of the nodes in the network and discover the influence of different node characteristics on the malware propagation situation.
Funding: Supported in part by the National Natural Science Foundation of China under Grants 62472434 and 62402171; in part by the National Key Research and Development Program of China under Grant 2022YFF1203001; in part by the Science and Technology Innovation Program of Hunan Province under Grant 2022RC3061; and in part by the Sci-Tech Innovation 2030 Agenda under Grant 2023ZD0508600.
Abstract: Computed Tomography (CT) reconstruction is essential in medical imaging and other engineering fields. However, blurring of the projection during CT imaging can lead to artifacts in the reconstructed images. Projection blur combines factors such as larger ray sources, scattering, and imaging system vibration. To address the problem, we propose DeblurTomo, a novel self-supervised learning-based deblurring and reconstruction algorithm that efficiently reconstructs sharp CT images from blurry input without needing external data or blur measurement. Specifically, we construct a coordinate-based implicit neural representation reconstruction network, which maps coordinates to the attenuation coefficient in the reconstructed space for more convenient ray representation. Then, we model the blur as a weighted sum of offset rays and design the Ray Correction Network (RCN) and the Weight Proposal Network (WPN) to fit these rays and their weights using multi-view consistency and geometric information, thereby extending 2D deblurring to 3D space. In the training phase, we use the blurry input as the supervision signal to optimize the reconstruction network, the RCN, and the WPN simultaneously. Extensive experiments on a widely used synthetic dataset show that DeblurTomo performs better in limited-angle and sparse-view settings under simulated blur. Further experiments on real datasets demonstrate the superiority of our method in practical scenarios.
Funding: Supported by the Hong Kong-Macao-Taiwan Science and Technology Cooperation Project of the Science and Technology Innovation Action Plan in Shanghai (Grant No. 23510760200); the Oriental Talent Youth Program of Shanghai (Grant No. Y3DFRCZL01); the Outstanding Program of the Youth Innovation Promotion Association of the Chinese Academy of Sciences (Grant No. Y2023080); the Strategic Priority Research Program of the Chinese Academy of Sciences (Category A) (Grant No. XDA0360404); the MYHT Program of China (Grant No. D030312); the Shanghai Pujiang Program (Grant No. 23pjd092); and the Shanghai Pilot Program for Basic Research, Chinese Academy of Science, Shanghai Branch (Grant No. JCYJ-SHFY-2022-015).
Abstract: Satellite communication provides crucial connectivity for low-altitude aircraft, addressing coverage gaps in ground-based internet and aerial communication blind spots. Achieving this capability requires robust anomaly detection methods. However, existing approaches struggle to capture the complex spatio-temporal relationships in space-air communications due to challenges such as unpredictable network topology from real-time route adjustments, the absence of predefined communication link patterns, and the rapid evolution of spatio-temporal relationships among moving nodes. To address these challenges, this paper proposes a graph representation anomaly detection framework tailored for communication networks between low Earth orbit (LEO) satellite constellations and low-altitude aircraft. The spatio-temporal relationships between satellites and aircraft are modeled using dynamic graph structures, which capture the 3D location of each node and the anomaly characteristics of each link. The proposed framework is compatible with diverse anomaly detection learning algorithms, and several static and dynamic detection algorithms are evaluated and compared in this work. Furthermore, we present an improved variant of a transformer-based anomaly detection framework for dynamic graphs (TADDY), which involves temporal periodic encoding, semi-supervised comparative learning, and multiscale graph attention mechanisms. We evaluate the framework through simulated scenarios, comparing Walker and broken-chain constellations with varying air traffic densities in a specific region. Anomaly detection performance is evaluated at different anomaly ratios and air traffic densities. Experimental results demonstrate that the performance of static anomaly detection methods and of the TADDY algorithm degrades significantly as air traffic density increases. Meanwhile, the proposed improved TADDY achieves an average AUC of 0.95 for the Walker constellation and 0.86 for the broken-chain constellation, outperforming the original TADDY in both accuracy and reliability under high anomaly rates. Finally, sensitivity analysis and ablation studies confirm the framework's high responsiveness to anomalies such as abrupt topological changes, offering an efficient solution for ensuring the reliability of large-scale satellite-aircraft communication systems.
Funding: Supported by the Natural Science Foundation of China (62372277); the Natural Science Foundation of Shandong Province (ZR2022MF257, ZR2022MF295); and the Humanities and Social Sciences Fund of the Ministry of Education (21YJC630157).
Abstract: With the popularity of online learning in educational settings, knowledge tracing (KT) plays an increasingly significant role. The task of KT is to help students learn more effectively by predicting their next mastery of knowledge based on their historical exercise sequences. Many related works have emerged in this field, such as Bayesian knowledge tracing and deep knowledge tracing methods. Despite the progress that has been made in KT, existing techniques still have the following limitations: 1) Previous studies address KT by exploring only the sparse observational data distribution, and the counterfactual data distribution has been largely ignored. 2) Current works designed for KT consider either the entity relationships between questions and concepts or the relations between two concepts, and none of them investigates the relations among students, questions, and concepts simultaneously, leading to inaccurate student modeling. To address these limitations, we propose a graph counterfactual augmentation method for knowledge tracing. Concretely, to consider the multiple relationships among different entities, we first unify students, questions, and concepts in graphs, and then leverage a heterogeneous graph convolutional network to conduct representation learning. To model the counterfactual world, we conduct counterfactual transformations on students' learning graphs by changing the corresponding treatments and then exploit the counterfactual outcomes in a contrastive learning framework. We conduct extensive experiments on three real-world datasets, and the experimental results demonstrate the superiority of our proposed Graph CA method compared with several state-of-the-art baselines.
Funding: Supported by the National Science Foundation of China, Grant No. 61762092, "Dynamic multi-objective requirement optimization based on transfer learning", and Grant No. 61762089, "The key research of high order tensor decomposition in distributed environment"; and the Open Foundation of the Key Laboratory in Software Engineering of Yunnan Province, Grant No. 2017SE204, "Research on extracting software feature models using transfer learning".
Abstract: In view of the low interpretability of existing collaborative filtering recommendation algorithms and the difficulty of extracting information in content-based recommendation algorithms, we propose an efficient KGRS model. KGRS first obtains reasoning paths from the knowledge graph and embeds the entities on those paths into vectors using the TransD knowledge representation learning algorithm. It then uses an LSTM and a soft attention mechanism to capture the semantics of each reasoning path, and applies convolution and pooling operations to distinguish the importance of different reasoning paths. Finally, a fully connected layer and a sigmoid function produce the predicted ratings, and the items are sorted by predicted rating to obtain the user's recommendation list. KGRS is tested on the MovieLens-100K dataset. Compared with related representative algorithms, including the state-of-the-art interpretable recommendation models RKGE and RippleNet, the experimental results show that KGRS offers good recommendation interpretability and higher recommendation accuracy.
Funding: Funded by the National Science and Technology Council (NSTC), Taiwan, under grant numbers NSTC 112-2634-F-019-001 and NSTC 113-2634-F-A49-007.
Abstract: Few-shot learning has emerged as a crucial technique for coral species classification, addressing the challenge of limited labeled data in underwater environments. This study introduces an optimized few-shot learning model that enhances classification accuracy while minimizing reliance on extensive data collection. The proposed model integrates a hybrid similarity measure combining Euclidean distance and cosine similarity, effectively capturing both feature magnitude and directional relationships. This approach achieves an accuracy of 71.8% under a 5-way 5-shot evaluation, outperforming state-of-the-art models such as Prototypical Networks, FEAT, and ESPT by up to 10%. Notably, the model demonstrates high precision in classifying Siderastreidae (87.52%) and Fungiidae (88.95%), underscoring its effectiveness in distinguishing subtle morphological differences. To further enhance performance, we incorporate a self-supervised learning mechanism based on contrastive learning, enabling the model to extract robust representations by leveraging local structural patterns in corals. This enhancement significantly improves classification accuracy, particularly for species with high intra-class variation, leading to an overall accuracy of 76.52% under a 5-way 10-shot evaluation. Additionally, the model exploits the repetitive structures inherent in corals, introducing a local feature aggregation strategy that refines classification through spatial information integration. Beyond its technical contributions, this study presents a scalable and efficient approach for automated coral reef monitoring, reducing annotation costs while maintaining high classification accuracy. By improving few-shot learning performance in underwater environments, our model enhances monitoring accuracy by up to 15% compared to traditional methods, offering a practical solution for large-scale coral conservation efforts.
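The hybrid similarity measure described above can be sketched on top of a prototype-based classifier. The abstract does not give the combination rule, so the weighted form `alpha * cosine - (1 - alpha) * distance`, the `alpha` parameter, and all function names below are assumptions for illustration only.

```python
import math

def mean_vector(vectors):
    """Class prototype: element-wise mean of the support embeddings."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def hybrid_score(query, proto, alpha=0.5):
    """Higher is more similar (assumed combination rule).

    Cosine similarity rewards matching direction; the Euclidean
    distance term penalizes differences in magnitude.
    """
    return alpha * cosine(query, proto) - (1 - alpha) * euclidean(query, proto)

def classify(query, support, alpha=0.5):
    """Assign the query to the class whose prototype scores highest."""
    prototypes = {label: mean_vector(vs) for label, vs in support.items()}
    return max(prototypes, key=lambda label: hybrid_score(query, prototypes[label], alpha))

# Toy 2-way 2-shot episode with hypothetical 2-D embeddings.
support = {
    "A": [[1.0, 0.0], [0.9, 0.1]],
    "B": [[0.0, 1.0], [0.1, 0.9]],
}
label_a = classify([0.8, 0.2], support)
label_b = classify([0.2, 0.8], support)
```

Setting `alpha` to 0 recovers a purely distance-based rule in the style of Prototypical Networks, and `alpha` of 1 gives a purely cosine-based rule, so the parameter controls how much each signal contributes.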
Funding: Supported in part by the Young Elite Scientists Sponsorship Program by China Association for Science and Technology (2022QNRC001); the National Natural Science Foundation of China (62406315); China Postdoctoral Science Foundation (2025M771504); Guangdong Basic and Applied Basic Research Foundation (2024A1515110108); and Shaanxi Provincial Key Research and Development Program (2025SF-YBXM-023).
Abstract: Revealing the latent low-dimensional geometric structure of high-dimensional data is a crucial task in unsupervised representation learning. Traditional manifold learning, as a typical method for discovering latent geometric structures, has provided important nonlinear insight for the theoretical development of unsupervised representation learning. However, because existing methods rely on a shallow learning mechanism, they can only exploit the simple geometric structure embedded in the initial data, such as the local linear structure. Traditional manifold learning methods are therefore fairly limited in mining higher-order nonlinear geometric information, which is also crucial for the development of unsupervised representation learning. To address these limitations, this paper proposes a novel dynamic geometric structure learning model (DGSL) to explore the true latent nonlinear geometric structure. Specifically, by mathematically analysing the reconstruction loss function of manifold learning, we first provide a universal geometric relational function between the curvature and the non-Euclidean metric of the initial data. Then, we leverage geometric flow to design a deeply iterative learning model to optimize this relational function. Our method can be viewed as a general-purpose algorithm for mining latent geometric structures, which can enhance the performance of geometric representation methods. Experimentally, we perform a set of representation learning tasks on several datasets. The experimental results show that our proposed method is superior to traditional methods.
Abstract: Unsupervised learning methods such as graph contrastive learning have been used for dynamic graph representation learning to eliminate the dependence on labels. However, existing studies neglect positional information when learning discrete snapshots, resulting in insufficient network topology learning. At the same time, due to the lack of appropriate data augmentation methods, it is difficult to capture the evolving patterns of the network effectively. To address these problems, a position-aware and subgraph-enhanced dynamic graph contrastive learning method is proposed for discrete-time dynamic graphs. First, a global snapshot is built from the historical snapshots to express the stable pattern of the dynamic graph, and random walks are used to obtain the position representation by learning the positional information of the nodes. Second, a new data augmentation method is designed from the perspectives of short-term changes and long-term stable structures of dynamic graphs. Specifically, subgraph sampling based on snapshots and global snapshots is used to obtain two structural augmentation views, and node structures and evolving patterns are learned by combining a graph neural network, a gated recurrent unit, and an attention mechanism. Finally, the quality of node representations is improved by combining contrastive learning between the different structural augmentation views with contrastive learning between the structure and position representations. Experimental results on four real datasets show that the proposed method performs better than existing unsupervised methods and is competitive with supervised learning methods under a semi-supervised setting.
Funding: Supported by the National Natural Science Foundation of China (NSFC) under grant number 61873274.
Abstract: Contrastive self-supervised representation learning on attributed graph networks with Graph Neural Networks has attracted considerable research interest recently. However, there are still two challenges. First, most real-world systems involve multiple relations, where entities are linked by different types of relations, and each relation is a view of the graph network. Second, the rich multi-scale information (structure-level and feature-level) of the graph network can be seen as self-supervised signals, which are not fully exploited. A novel contrastive self-supervised representation learning framework on attributed multiplex graph networks with multi-scale information (named CoLM^(2)S) is presented in this study. It mainly contains two components: intra-relation contrastive learning and inter-relation contrastive learning. Specifically, a contrastive self-supervised representation learning framework on attributed single-layer graph networks with multi-scale information (CoLMS) is introduced first; it uses a graph convolutional network as the encoder to capture the intra-relation information with multi-scale structure-level and feature-level self-supervised signals. The structure-level information includes the edge structure and sub-graph structure, and the feature-level information represents the output of the different graph convolutional layers. Second, according to the consensus assumption among inter-relations, the CoLM^(2)S framework is proposed to jointly learn the various graph relations in an attributed multiplex graph network and achieve a global consensus node embedding. The proposed method can fully distil the graph information. Extensive experiments on unsupervised node clustering and graph visualisation tasks demonstrate the effectiveness of our method, which outperforms existing competitive baselines.
Funding: Supported by the Science and Technology Program Project (No. 2020A02001-1) of Xinjiang Autonomous Region, China.
Abstract: Smart contracts have led to more efficient development in finance and healthcare, but vulnerabilities in contracts pose high risks to their future applications. Current vulnerability detection methods for contracts are either based on fixed expert rules, which are inefficient, or rely on simplistic deep learning techniques that do not fully leverage contract semantic information. Therefore, there is ample room for improvement in detection precision. To solve these problems, this paper proposes a vulnerability detector based on deep learning techniques, graph representation, and Transformer, called GRATDet. The method first performs swapping, insertion, and symbolization operations on contract functions, increasing the amount of small-sample data. Each line of code is then treated as a basic semantic element, and information such as control and data relationships is extracted to construct a new representation in the form of a Line Graph (LG), which exposes more structural features than the serialized presentation of the contract. Finally, the node information and edge information of the graph are jointly learned using an improved Transformer-GP model to extract information globally and locally, and the fused features are used for vulnerability detection. The effectiveness of the method in reentrancy vulnerability detection is verified in experiments, where the F1 score reaches 95.16%, exceeding state-of-the-art methods.
Funding: Supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (CAS) under Grant No. XDB0680302; the National Key Research and Development Program of China under Grant No. 2023YFC3305303; the National Natural Science Foundation of China under Grant Nos. 62372434 and 62302485; the China Postdoctoral Science Foundation under Grant No. 2022M713206; the CAS Special Research Assistant Program; and the Key Research Project of Chinese Academy of Sciences under Grant No. RCJJ-145-24-21.
Abstract: Graph representation learning often faces knowledge scarcity in real-world applications, including limited labels and sparse relationships. Although a range of methods have been proposed to address these problems, such as graph few-shot learning, they mainly rely on the inadequate knowledge within the task graph, which limits their effectiveness. Moreover, they fail to consider other potentially useful task-related graphs. To overcome these limitations, domain adaptation for graph representation learning, also known as graph domain adaptation (GDA), has emerged as an effective paradigm for transferring knowledge across graphs. In particular, to enhance model performance on target graphs with specific tasks, GDA introduces a set of task-related graphs as source graphs and adapts the knowledge learnt from the source graphs to the target graphs. Since GDA combines the advantages of graph representation learning and domain adaptation, it has become a promising direction of transfer learning on graphs and has attracted an increasing amount of research interest in recent years. In this paper, we comprehensively overview the studies of GDA and present a detailed survey of recent advances. Specifically, we outline the current research status, analyze key challenges, propose a taxonomy, introduce representative work and practical applications, and discuss future prospects. To the best of our knowledge, this paper is the first survey of graph domain adaptation.
Funding: Supported in part by the National Key R&D Program of China (2020AAA0108501).
Abstract: Knowledge graph (KG) representation learning aims to map entities and relations into a low-dimensional representation space, showing significant potential in many tasks. Existing approaches fall into two categories: (1) Graph-based approaches encode KG elements into vectors using structural score functions. (2) Text-based approaches embed text descriptions of entities and relations via pre-trained language models (PLMs), further fine-tuned with triples. We argue that graph-based approaches struggle with sparse data, while text-based approaches face challenges with complex relations. To address these limitations, we propose a unified Text-Augmented Attention-based Recurrent Network, bridging the gap between graph and natural language. Specifically, we employ a graph attention network based on local influence weights to model local structural information and use PLM-based prompt learning to learn textual information, enhanced by a mask-reconstruction strategy based on global influence weights and by textual contrastive learning for improved robustness and generalizability. In addition, to effectively model multi-hop relations, we propose a novel semantic-depth guided path extraction algorithm and integrate cross-attention layers into recurrent neural networks to facilitate learning long-term relation dependencies and to offer an adaptive attention mechanism for varied-length information. Extensive experiments demonstrate that our model outperforms existing models across KG completion and question-answering tasks.
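To make "structural score functions" concrete, the sketch below uses the TransE translation score as one well-known example of the graph-based category. TransE is chosen here only for illustration; the abstract does not say which score functions the authors consider, and the embeddings and names are hypothetical.

```python
import math

def transe_score(head, relation, tail):
    """TransE plausibility score: negative L2 distance of head + relation from tail.

    A triple (head, relation, tail) is more plausible when the score
    is closer to zero, i.e. when head + relation lands near tail.
    """
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))

def rank_tails(head, relation, candidates):
    """Order candidate tail entities from most to least plausible."""
    return sorted(candidates, key=lambda name: transe_score(head, relation, candidates[name]), reverse=True)

# Hypothetical 2-D embeddings for a toy link-prediction query.
head = [0.1, 0.2]
relation = [0.3, 0.1]
candidates = {
    "good_tail": [0.4, 0.3],   # equals head + relation
    "bad_tail": [-0.5, 0.9],
}
ranking = rank_tails(head, relation, candidates)
```

Scores of this kind depend entirely on learned embeddings, which is consistent with the abstract's point that purely graph-based approaches struggle when an entity appears in few triples.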
Funding: Supported by the National Key Research and Development Program of China (No. 2023YFC2206402); Youth Innovation Promotion Association CAS (No. 2021156); the Strategic Priority Research Program of the Chinese Academy of Sciences (No. XDC02040100); Foundation Strengthening Program Technical Area Fund, 021-JCJQ-JJ-0908; and State Grid Corporation of China Science and Technology Program (Contract No.: SG270000YXJS2311060).
Abstract: Advanced Persistent Threats (APTs) penetrate internal networks through multiple methods, making it difficult to detect attack clues solely through boundary defense measures. To address this challenge, some research has proposed threat detection methods based on provenance graphs, which leverage relationships among entities such as processes, files, and sockets found in host audit logs. However, these methods are generally inefficient, especially when faced with massive audit logs and the computational resource-intensive nature of graph algorithms. Effectively and economically extracting APT attack clues from massive system audit logs remains a significant challenge. To tackle this problem, this paper introduces the ProcSAGE method, which detects threats based on abnormal behavior patterns, offering high accuracy, low cost, and independence from expert knowledge. ProcSAGE focuses on processes or threads in host audit logs during the graph construction phase to effectively control the scale of provenance graphs and reduce performance overhead. Additionally, in the feature extraction phase, ProcSAGE considers information about the processes or threads themselves and their neighboring nodes to accurately characterize them and enhance model accuracy. To verify the effectiveness of the ProcSAGE method, this study conducted a comprehensive evaluation on the StreamSpot dataset. The experimental results show that the ProcSAGE method can significantly reduce the time and memory consumption of the threat detection process while improving accuracy, and the optimization effect becomes more significant as the data size grows.
Funding: Supported by the National Natural Science Foundation of China (No. U24A20256) and the Science and Technology Major Project of Changsha (No. kh2402004).
Abstract: Predicting mortality risk in the Intensive Care Unit (ICU) using Electronic Medical Records (EMR) is crucial for identifying patients in need of immediate attention. However, the incompleteness and variability of EMR features for each patient make mortality prediction challenging. This study proposes a multimodal representation learning framework based on a novel personalized graph-based fusion approach to address these challenges. The proposed approach constructs patient-specific modality aggregation graphs that provide information about the features associated with each patient from incomplete multimodal data, enabling effective and explainable fusion of the incomplete features. Modality-specific encoders are employed to encode each modality feature separately. To tackle the variability and incompleteness of input features among patients, a novel personalized graph-based fusion method is proposed to fuse patient-specific multimodal feature representations based on the constructed modality aggregation graphs. Furthermore, a MultiModal Gated Contrastive Representation Learning (MMGCRL) method is proposed to help capture adequate complementary information from multimodal representations and improve model performance. We evaluate the proposed framework on the large-scale ICU dataset MIMIC-III. Experimental results demonstrate its effectiveness in mortality prediction, outperforming several state-of-the-art methods.
Abstract: The proliferation of internet traffic encryption has become a double-edged sword. While it significantly enhances user privacy, it also inadvertently shields cyber-attacks from detection, presenting a formidable challenge to cybersecurity. Traditional machine learning and deep learning techniques often fall short in identifying encrypted malicious traffic because they cannot fully extract and use the implicit relational and positional information embedded within data packets. This limitation has left an unresolved challenge in the cybersecurity community: how to effectively extract valuable insights from the complex patterns of traffic packet transmission. This paper therefore introduces the TB-Graph model, an encrypted malicious traffic classification model based on a relational graph attention network. The model constructs a heterogeneous traffic burst graph that embeds side-channel features, which are unaffected by encryption, into the graph nodes and connects them with three different types of burst edges. We then design a relational positional coding that prevents the loss of temporal relationships between the original traffic flows during graph transformation. Finally, TB-Graph leverages the graph representation learning capabilities of the Relational Graph Attention Network (RGAT) to extract latent behavioral features from the burst graph nodes and edge relationships. Experimental results show that TB-Graph outperforms various state-of-the-art methods in fine-grained encrypted malicious traffic classification tasks on two public datasets, indicating its enhanced capability for identifying encrypted malicious traffic.
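As a rough illustration of how burst-level, encryption-independent features might be derived before building a burst graph, the sketch below splits a flow into bursts. The abstract does not define a burst, so this assumes the common convention that a burst is a maximal run of consecutive packets travelling in the same direction, with packets encoded as signed sizes (positive for outgoing, negative for incoming). The function name and the feature tuple are hypothetical and are not the authors' node features.

```python
from itertools import groupby

def split_into_bursts(signed_sizes):
    """Group a flow's packets into bursts of same-direction packets.

    signed_sizes : non-zero ints, one per packet, in arrival order;
                   positive = outgoing, negative = incoming (assumed encoding).
    Returns a list of (direction, packet_count, total_bytes) tuples,
    which only use side-channel information (direction and size).
    """
    bursts = []
    for outgoing, group in groupby(signed_sizes, key=lambda size: size > 0):
        sizes = [abs(size) for size in group]
        bursts.append((1 if outgoing else -1, len(sizes), sum(sizes)))
    return bursts

# Toy flow: two outgoing packets, three incoming, one outgoing.
bursts = split_into_bursts([100, 200, -50, -60, -70, 300])
```

Each tuple could then serve as a candidate node, with edges linking consecutive bursts within a flow; how TB-Graph actually defines its three burst edge types is not stated in the abstract.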