Wearable wristband systems leverage deep learning to revolutionize hand gesture recognition in daily activities. Unlike existing approaches that often focus on static gestures and require extensive labeled data, the proposed wearable wristband with self-supervised contrastive learning excels at dynamic motion tracking and adapts rapidly across multiple scenarios. It features a four-channel sensing array composed of an ionic hydrogel with hierarchical microcone structures and ultrathin flexible electrodes, resulting in high-sensitivity capacitance output. The signals are transmitted wirelessly through a Wi-Fi module, and the proposed algorithm learns latent features from the unlabeled signals of random wrist movements. Remarkably, only a few labeled samples are sufficient for fine-tuning the model, enabling rapid adaptation to various tasks. The system achieves a high accuracy of 94.9% in different scenarios, including the prediction of eight-direction commands and air-writing of all numbers and letters. The proposed method facilitates smooth transitions between multiple tasks without modifying the model structure or undergoing extensive task-specific training. Its utility has been further extended to enhance human–machine interaction over digital platforms, such as game controls, calculators, and three-language login systems, offering users a natural and intuitive way of communication.
The rapid integration of Internet of Things (IoT) technologies is reshaping the global energy landscape by deploying smart meters that enable high-resolution consumption monitoring, two-way communication, and advanced metering infrastructure services. However, this digital transformation also exposes power systems to evolving threats, ranging from cyber intrusions and electricity theft to device malfunctions. The unpredictable nature of these anomalies, coupled with the scarcity of labeled fault data, makes real-time detection exceptionally challenging. To address these difficulties, a real-time decision support framework is presented for smart meter anomaly detection that leverages rolling time windows and two self-supervised contrastive learning modules. The first module synthesizes diverse negative samples to overcome the lack of labeled anomalies, while the second captures intrinsic temporal patterns for enhanced contextual discrimination. The end-to-end framework continuously updates its model with rolling meter data to deliver timely identification of emerging abnormal behaviors in evolving grids. Extensive evaluations on eight publicly available smart meter datasets covering seven diverse abnormal patterns demonstrate the effectiveness of the full framework, which achieves an average recall and F1 score of more than 0.85.
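The rolling-window idea in the smart meter abstract can be illustrated with a minimal sketch. The function names, the window and step sizes, and the spike-based synthetic negative are all assumptions made for illustration; the abstract does not say how the framework actually builds windows or negative samples.

```python
def rolling_windows(readings, window, step=1):
    """Yield successive fixed-length windows over a sequence of meter readings."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    for start in range(0, len(readings) - window + 1, step):
        yield readings[start:start + window]


def spike_negative(window_values, position, factor=3.0):
    """Return a copy of a window with one reading scaled up.

    Hypothetical stand-in for a synthesized negative (abnormal) sample.
    """
    out = list(window_values)
    out[position] *= factor
    return out


readings = [1.0, 2.0, 3.0, 4.0, 5.0]
windows = list(rolling_windows(readings, window=3, step=2))
negative = spike_negative(windows[0], position=1)
```

Each window would serve as a normal sample, and a perturbed copy such as `negative` would play the role of an abnormal one during contrastive training.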
Neural machine translation (NMT) has advanced with deep learning and large-scale multilingual models, yet translation for low-resource languages often lacks sufficient training data, which leads to hallucinations: translated content that diverges significantly from the source text. This research proposes a refined Contrastive Decoding (CD) algorithm that dynamically adjusts the weights of log probabilities from a strong expert model and a weak amateur model to mitigate hallucinations in low-resource NMT and improve translation quality. Advanced large language models, including ChatGLM and LLaMA, are fine-tuned and applied to NMT for their superior contextual understanding and cross-lingual capabilities. The refined CD algorithm evaluates multiple candidate translations using BLEU score, semantic similarity, and Named Entity Recognition accuracy. Extensive experimental results show substantial improvements in translation quality and a significant reduction in hallucination rates. Fine-tuned models achieve higher evaluation metrics than both baseline and state-of-the-art models. An ablation study confirms the contribution of each methodological component and highlights the effectiveness of the refined CD algorithm and advanced models in mitigating hallucinations. Notably, the refined methodology increased the BLEU score by approximately 30% compared to baseline models.
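As a rough illustration of contrastive decoding, the sketch below scores each candidate token by the expert log probability minus a weighted amateur log probability, after discarding tokens the expert itself finds implausible. The fixed weight `alpha`, the `plausibility` cutoff, and the function names are assumptions; the abstract says the weights are adjusted dynamically but does not give the rule.

```python
import math


def contrastive_scores(expert_logp, amateur_logp, alpha=0.5, plausibility=0.1):
    """Score tokens as expert_logp - alpha * amateur_logp.

    Only tokens whose expert probability is at least `plausibility` times the
    expert's top probability are kept (plausibility constraint).
    """
    cutoff = max(expert_logp.values()) + math.log(plausibility)
    return {
        tok: lp - alpha * amateur_logp[tok]
        for tok, lp in expert_logp.items()
        if lp >= cutoff
    }


def pick_token(expert_logp, amateur_logp, alpha=0.5, plausibility=0.1):
    """Return the token with the highest contrastive score."""
    scores = contrastive_scores(expert_logp, amateur_logp, alpha, plausibility)
    return max(scores, key=scores.get)


expert = {"a": math.log(0.5), "b": math.log(0.4), "c": math.log(0.1)}
amateur = {"a": math.log(0.6), "b": math.log(0.1), "c": math.log(0.3)}
choice = pick_token(expert, amateur, alpha=0.5, plausibility=0.3)
```

In this toy case the expert slightly prefers "a", but the amateur prefers it even more, so the contrastive score favors "b"; "c" is filtered out by the plausibility cutoff.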
Sarcasm detection is a complex and challenging task, particularly in the context of Chinese social media, where it exhibits strong contextual dependencies and cultural specificity. To address the limitations of existing methods in capturing the implicit semantics and contextual associations in sarcastic expressions, this paper proposes an event-aware model for Chinese sarcasm detection, leveraging a multi-head attention (MHA) mechanism and contrastive learning (CL) strategies. The proposed model employs a dual-path Bidirectional Encoder Representations from Transformers (BERT) encoder to process comment text and event context separately and integrates an MHA mechanism to facilitate deep interactions between the two, thereby capturing multidimensional semantic associations. Additionally, a CL strategy is introduced to enhance feature representation capabilities, further improving the model's performance in handling class imbalance and complex contextual scenarios. The model achieves state-of-the-art performance on the Chinese sarcasm dataset, with significant improvements in accuracy (79.55%), F1-score (84.22%), and area under the curve (AUC, 84.35%).
Emotion recognition plays a crucial role in various fields and is a key task in natural language processing (NLP). The objective is to identify and interpret emotional expressions in text. However, traditional emotion recognition approaches often struggle in few-shot cross-domain scenarios due to their limited capacity to generalize semantic features across different domains. Additionally, these methods face challenges in accurately capturing complex emotional states, particularly those that are subtle or implicit. To overcome these limitations, we introduce a novel approach called Dual-Task Contrastive Meta-Learning (DTCML). This method combines meta-learning and contrastive learning to improve emotion recognition. Meta-learning enhances the model's ability to generalize to new emotional tasks, while instance contrastive learning further refines the model by distinguishing unique features within each category, enabling it to better differentiate complex emotional expressions. Prototype contrastive learning, in turn, helps the model address the semantic complexity of emotions across different domains, enabling it to learn fine-grained emotional expressions. By leveraging dual tasks, DTCML learns from two domains simultaneously, which encourages the model to learn more diverse and generalizable emotion features and thereby improves its cross-domain adaptability and robustness. We evaluated the performance of DTCML across four cross-domain settings, and the results show that our method outperforms the best baseline by 5.88%, 12.04%, 8.49%, and 8.40% in terms of accuracy.
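The notion of a class prototype used in prototype-based methods can be sketched as follows. This shows only the common building block (a prototype as the mean embedding of a class, and assignment to the nearest prototype), not the DTCML loss itself; the function names and the toy emotion labels are assumptions.

```python
def class_prototypes(embeddings, labels):
    """Return the mean embedding (prototype) for each class label."""
    sums, counts = {}, {}
    for vec, label in zip(embeddings, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def nearest_prototype(vec, prototypes):
    """Return the label whose prototype has the smallest squared Euclidean distance to vec."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(prototypes, key=lambda label: sqdist(vec, prototypes[label]))


support = [[0.0, 0.0], [2.0, 0.0], [10.0, 10.0]]
support_labels = ["joy", "joy", "anger"]
protos = class_prototypes(support, support_labels)
predicted = nearest_prototype([9.0, 9.0], protos)
```

A prototype contrastive objective would then pull each sample toward its own class prototype and push it away from the others.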
Federated learning (FL) is a distributed machine learning paradigm for edge cloud computing. FL can facilitate data-driven decision-making in tactical scenarios, effectively addressing both data volume and infrastructure challenges in edge environments. However, the diversity of clients in edge cloud computing presents significant challenges for FL. Personalized federated learning (pFL) has received considerable attention in recent years. One example of pFL involves exploiting the global and local information in the local model. Current pFL algorithms suffer from limitations such as slow convergence, catastrophic forgetting, and poor performance on complex tasks, and they still fall significantly short of centralized learning. To achieve high pFL performance, we propose FedCLCC: Federated Contrastive Learning and Conditional Computing. The core of FedCLCC is the use of contrastive learning and conditional computing. Contrastive learning measures the similarity of feature representations to adjust the local model. Conditional computing separates the global and local information and feeds it to the corresponding heads for global and local handling. Our comprehensive experiments demonstrate that FedCLCC outperforms other state-of-the-art FL algorithms.
Fisheye cameras offer a significantly larger field of view than conventional cameras, making them valuable tools in the field of computer vision. However, their unique optical characteristics often lead to image distortions, which pose challenges for object detection tasks. To address this issue, we propose Yolo-CaSKA (Yolo with Contrastive Learning and Selective Kernel Attention), a novel training method that enhances object detection on fisheye camera images. Each standard image and its corresponding distorted fisheye image form a positive pair, and all other image pairs are used as negative samples. Contrastive learning then guides the distorted images toward the feature vectors of the corresponding normal images, which improves detection accuracy. Additionally, we incorporate the Selective Kernel (SK) attention module to focus on regions prone to false detections, such as image edges and blind spots. The mAP50 on the augmented KITTI dataset is improved by 5.5% over the original Yolov8, while the mAP50 on the WoodScape dataset is improved by 2.6% compared to OmniDet. The results demonstrate the performance of our proposed model for object detection on fisheye images.
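The pairing scheme described for fisheye and standard images matches the usual setup for an InfoNCE-style contrastive loss, sketched below in plain Python. This is a generic illustration, not the Yolo-CaSKA training code: the cosine similarity, the temperature value, and the function names are assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce(normal_feats, fisheye_feats, temperature=0.1):
    """Mean InfoNCE loss over a batch.

    Pair i (normal_feats[i], fisheye_feats[i]) is the positive; every other
    normal feature in the batch acts as a negative for fisheye_feats[i].
    """
    total = 0.0
    for i, anchor in enumerate(fisheye_feats):
        logits = [cosine(anchor, n) / temperature for n in normal_feats]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denom = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denom - logits[i]
    return total / len(fisheye_feats)


normal = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = info_nce(normal, [[1.0, 0.0], [0.0, 1.0]])
mismatched_loss = info_nce(normal, [[0.0, 1.0], [1.0, 0.0]])
```

The loss is small when each fisheye feature is closest to its own normal counterpart and large when the pairs are mismatched, which is the pressure that pulls distorted features toward the corresponding undistorted ones.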
Social media has significantly accelerated the dissemination of information, but it also boosts the propagation of fake news, posing serious challenges to public awareness and social stability. In real-world contexts, the volume of trustworthy information far exceeds that of rumors, resulting in a class imbalance that leads models to prioritize the majority class during training. This focus diminishes the model's ability to recognize minority class samples. Furthermore, models may overfit to these minority samples, further compromising their generalization capabilities. Unlike node-level classification tasks, fake news detection in social networks operates on graph-level samples, for which traditional interpolation and oversampling methods struggle to generate high-quality samples. This challenge complicates the identification of new instances of false information. To address this issue, this paper introduces the FHGraph (Fake News Hunting Graph) framework, which employs a generative data augmentation approach and a latent diffusion model to create graph structures that align with news communication patterns. Using the few-shot learning capabilities of large language models (LLMs), the framework generates diverse texts for minority class nodes. FHGraph comprises a hierarchical multiview graph contrastive learning module, in which two horizontal views and three vertical levels are utilized for self-supervised learning, resulting in better representations. Experimental results show that FHGraph significantly outperforms state-of-the-art (SOTA) graph-level class imbalance methods and SOTA graph-level contrastive learning methods. Specifically, FHGraph achieves a 2% increase in F1 Micro and a 2.5% increase in F1 Macro on the PHEME dataset, as well as a 3.5% improvement in F1 Micro and a 4.3% improvement in F1 Macro on the RumorEval dataset.
In the field of optoelectronics, certain types of data may be difficult to accurately annotate, such as high-resolution optoelectronic imaging or imaging in certain special spectral ranges. Weakly supervised learning can provide a more reliable approach in these situations. Current popular approaches mainly adopt the classification-based class activation maps (CAM) as initial pseudo labels to solve the task.
Although conventional object detection methods achieve high accuracy through extensively annotated datasets, acquiring such large-scale labeled data remains challenging and cost-prohibitive in numerous real-world applications. Few-shot object detection presents a new research direction that aims to localize and classify objects in images using only limited annotated examples. However, the inherent challenge in few-shot object detection is that sample diversity is insufficient to fully characterize the feature distribution, which consequently degrades model performance. Inspired by contrastive learning principles, we propose an Implicit Feature Contrastive Learning (IFCL) module to address this limitation and augment feature diversity for more robust representation learning. This module generates augmented support sample features in a mixed feature space and implicitly contrasts them with query Region of Interest (RoI) features. This approach facilitates more comprehensive learning of both intra-class feature similarity and inter-class feature diversity, thereby enhancing the model's object classification and localization capabilities. Extensive experiments on PASCAL VOC show that our method achieves improvements of 3.2%, 1.8%, and 2.3%, respectively, in the 10-shot setting on the three Novel Sets compared to the baseline model FPD.
Identifying influential users in social networks is of great significance in areas such as public opinion monitoring and commercial promotion. Existing identification methods based on Graph Neural Networks (GNNs) often yield inaccurate features of influential users due to neighborhood aggregation and require a substantial amount of labeled data for training, making them difficult to apply in practice. To address this issue, we propose a semi-supervised contrastive learning method for identifying influential users. First, the proposed method constructs positive and negative samples for contrastive learning based on multiple node centrality metrics related to influence. Then, contrastive learning is employed to guide the encoder to generate various influence-related features for users. Finally, with only a small amount of labeled data, an attention-based user classifier is trained to accurately identify influential users. Experiments conducted on three public social network datasets demonstrate that the proposed method, using only 20% of the labeled data as the training set, achieves F1 values that are 5.9%, 5.8%, and 8.7% higher than those of the unsupervised EVC method, and it matches the performance of GNN-based methods such as DeepInf, InfGCN, and OlapGN, which require 80% of the labeled data as the training set.
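One way to picture centrality-driven sample construction is the sketch below, which ranks nodes by degree centrality and takes the top and bottom of the ranking as positive and negative candidates. Using degree alone, the top-k/bottom-k split, and the function names are all assumptions; the abstract says only that multiple influence-related centrality metrics are used.

```python
def degree_centrality(edges):
    """Degree centrality of each node in an undirected graph given as (u, v) edge pairs."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    n = len(degree)
    if n < 2:
        raise ValueError("need at least two nodes")
    return {node: d / (n - 1) for node, d in degree.items()}


def split_samples(centrality, k):
    """Return (positives, negatives): the k most central and k least central nodes.

    Ties are broken by node name so the result is deterministic.
    """
    ranked = sorted(centrality, key=lambda node: (-centrality[node], node))
    return ranked[:k], ranked[-k:]


edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
scores = degree_centrality(edges)
positives, negatives = split_samples(scores, k=1)
```

Here node "a" touches every other node and becomes the positive candidate, while the peripheral node "d" becomes the negative candidate.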
Graph similarity learning aims to calculate the similarity between pairs of graphs. Existing unsupervised graph similarity learning methods based on contrastive learning rely on random graph augmentation strategies, which can harm the semantic and structural information of graphs, and they overlook the rich structural information present in subgraphs. To address these issues, we propose a graph similarity learning model based on learnable augmentation and multi-level contrastive learning. First, to tackle the problem of random augmentation disrupting the semantics and structure of the graph, we design a learnable augmentation method that selectively chooses nodes and edges within the graph. To enrich the contrastive hierarchy, we employ a biased random walk method to generate corresponding subgraphs. Second, to address the lack of multi-level contrastive learning in previous work, we utilize graph convolutional networks to learn node representations of the augmented views and the original graph and calculate the interaction information between the attribute-augmented and structure-augmented views and the original graph. The goal is to maximize node consistency between different views and learn node matching between different graphs, resulting in node-level representations for each graph. Subgraph representations are then obtained through pooling operations, and we conduct contrastive learning using both node and subgraph representations. Finally, the graph similarity score is computed according to the downstream task. We conducted three sets of experiments across eight datasets, and the results demonstrate that the proposed model effectively mitigates both the damage that random augmentation causes to the original graph's semantics and structure and the insufficiency of contrastive levels. Additionally, the model achieves the best overall performance.
The unsupervised vehicle re-identification task aims at identifying specific vehicles in surveillance videos without utilizing annotation information. Because vehicles are more similar to one another in appearance than pedestrians are, pseudo-labels generated through clustering are ineffective in mitigating the impact of noise, and the gap between inter-class and intra-class feature distances has not been adequately widened. To address these issues, we design a dual contrastive learning method based on knowledge distillation. During each iteration, we utilize a teacher model to randomly partition the entire dataset into two sub-domains based on clustering pseudo-label categories. By conducting contrastive learning between the two student models, we extract more discernible vehicle identity cues to alleviate the problem of imbalanced data distribution. Subsequently, we propose a context-aware pseudo-label refinement strategy that leverages contextual features by progressively associating granularity information from different bottleneck blocks. To produce more trustworthy pseudo-labels and lessen noise interference during the clustering process, context-aware scores are obtained by calculating the similarity between global features and contextual ones and are then added to the pseudo-label encoding process. Extensive experiments on publicly available datasets show that the proposed method performs well in overcoming label noise and optimizing data distribution.
Automated sleep stage classification helps clinical experts treat sleep disorders, as it makes the analysis of whole-night polysomnography (PSG) more time-efficient. However, most existing research has focused only on public databases whose channel systems are incompatible with current clinical measurements. To narrow the gap between theoretical models and real clinical practice, we propose a novel deep learning model that combines the vision transformer with supervised contrastive learning to achieve efficient sleep stage classification. Experimental results show that the model facilitates an easier classification of multi-channel PSG signals. The mean F1-scores of 79.2% and 76.5% on two public databases outperform previous studies, showing the model's strong capability, and the proposed method also achieves a high mean accuracy of 88.6% on a small children's database. Our proposed model is validated not only on the public databases but also on the provided clinical database to strictly evaluate its clinical usage in practice.
Few-shot point cloud 3D object detection (FS3D) aims to identify and locate objects of novel classes within point clouds using knowledge acquired from annotated base classes and a minimal number of samples from the novel classes. Due to imbalanced training data, existing FS3D methods based on fully supervised learning can overfit to base classes, which impairs the network's ability to generalize knowledge learned from base classes to novel classes and also prevents the network from extracting distinctive foreground and background representations for novel class objects. To address these issues, this thesis proposes a category-agnostic contrastive learning approach, which enhances the generalization and identification abilities for almost unseen categories through the construction of pseudo-labels and positive-negative sample pairs unrelated to specific classes. Firstly, this thesis designs a proposal-wise context contrastive module (CCM). By reducing the distance between foreground point features and increasing the distance between foreground and background point features within a region proposal, CCM helps the network extract more discriminative foreground and background feature representations without relying on categorical annotations. Secondly, this thesis utilizes a geometric contrastive module (GCM), which enhances the network's geometric perception capability by applying contrastive learning to the foreground point features associated with various basic geometric components, such as edges, corners, and surfaces, thereby enabling these geometric components to exhibit more distinguishable representations. This thesis also combines category-aware contrastive learning with the former modules to maintain categorical distinctiveness. Extensive experimental results on the FS-SUNRGBD and FS-ScanNet datasets demonstrate the effectiveness of this method, with average precision exceeding the baseline by up to 8%.
Clothing attribute recognition has become an essential technology that enables users to automatically identify the characteristics of clothes and search for clothing images with similar attributes. However, existing methods cannot recognize newly added attributes and may fail to capture region-level visual features. To address these issues, a region-aware fashion contrastive language-image pre-training (RaF-CLIP) model was proposed. This model aligned cropped and segmented images with category texts and multiple fine-grained attribute texts, matching fashion regions to their corresponding texts through contrastive learning. Clothing retrieval found suitable clothing based on user-specified clothing categories and attributes. To further improve retrieval accuracy, an attribute-guided composed network (AGCN) was introduced as an additional component on RaF-CLIP, specifically designed for composed image retrieval. This task aimed to modify the reference image based on textual expressions to retrieve the expected target. By adopting a transformer-based bidirectional attention and gating mechanism, AGCN fused and selected image features and attribute text features. Experimental results show that the proposed model achieves a mean precision of 0.6633 for attribute recognition tasks and a recall@10 of 39.18 for the composed image retrieval task (recall@k is defined as the percentage of correct samples appearing in the top k retrieval results), satisfying user needs for freely searching for clothing through images and texts.
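The recall@k metric defined in the abstract can be computed as in the sketch below, under the common assumption that each query has exactly one correct target; the function name and the toy data are illustrative only.

```python
def recall_at_k(ranked_results, targets, k):
    """Percentage of queries whose correct target appears in the top-k retrieved items.

    ranked_results[i] is the ranked list returned for query i; targets[i] is its
    single correct item (an assumption of this sketch).
    """
    if len(ranked_results) != len(targets) or not targets:
        raise ValueError("need one non-empty ranking per target")
    hits = sum(1 for ranked, target in zip(ranked_results, targets) if target in ranked[:k])
    return 100.0 * hits / len(targets)


rankings = [
    ["x", "y", "z"],
    ["y", "x", "z"],
    ["z", "y", "x"],
    ["x", "z", "y"],
]
targets = ["y", "z", "z", "y"]
score_at_2 = recall_at_k(rankings, targets, k=2)
```

With these toy rankings, two of the four targets fall inside the top 2, so `score_at_2` is 50.0.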
Deep multi-view subspace clustering (DMVSC) based on self-expression has attracted increasing attention due to its outstanding performance and nonlinear application. However, most existing methods neglect that view-private meaningless information or noise may interfere with the learning of self-expression, which may degrade clustering performance. In this paper, we propose a novel framework of Contrastive Consistency and Attentive Complementarity (CCAC) for DMVSC. CCAC aligns all the self-expressions of multiple views and fuses them based on their discrimination, so that it can effectively explore consistent and complementary information for achieving precise clustering. Specifically, the view-specific self-expression is learned by a self-expression layer embedded into the auto-encoder network for each view. To guarantee consistency across views and reduce the effect of view-private information or noise, we align all the view-specific self-expressions by contrastive learning. The aligned self-expressions are assigned adaptive weights by a channel attention mechanism according to their discrimination. They are then fused by a convolution kernel to obtain a consensus self-expression with maximum complementarity of multiple views. Extensive experimental results on four benchmark datasets and one large-scale dataset show that the CCAC method outperforms other state-of-the-art methods, demonstrating its clustering effectiveness.
Unsupervised learning methods such as graph contrastive learning have been used for dynamic graph representation learning to eliminate the dependence on labels. However, existing studies neglect positional information when learning discrete snapshots, resulting in insufficient network topology learning. At the same time, due to the lack of appropriate data augmentation methods, it is difficult to capture the evolving patterns of the network effectively. To address these problems, a position-aware and subgraph-enhanced dynamic graph contrastive learning method is proposed for discrete-time dynamic graphs. Firstly, a global snapshot is built from the historical snapshots to express the stable pattern of the dynamic graph, and random walks are used to obtain the position representation by learning the positional information of the nodes. Secondly, a new data augmentation method is designed from the perspectives of short-term changes and long-term stable structures of dynamic graphs. Specifically, subgraph sampling based on snapshots and global snapshots is used to obtain two structural augmentation views, and node structures and evolving patterns are learned by combining a graph neural network, a gated recurrent unit, and an attention mechanism. Finally, the quality of the node representation is improved by combining contrastive learning between different structural augmentation views with contrastive learning between the structure and position representations. Experimental results on four real datasets show that the proposed method performs better than existing unsupervised methods and is competitive with supervised learning methods under a semi-supervised setting.
Knowledge graphs can help improve recommendation performance and are widely applied in various personalized recommendation domains. However, existing knowledge-aware recommendation methods face challenges such as weak user-item interaction supervisory signals and noise in the knowledge graph. To tackle these issues, this paper proposes a neighbor information contrast-enhanced recommendation method that adds subtle noise to construct contrast views and employs contrastive learning to strengthen supervisory signals and reduce knowledge noise. Specifically, this paper first adopts heterogeneous propagation and knowledge-aware attention networks to obtain multi-order neighbor embeddings of users and items, mining their high-order neighbor information. Next, this paper introduces weak noise following a uniform distribution into the neighbor information to construct neighbor contrast views, effectively reducing the time overhead of view construction. This paper then performs contrastive learning between neighbor views to promote the uniformity of view information and adjust the neighbor structure, thereby reducing the knowledge noise in the knowledge graph. Finally, this paper introduces multi-task learning to mitigate the problem of weak supervisory signals. To validate the effectiveness of our method, experiments are conducted on the MovieLens-1M, MovieLens-20M, Book-Crossing, and Last-FM datasets. The results show that, compared to the best baselines, our method achieves significant improvements in AUC and F1.
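The noise-based view construction mentioned in the recommendation abstract can be pictured with the sketch below, which perturbs each coordinate of an embedding with small uniform noise to obtain two contrast views. The per-coordinate noise, the bound `eps`, and the function name are assumptions; the abstract states only that weak, uniformly distributed noise is added to the neighbor information.

```python
import random


def noisy_view(embedding, eps=0.1, rng=None):
    """Return a copy of the embedding with uniform noise in [-eps, eps] added to each coordinate."""
    rng = rng or random.Random()
    return [value + rng.uniform(-eps, eps) for value in embedding]


rng = random.Random(0)  # seeded only to make the example repeatable
neighbor_embedding = [1.0, 2.0, 3.0]
view_a = noisy_view(neighbor_embedding, eps=0.1, rng=rng)
view_b = noisy_view(neighbor_embedding, eps=0.1, rng=rng)
```

Because each view is a cheap perturbation of the same embedding, no graph resampling is needed, which is consistent with the reduced view-construction overhead the abstract reports.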
Funding: Supported by the Research Grant Fund from Kwangwoon University in 2023, the National Natural Science Foundation of China under Grant 62311540155, the Taishan Scholars Project Special Funds (tsqn202312035), and the open research foundation of the State Key Laboratory of Integrated Chips and Systems.
文摘Wearable wristband systems leverage deep learning to revolutionize hand gesture recognition in daily activities.Unlike existing approaches that often focus on static gestures and require extensive labeled data,the proposed wearable wristband with selfsupervised contrastive learning excels at dynamic motion tracking and adapts rapidly across multiple scenarios.It features a four-channel sensing array composed of an ionic hydrogel with hierarchical microcone structures and ultrathin flexible electrodes,resulting in high-sensitivity capacitance output.Through wireless transmission from a Wi-Fi module,the proposed algorithm learns latent features from the unlabeled signals of random wrist movements.Remarkably,only few-shot labeled data are sufficient for fine-tuning the model,enabling rapid adaptation to various tasks.The system achieves a high accuracy of 94.9%in different scenarios,including the prediction of eight-direction commands,and air-writing of all numbers and letters.The proposed method facilitates smooth transitions between multiple tasks without the need for modifying the structure or undergoing extensive task-specific training.Its utility has been further extended to enhance human–machine interaction over digital platforms,such as game controls,calculators,and three-language login systems,offering users a natural and intuitive way of communication.
Abstract: The rapid integration of Internet of Things (IoT) technologies is reshaping the global energy landscape by deploying smart meters that enable high-resolution consumption monitoring, two-way communication, and advanced metering infrastructure services. However, this digital transformation also exposes power systems to evolving threats, ranging from cyber intrusions and electricity theft to device malfunctions. The unpredictable nature of these anomalies, coupled with the scarcity of labeled fault data, makes real-time detection exceptionally challenging. To address these difficulties, a real-time decision support framework is presented for smart meter anomaly detection that leverages rolling time windows and two self-supervised contrastive learning modules. The first module synthesizes diverse negative samples to overcome the lack of labeled anomalies, while the second captures intrinsic temporal patterns for enhanced contextual discrimination. The end-to-end framework continuously updates its model with rolling-updated meter data to deliver timely identification of emerging abnormal behaviors in evolving grids. Extensive evaluations on eight publicly available smart meter datasets across seven diverse abnormal patterns demonstrate the effectiveness of the proposed full framework, which achieves an average recall and F1 score of more than 0.85.
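The rolling time windows mentioned in this abstract can be illustrated with a minimal sketch. The function name `rolling_windows`, the window size, the step, and the sample readings are all hypothetical; the abstract does not specify how the windows are built or how long they are.

```python
from typing import Iterator, List, Sequence


def rolling_windows(readings: Sequence[float], size: int, step: int = 1) -> Iterator[List[float]]:
    """Yield successive fixed-length windows over a sequence of meter readings."""
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    # The last valid start index is len(readings) - size.
    for start in range(0, len(readings) - size + 1, step):
        yield list(readings[start:start + size])


# Hypothetical consumption readings (kWh); the values are made up for illustration.
readings = [0.4, 0.5, 0.3, 0.6, 2.9, 0.5]
windows = list(rolling_windows(readings, size=4, step=1))
for window in windows:
    print(window)
```

Each window would then be passed to whatever encoder the framework uses; as new readings arrive, only the newest windows need to be scored.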
Funding: M. Faheem is supported by VTT Technical Research Center of Finland.
Abstract: Neural machine translation (NMT) has advanced with deep learning and large-scale multilingual models, yet low-resource languages often lack sufficient training data, which leads to hallucinations: translated content that diverges significantly from the source text. This research proposes a refined Contrastive Decoding (CD) algorithm that dynamically adjusts the weights of log probabilities from strong expert and weak amateur models to mitigate hallucinations in low-resource NMT and improve translation quality. Advanced large language NMT models, including ChatGLM and LLaMA, are fine-tuned and implemented for their superior contextual understanding and cross-lingual capabilities. The refined CD algorithm evaluates multiple candidate translations using BLEU score, semantic similarity, and Named Entity Recognition accuracy. Extensive experimental results show substantial improvements in translation quality and a significant reduction in hallucination rates. Fine-tuned models achieve higher evaluation metrics than baseline and state-of-the-art models. An ablation study confirms the contribution of each methodological component and highlights the effectiveness of the refined CD algorithm and advanced models in mitigating hallucinations. Notably, the refined methodology increased the BLEU score by approximately 30% compared to baseline models.
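To make the idea of weighting expert and amateur log probabilities concrete, here is a minimal sketch of one contrastive decoding step. It is not the paper's refined algorithm: the weights `expert_weight` and `amateur_weight` are fixed parameters here (the abstract says they are adjusted dynamically but does not say how), the plausibility cutoff `alpha` follows the commonly used contrastive decoding formulation, and the token distributions are invented.

```python
import math
from typing import Dict


def contrastive_scores(
    expert_logp: Dict[str, float],
    amateur_logp: Dict[str, float],
    expert_weight: float = 1.0,   # assumption: fixed here, dynamic in the paper
    amateur_weight: float = 1.0,  # assumption: fixed here, dynamic in the paper
    alpha: float = 0.1,
) -> Dict[str, float]:
    """Score next-token candidates by contrasting expert and amateur log probabilities."""
    # Plausibility constraint: keep tokens with p_expert >= alpha * max(p_expert).
    cutoff = math.log(alpha) + max(expert_logp.values())
    scores = {}
    for token, lp_expert in expert_logp.items():
        if lp_expert < cutoff:
            continue  # implausible under the expert, so never selected
        scores[token] = expert_weight * lp_expert - amateur_weight * amateur_logp[token]
    return scores


# Hypothetical next-token distributions over a three-token vocabulary.
expert = {"a": math.log(0.50), "b": math.log(0.45), "c": math.log(0.05)}
amateur = {"a": math.log(0.60), "b": math.log(0.20), "c": math.log(0.20)}

scores = contrastive_scores(expert, amateur, alpha=0.2)
best = max(scores, key=scores.get)
print(best, scores)
```

In this toy case the expert alone would pick "a", but the contrast favors "b", the token the expert rates much higher than the amateur does, while "c" is filtered out by the plausibility cutoff.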
Funding: Granted by the Qin Xin Talents Cultivation Program (No. QXTCP C202115), Beijing Information Science & Technology University; the Beijing Advanced Innovation Center for Future Blockchain and Privacy Computing Fund (No. GJJ-23); and the National Social Science Foundation, China (No. 21BTQ079).
Abstract: Sarcasm detection is a complex and challenging task, particularly in the context of Chinese social media, where it exhibits strong contextual dependencies and cultural specificity. To address the limitations of existing methods in capturing the implicit semantics and contextual associations in sarcastic expressions, this paper proposes an event-aware model for Chinese sarcasm detection, leveraging a multi-head attention (MHA) mechanism and contrastive learning (CL) strategies. The proposed model employs a dual-path Bidirectional Encoder Representations from Transformers (BERT) encoder to process comment text and event context separately and integrates an MHA mechanism to facilitate deep interactions between the two, thereby capturing multidimensional semantic associations. Additionally, a CL strategy is introduced to enhance feature representation capabilities, further improving the model’s performance in handling class imbalance and complex contextual scenarios. The model achieves state-of-the-art performance on the Chinese sarcasm dataset, with significant improvements in accuracy (79.55%), F1-score (84.22%), and area under the curve (AUC, 84.35%).
Funding: Supported by the Scientific Research and Innovation Team Program of Sichuan University of Science and Technology (No. SUSE652A006) and the Sichuan Key Provincial Research Base of Intelligent Tourism (ZHYJ22-03). It is also listed as a project of the Sichuan Provincial Science and Technology Programme (2022YFG0028).
Abstract: Emotion recognition plays a crucial role in various fields and is a key task in natural language processing (NLP). The objective is to identify and interpret emotional expressions in text. However, traditional emotion recognition approaches often struggle in few-shot cross-domain scenarios due to their limited capacity to generalize semantic features across different domains. Additionally, these methods face challenges in accurately capturing complex emotional states, particularly those that are subtle or implicit. To overcome these limitations, we introduce a novel approach called Dual-Task Contrastive Meta-Learning (DTCML). This method combines meta-learning and contrastive learning to improve emotion recognition. Meta-learning enhances the model’s ability to generalize to new emotional tasks, while instance contrastive learning further refines the model by distinguishing unique features within each category, enabling it to better differentiate complex emotional expressions. Prototype contrastive learning, in turn, helps the model address the semantic complexity of emotions across different domains, enabling it to learn fine-grained emotional expressions. By leveraging dual tasks, DTCML learns from two domains simultaneously, which encourages the model to learn more diverse and generalizable emotional features and thereby improves its cross-domain adaptability and robustness. We evaluated the performance of DTCML across four cross-domain settings, and the results show that our method outperforms the best baseline by 5.88%, 12.04%, 8.49%, and 8.40% in terms of accuracy.
Funding: Supported by the Natural Science Foundation of Xinjiang Uygur Autonomous Region (Grant No. 2022D01B 187).
Abstract: Federated learning (FL) is a distributed machine learning paradigm for edge cloud computing. FL can facilitate data-driven decision-making in tactical scenarios, effectively addressing both data volume and infrastructure challenges in edge environments. However, the diversity of clients in edge cloud computing presents significant challenges for FL. Personalized federated learning (pFL) has received considerable attention in recent years. One example of pFL involves exploiting the global and local information in the local model. Current pFL algorithms suffer from limitations such as slow convergence, catastrophic forgetting, and poor performance on complex tasks, and they still fall significantly short of centralized learning. To achieve high pFL performance, we propose FedCLCC: Federated Contrastive Learning and Conditional Computing. The core of FedCLCC is the use of contrastive learning and conditional computing. Contrastive learning determines the feature representation similarity to adjust the local model. Conditional computing separates the global and local information and feeds it to their corresponding heads for global and local handling. Our comprehensive experiments demonstrate that FedCLCC outperforms other state-of-the-art FL algorithms.
Abstract: Fisheye cameras offer a significantly larger field of view compared to conventional cameras, making them valuable tools in the field of computer vision. However, their unique optical characteristics often lead to image distortions, which pose challenges for object detection tasks. To address this issue, we propose Yolo-CaSKA (Yolo with Contrastive Learning and Selective Kernel Attention), a novel training method that enhances object detection on fisheye camera images. Each standard image and its corresponding distorted fisheye image form a positive pair, and all other image pairs serve as negative samples; contrastive learning then guides the features of distorted images toward the feature vectors of the corresponding normal images, improving detection accuracy. Additionally, we incorporate the Selective Kernel (SK) attention module to focus on regions prone to false detections, such as image edges and blind spots. Finally, the mAP50 on the augmented KITTI dataset is improved by 5.5% over the original Yolov8, while the mAP50 on the WoodScape dataset is improved by 2.6% compared to OmniDet. The results demonstrate the performance of our proposed model for object detection on fisheye images.
Funding: Supported by the National Key R&D Program of China (Grant No. 2022YFB3104601) and the Big Data Computing Center of Southeast University.
Abstract: Social media has significantly accelerated the dissemination of information, but it also boosts the propagation of fake news, posing serious challenges to public awareness and social stability. In real-world contexts, the volume of trustworthy information far exceeds that of rumors, resulting in a class imbalance that leads models to prioritize the majority class during training. This focus diminishes the model’s ability to recognize minority class samples. Furthermore, models may overfit when encountering these minority samples, further compromising their generalization capabilities. Unlike node-level classification tasks, fake news detection in social networks operates on graph-level samples, where traditional interpolation and oversampling methods struggle to generate high-quality graph-level samples. This challenge complicates the identification of new instances of false information. To address this issue, this paper introduces the FHGraph (Fake News Hunting Graph) framework, which employs a generative data augmentation approach and a latent diffusion model to create graph structures that align with news communication patterns. Using the few-sample learning capabilities of large language models (LLMs), the framework generates diverse texts for minority class nodes. FHGraph comprises a hierarchical multiview graph contrastive learning module, in which two horizontal views and three vertical levels are utilized for self-supervised learning, resulting in more optimized representations. Experimental results show that FHGraph significantly outperforms state-of-the-art (SOTA) graph-level class imbalance methods and SOTA graph-level contrastive learning methods. Specifically, FHGraph achieves a 2% increase in F1 Micro and a 2.5% increase in F1 Macro on the PHEME dataset, as well as a 3.5% improvement in F1 Micro and a 4.3% improvement in F1 Macro on the RumorEval dataset.
Abstract: In the field of optoelectronics, certain types of data may be difficult to accurately annotate, such as high-resolution optoelectronic imaging or imaging in certain special spectral ranges. Weakly supervised learning can provide a more reliable approach in these situations. Current popular approaches mainly adopt the classification-based class activation maps (CAM) as initial pseudo labels to solve the task.
Funding: Funded by the China Chongqing Municipal Science and Technology Bureau, grant numbers CSTB2024TIAD-CYKJCXX0009, CSTB2024NSCQ-LZX0043, and CSTB2022NSCQ-MSX0288; the Chongqing Municipal Commission of Housing and Urban-Rural Development, grant number CKZ2024-87; the Chongqing University of Technology Graduate Education High-Quality Development Project, grant number gzlsz202401; the Chongqing University of Technology-Chongqing LINGLUE Technology Co., Ltd. Electronic Information (Artificial Intelligence) Graduate Joint Training Base; the Postgraduate Education and Teaching Reform Research Project in Chongqing, grant number yjg213116; and the Chongqing University of Technology-CISDI Chongqing Information Technology Co., Ltd. Computer Technology Graduate Joint Training Base.
Abstract: Although conventional object detection methods achieve high accuracy through extensively annotated datasets, acquiring such large-scale labeled data remains challenging and cost-prohibitive in numerous real-world applications. Few-shot object detection presents a new research idea that aims to localize and classify objects in images using only limited annotated examples. However, the inherent challenge in few-shot object detection lies in the insufficient sample diversity to fully characterize the sample feature distribution, which consequently impacts model performance. Inspired by contrastive learning principles, we propose an Implicit Feature Contrastive Learning (IFCL) module to address this limitation and augment feature diversity for more robust representational learning. This module generates augmented support sample features in a mixed feature space and implicitly contrasts them with query Region of Interest (RoI) features. This approach facilitates more comprehensive learning of both intra-class feature similarity and inter-class feature diversity, thereby enhancing the model’s object classification and localization capabilities. Extensive experiments on PASCAL VOC show that our method achieves improvements of 3.2%, 1.8%, and 2.3%, respectively, in the 10-shot setting on the three Novel Sets compared to the baseline model FPD.
Funding: Supported by the National Key Project of the National Natural Science Foundation of China under Grant No. U23A20305.
Abstract: Identifying influential users in social networks is of great significance in areas such as public opinion monitoring and commercial promotion. Existing identification methods based on Graph Neural Networks (GNNs) often yield inaccurate features of influential users due to neighborhood aggregation and require a substantial amount of labeled data for training, making them difficult to apply in practice. To address this issue, we propose a semi-supervised contrastive learning method for identifying influential users. First, the proposed method constructs positive and negative samples for contrastive learning based on multiple node centrality metrics related to influence; then, contrastive learning is employed to guide the encoder to generate various influence-related features for users; finally, with only a small amount of labeled data, an attention-based user classifier is trained to accurately identify influential users. Experiments conducted on three public social network datasets demonstrate that the proposed method, using only 20% of the labeled data as the training set, achieves F1 values that are 5.9%, 5.8%, and 8.7% higher than those of the unsupervised EVC method, and it matches the performance of GNN-based methods such as DeepInf, InfGCN, and OlapGN, which require 80% of the labeled data as the training set.
Abstract: Graph similarity learning aims to calculate the similarity between pairs of graphs. Existing unsupervised graph similarity learning methods based on contrastive learning encounter challenges related to random graph augmentation strategies, which can harm the semantic and structural information of graphs and overlook the rich structural information present in subgraphs. To address these issues, we propose a graph similarity learning model based on learnable augmentation and multi-level contrastive learning. First, to tackle the problem of random augmentation disrupting the semantics and structure of the graph, we design a learnable augmentation method to selectively choose nodes and edges within the graph. To enhance contrastive levels, we employ a biased random walk method to generate corresponding subgraphs, enriching the contrastive hierarchy. Second, to address the lack of multi-level contrastive learning in previous work, we utilize graph convolutional networks to learn node representations of the augmented views and the original graph, and calculate the interaction information between the attribute-augmented and structure-augmented views and the original graph. The goal is to maximize node consistency between different views and learn node matching between different graphs, resulting in node-level representations for each graph. Subgraph representations are then obtained through pooling operations, and we conduct contrastive learning utilizing both node and subgraph representations. Finally, the graph similarity score is computed according to different downstream tasks. We conducted three sets of experiments across eight datasets, and the results demonstrate that the proposed model effectively mitigates the issues of random augmentation damaging the original graph’s semantics and structure, as well as the insufficiency of contrastive levels. Additionally, the model achieves the best overall performance.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 62461037, 62076117, and 62166026; the Jiangxi Provincial Natural Science Foundation under Grant Nos. 20224BAB212011, 20232BAB202051, 20232BAB212008, and 20242BAB25078; and the Jiangxi Provincial Key Laboratory of Virtual Reality under Grant No. 2024SSY03151.
Abstract: The unsupervised vehicle re-identification task aims at identifying specific vehicles in surveillance videos without utilizing annotation information. Because vehicles resemble one another in appearance more closely than pedestrians do, pseudo-labels generated through clustering are ineffective in mitigating the impact of noise, and the feature distance between inter-class and intra-class samples has not been adequately improved. To address these issues, we design a dual contrastive learning method based on knowledge distillation. During each iteration, we utilize a teacher model to randomly partition the entire dataset into two sub-domains based on clustering pseudo-label categories. By conducting contrastive learning between the two student models, we extract more discernible vehicle identity cues to alleviate the problem of imbalanced data distribution. Subsequently, we propose a context-aware pseudo-label refinement strategy that leverages contextual features by progressively associating granularity information from different bottleneck blocks. To produce more trustworthy pseudo-labels and lessen noise interference during the clustering process, the context-aware scores are obtained by calculating the similarity between global features and contextual ones, and are subsequently added to the pseudo-label encoding process. Extensive experiments on publicly available datasets show that the proposed method performs well in overcoming label noise and optimizing data distribution.
Funding: The National Natural Science Foundation of China (No. 52375254), the Interdisciplinary Program of Shanghai Jiao Tong University (No. 21X010301670), and the Open Project Program of SJTU-Pinghu Institute of Intelligent Optoelectronics (No. 2022SPIOE104).
Abstract: Automated sleep stage classification helps clinical experts treat sleep disorders, as it is more time-efficient than manual analysis of whole-night polysomnography (PSG). However, most existing research has focused only on public databases with channel systems incompatible with current clinical measurements. To narrow the gap between theoretical models and real clinical practice, we propose a novel deep learning model that combines the vision transformer with supervised contrastive learning to achieve efficient sleep stage classification. Experimental results show that the model facilitates an easier classification of multi-channel PSG signals. The mean F1-scores of 79.2% and 76.5% on two public databases outperform previous studies, showing the model’s great capability, and the proposed method also reaches a high mean accuracy of 88.6% on the small children’s database. Our proposed model is validated not only on the public databases but also on the provided clinical database to strictly evaluate its clinical usage in practice.
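As an illustration of supervised contrastive learning, the sketch below computes a supervised contrastive loss in which samples sharing a label act as positives for one another. The abstract does not state the exact loss, so the widely used SupCon-style formulation is assumed here; the function names, the temperature, and the toy embeddings are hypothetical.

```python
import math
from typing import List, Sequence


def normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def supervised_contrastive_loss(
    embeddings: Sequence[Sequence[float]],
    labels: Sequence[int],
    temperature: float = 0.1,
) -> float:
    """Average SupCon-style loss over all anchors that have at least one positive."""
    z = [normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Temperature-scaled cosine similarity to every other sample.
        sims = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n) if j != i
        }
        log_denominator = math.log(sum(math.exp(s) for s in sims.values()))
        total += -sum(sims[j] - log_denominator for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy 2-D embeddings forming two tight clusters (values are made up).
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_matching = supervised_contrastive_loss(embeddings, labels=[0, 0, 1, 1])
loss_mismatched = supervised_contrastive_loss(embeddings, labels=[0, 1, 0, 1])
print(loss_matching, loss_mismatched)
```

The loss is low when labels agree with the cluster structure and high when they cut across it, which is the pressure that pulls same-stage samples together during training.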
Abstract: Few-shot point cloud 3D object detection (FS3D) aims to identify and locate objects of novel classes within point clouds using knowledge acquired from annotated base classes and a minimal number of samples from the novel classes. Due to imbalanced training data, existing FS3D methods based on fully supervised learning can overfit to base classes, which impairs the network’s ability to generalize knowledge learned from base classes to novel classes and also prevents the network from extracting distinctive foreground and background representations for novel class objects. To address these issues, this thesis proposes a category-agnostic contrastive learning approach, enhancing the generalization and identification abilities for almost unseen categories through the construction of pseudo-labels and positive-negative sample pairs unrelated to specific classes. Firstly, this thesis designs a proposal-wise context contrastive module (CCM). By reducing the distance between foreground point features and increasing the distance between foreground and background point features within a region proposal, CCM aids the network in extracting more discriminative foreground and background feature representations without reliance on categorical annotations. Secondly, this thesis utilizes a geometric contrastive module (GCM), which enhances the network’s geometric perception capability by employing contrastive learning on the foreground point features associated with various basic geometric components, such as edges, corners, and surfaces, thereby enabling these geometric components to exhibit more distinguishable representations. This thesis also combines category-aware contrastive learning with the former modules to maintain categorical distinctiveness. Extensive experimental results on the FS-SUNRGBD and FS-ScanNet datasets demonstrate the effectiveness of this method, with average precision exceeding the baseline by up to 8%.
Funding: National Natural Science Foundation of China (No. 61971121).
Abstract: Clothing attribute recognition has become an essential technology, which enables users to automatically identify the characteristics of clothes and search for clothing images with similar attributes. However, existing methods cannot recognize newly added attributes and may fail to capture region-level visual features. To address these issues, a region-aware fashion contrastive language-image pre-training (RaF-CLIP) model was proposed. This model aligned cropped and segmented images with category and multiple fine-grained attribute texts, achieving the matching of fashion regions and corresponding texts through contrastive learning. Clothing retrieval found suitable clothing based on the user-specified clothing categories and attributes, and to further improve the accuracy of retrieval, an attribute-guided composed network (AGCN) was introduced as an additional component on RaF-CLIP, specifically designed for composed image retrieval. This task aimed to modify the reference image based on textual expressions to retrieve the expected target. By adopting a transformer-based bidirectional attention and gating mechanism, it realized the fusion and selection of image features and attribute text features. Experimental results show that the proposed model achieves a mean precision of 0.6633 for attribute recognition tasks and a recall@10 (recall@k is defined as the percentage of correct samples appearing in the top k retrieval results) of 39.18 for the composed image retrieval task, satisfying user needs for freely searching for clothing through images and texts.
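The recall@k definition quoted in this abstract can be made concrete with a short sketch. It assumes one correct target per query, which is one reading of the definition; the function name `recall_at_k` and the toy rankings are hypothetical.

```python
from typing import Sequence


def recall_at_k(ranked_results: Sequence[Sequence[str]], targets: Sequence[str], k: int = 10) -> float:
    """Percentage of queries whose correct target appears among the top k results."""
    hits = sum(
        1 for ranking, target in zip(ranked_results, targets)
        if target in ranking[:k]
    )
    return 100.0 * hits / len(targets)


# Four hypothetical queries, each with a ranked list of retrieved image ids.
rankings = [
    ["a", "b", "c"],
    ["d", "e", "f"],
    ["g", "h", "i"],
    ["j", "k", "l"],
]
targets = ["b", "f", "z", "j"]

print(recall_at_k(rankings, targets, k=2))  # 2 of 4 targets are in the top 2 -> 50.0
print(recall_at_k(rankings, targets, k=3))  # 3 of 4 targets are in the top 3 -> 75.0
```

Under this reading, the reported recall@10 of 39.18 would mean that the expected target appears in the top 10 results for about 39% of queries.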
Abstract: Deep multi-view subspace clustering (DMVSC) based on self-expression has attracted increasing attention due to its outstanding performance and nonlinear application. However, most existing methods neglect that view-private meaningless information or noise may interfere with the learning of self-expression, which may lead to the degeneration of clustering performance. In this paper, we propose a novel framework of Contrastive Consistency and Attentive Complementarity (CCAC) for DMVSC. CCAC aligns all the self-expressions of multiple views and fuses them based on their discrimination, so that it can effectively explore consistent and complementary information for achieving precise clustering. Specifically, the view-specific self-expression is learned by a self-expression layer embedded into the auto-encoder network for each view. To guarantee consistency across views and reduce the effect of view-private information or noise, we align all the view-specific self-expressions by contrastive learning. The aligned self-expressions are assigned adaptive weights by a channel attention mechanism according to their discrimination. They are then fused by a convolution kernel to obtain a consensus self-expression with maximum complementarity of multiple views. Extensive experiments on four benchmark datasets and one large-scale dataset show that CCAC outperforms other state-of-the-art methods, demonstrating its clustering effectiveness.
Abstract: Unsupervised learning methods such as graph contrastive learning have been used for dynamic graph representation learning to eliminate the dependence on labels. However, existing studies neglect positional information when learning discrete snapshots, resulting in insufficient network topology learning. At the same time, due to the lack of appropriate data augmentation methods, it is difficult to capture the evolving patterns of the network effectively. To address the above problems, a position-aware and subgraph-enhanced dynamic graph contrastive learning method is proposed for discrete-time dynamic graphs. Firstly, the global snapshot is built based on the historical snapshots to express the stable pattern of the dynamic graph, and the random walk is used to obtain the position representation by learning the positional information of the nodes. Secondly, a new data augmentation method is carried out from the perspectives of short-term changes and long-term stable structures of dynamic graphs. Specifically, subgraph sampling based on snapshots and global snapshots is used to obtain two structural augmentation views, and node structures and evolving patterns are learned by combining a graph neural network, a gated recurrent unit, and an attention mechanism. Finally, the quality of node representation is improved by combining the contrastive learning between different structural augmentation views and between the two representations of structure and position. Experimental results on four real datasets show that the proposed method performs better than existing unsupervised methods and is competitive with supervised learning methods under a semi-supervised setting.
Funding: Supported by the Natural Science Foundation of Ningxia Province (No. 2023AAC03316), the Ningxia Hui Autonomous Region Education Department Higher Education Key Scientific Research Project (No. NYG2022051), and the North Minzu University Graduate Innovation Project (YCX23146).
Abstract: Knowledge graphs can help improve recommendation performance and are widely applied in various personalized recommendation domains. However, existing knowledge-aware recommendation methods face challenges such as weak user-item interaction supervisory signals and noise in the knowledge graph. To tackle these issues, this paper proposes a neighbor information contrast-enhanced recommendation method that adds subtle noise to construct contrast views and employs contrastive learning to strengthen supervisory signals and reduce knowledge noise. Specifically, this paper first adopts heterogeneous propagation and knowledge-aware attention networks to obtain multi-order neighbor embeddings of users and items, mining their high-order neighbor information. Next, this paper introduces weak noise following a uniform distribution into the neighbor information to construct neighbor contrast views, effectively reducing the time overhead of view construction. This paper then performs contrastive learning between neighbor views to promote the uniformity of view information, adjusting the neighbor structure and reducing the knowledge noise in the knowledge graph. Finally, this paper introduces multi-task learning to mitigate the problem of weak supervisory signals. To validate the effectiveness of our method, experiments are conducted on the MovieLens-1M, MovieLens-20M, Book-Crossing, and Last-FM datasets. The results show that, compared to the best baselines, our method achieves significant improvements in AUC and F1.
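The view construction described in this abstract, adding weak uniform noise to embeddings and contrasting the resulting views, can be sketched as follows. This is an illustration under stated assumptions rather than the paper's method: the noise range `epsilon`, the InfoNCE-style loss, the temperature, and the toy embeddings are all hypothetical choices that the abstract does not specify.

```python
import math
import random
from typing import List, Sequence


def add_uniform_noise(embedding: Sequence[float], epsilon: float, rng: random.Random) -> List[float]:
    """Build a contrast view by adding noise drawn from U(-epsilon, epsilon) to each dimension."""
    return [x + rng.uniform(-epsilon, epsilon) for x in embedding]


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def info_nce(view_a: Sequence[Sequence[float]], view_b: Sequence[Sequence[float]], temperature: float = 0.2) -> float:
    """InfoNCE loss: row i of view_a should match row i of view_b against all other rows."""
    n = len(view_a)
    loss = 0.0
    for i in range(n):
        logits = [cosine(view_a[i], view_b[j]) / temperature for j in range(n)]
        log_denominator = math.log(sum(math.exp(value) for value in logits))
        loss += log_denominator - logits[i]
    return loss / n


rng = random.Random(0)  # fixed seed so the example is reproducible
# Hypothetical neighbor embeddings for three nodes.
embeddings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
view_a = [add_uniform_noise(e, 0.05, rng) for e in embeddings]
view_b = [add_uniform_noise(e, 0.05, rng) for e in embeddings]

aligned_loss = info_nce(view_a, view_b)
shuffled_loss = info_nce(view_a, view_b[1:] + view_b[:1])
print(aligned_loss, shuffled_loss)
```

Because the views come from cheap element-wise noise rather than graph edits such as edge dropping, building them costs only one pass over the embeddings, which is consistent with the reduced view-construction overhead the abstract mentions.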