Wearable wristband systems leverage deep learning to revolutionize hand gesture recognition in daily activities.Unlike existing approaches that often focus on static gestures and require extensive labeled data,the pro...Wearable wristband systems leverage deep learning to revolutionize hand gesture recognition in daily activities.Unlike existing approaches that often focus on static gestures and require extensive labeled data,the proposed wearable wristband with selfsupervised contrastive learning excels at dynamic motion tracking and adapts rapidly across multiple scenarios.It features a four-channel sensing array composed of an ionic hydrogel with hierarchical microcone structures and ultrathin flexible electrodes,resulting in high-sensitivity capacitance output.Through wireless transmission from a Wi-Fi module,the proposed algorithm learns latent features from the unlabeled signals of random wrist movements.Remarkably,only few-shot labeled data are sufficient for fine-tuning the model,enabling rapid adaptation to various tasks.The system achieves a high accuracy of 94.9%in different scenarios,including the prediction of eight-direction commands,and air-writing of all numbers and letters.The proposed method facilitates smooth transitions between multiple tasks without the need for modifying the structure or undergoing extensive task-specific training.Its utility has been further extended to enhance human–machine interaction over digital platforms,such as game controls,calculators,and three-language login systems,offering users a natural and intuitive way of communication.展开更多
The rapid integration of Internet of Things(IoT)technologies is reshaping the global energy landscape by deploying smart meters that enable high-resolution consumption monitoring,two-way communication,and advanced met...The rapid integration of Internet of Things(IoT)technologies is reshaping the global energy landscape by deploying smart meters that enable high-resolution consumption monitoring,two-way communication,and advanced metering infrastructure services.However,this digital transformation also exposes power system to evolving threats,ranging from cyber intrusions and electricity theft to device malfunctions,and the unpredictable nature of these anomalies,coupled with the scarcity of labeled fault data,makes realtime detection exceptionally challenging.To address these difficulties,a real-time decision support framework is presented for smart meter anomality detection that leverages rolling time windows and two self-supervised contrastive learning modules.The first module synthesizes diverse negative samples to overcome the lack of labeled anomalies,while the second captures intrinsic temporal patterns for enhanced contextual discrimination.The end-to-end framework continuously updates its model with rolling updated meter data to deliver timely identification of emerging abnormal behaviors in evolving grids.Extensive evaluations on eight publicly available smart meter datasets over seven diverse abnormal patterns testing demonstrate the effectiveness of the proposed full framework,achieving average recall and F1 score of more than 0.85.展开更多
Sarcasm detection is a complex and challenging task,particularly in the context of Chinese social media,where it exhibits strong contextual dependencies and cultural specificity.To address the limitations of existing ...Sarcasm detection is a complex and challenging task,particularly in the context of Chinese social media,where it exhibits strong contextual dependencies and cultural specificity.To address the limitations of existing methods in capturing the implicit semantics and contextual associations in sarcastic expressions,this paper proposes an event-aware model for Chinese sarcasm detection,leveraging a multi-head attention(MHA)mechanism and contrastive learning(CL)strategies.The proposed model employs a dual-path Bidirectional Encoder Representations from Transformers(BERT)encoder to process comment text and event context separately and integrates an MHA mechanism to facilitate deep interactions between the two,thereby capturing multidimensional semantic associations.Additionally,a CL strategy is introduced to enhance feature representation capabilities,further improving the model’s performance in handling class imbalance and complex contextual scenarios.The model achieves state-of-the-art performance on the Chinese sarcasm dataset,with significant improvements in accuracy(79.55%),F1-score(84.22%),and an area under the curve(AUC,84.35%).展开更多
Federated learning(FL)is a distributed machine learning paradigm for edge cloud computing.FL can facilitate data-driven decision-making in tactical scenarios,effectively addressing both data volume and infrastructure ...Federated learning(FL)is a distributed machine learning paradigm for edge cloud computing.FL can facilitate data-driven decision-making in tactical scenarios,effectively addressing both data volume and infrastructure challenges in edge environments.However,the diversity of clients in edge cloud computing presents significant challenges for FL.Personalized federated learning(pFL)received considerable attention in recent years.One example of pFL involves exploiting the global and local information in the local model.Current pFL algorithms experience limitations such as slow convergence speed,catastrophic forgetting,and poor performance in complex tasks,which still have significant shortcomings compared to the centralized learning.To achieve high pFL performance,we propose FedCLCC:Federated Contrastive Learning and Conditional Computing.The core of FedCLCC is the use of contrastive learning and conditional computing.Contrastive learning determines the feature representation similarity to adjust the local model.Conditional computing separates the global and local information and feeds it to their corresponding heads for global and local handling.Our comprehensive experiments demonstrate that FedCLCC outperforms other state-of-the-art FL algorithms.展开更多
Social media has significantly accelerated the rapid dissemination of information,but it also boosts propagation of fake news,posing serious challenges to public awareness and social stability.In real-world contexts,t...Social media has significantly accelerated the rapid dissemination of information,but it also boosts propagation of fake news,posing serious challenges to public awareness and social stability.In real-world contexts,the volume of trustable information far exceeds that of rumors,resulting in a class imbalance that leads models to prioritize the majority class during training.This focus diminishes the model’s ability to recognize minority class samples.Furthermore,models may experience overfitting when encountering these minority samples,further compromising their generalization capabilities.Unlike node-level classification tasks,fake news detection in social networks operates on graph-level samples,where traditional interpolation and oversampling methods struggle to effectively generate high-quality graph-level samples.This challenge complicates the identification of new instances of false information.To address this issue,this paper introduces the FHGraph(Fake News Hunting Graph)framework,which employs a generative data augmentation approach and a latent diffusion model to create graph structures that align with news communication patterns.Using the few-sample learning capabilities of large language models(LLMs),the framework generates diverse texts for minority class nodes.FHGraph comprises a hierarchical multiview graph contrastive learning module,in which two horizontal views and three vertical levels are utilized for self-supervised learning,resulting in more optimized representations.Experimental results show that FHGraph significantly outperforms state-of-the-art(SOTA)graph-level class imbalance methods and SOTA graph-level contrastive learning methods.Specifically,FHGraph has achieved a 2%increase in F1 Micro and a 2.5%increase in F1 Macro in the PHEME dataset,as well as a 3.5%improvement in F1 Micro and a 4.3%improvement in F1 Macro on RumorEval dataset.展开更多
Although conventional object detection methods achieve high accuracy through extensively annotated datasets,acquiring such large-scale labeled data remains challenging and cost-prohibitive in numerous real-world appli...Although conventional object detection methods achieve high accuracy through extensively annotated datasets,acquiring such large-scale labeled data remains challenging and cost-prohibitive in numerous real-world applications.Few-shot object detection presents a new research idea that aims to localize and classify objects in images using only limited annotated examples.However,the inherent challenge in few-shot object detection lies in the insufficient sample diversity to fully characterize the sample feature distribution,which consequently impacts model performance.Inspired by contrastive learning principles,we propose an Implicit Feature Contrastive Learning(IFCL)module to address this limitation and augment feature diversity for more robust representational learning.This module generates augmented support sample features in a mixed feature space and implicitly contrasts them with query Region of Interest(RoI)features.This approach facilitates more comprehensive learning of both intra-class feature similarity and inter-class feature diversity,thereby enhancing the model’s object classification and localization capabilities.Extensive experiments on PASCAL VOC show that our method achieves a respective improvement of 3.2%,1.8%,and 2.3%on 10-shot of three Novel Sets compared to the baseline model FPD.展开更多
Identifying influential users in social networks is of great significance in areas such as public opinion monitoring and commercial promotion.Existing identification methods based on Graph Neural Networks(GNNs)often l...Identifying influential users in social networks is of great significance in areas such as public opinion monitoring and commercial promotion.Existing identification methods based on Graph Neural Networks(GNNs)often lead to yield inaccurate features of influential users due to neighborhood aggregation,and require a large substantial amount of labeled data for training,making them difficult and challenging to apply in practice.To address this issue,we propose a semi-supervised contrastive learning method for identifying influential users.First,the proposed method constructs positive and negative samples for contrastive learning based on multiple node centrality metrics related to influence;then,contrastive learning is employed to guide the encoder to generate various influence-related features for users;finally,with only a small amount of labeled data,an attention-based user classifier is trained to accurately identify influential users.Experiments conducted on three public social network datasets demonstrate that the proposed method,using only 20%of the labeled data as the training set,achieves F1 values that are 5.9%,5.8%,and 8.7%higher than those unsupervised EVC method,and it matches the performance of GNN-based methods such as DeepInf,InfGCN and OlapGN,which require 80%of labeled data as the training set.展开更多
Fisheye cameras offer a significantly larger field of view compared to conventional cameras,making them valuable tools in the field of computer vision.However,their unique optical characteristics often lead to image d...Fisheye cameras offer a significantly larger field of view compared to conventional cameras,making them valuable tools in the field of computer vision.However,their unique optical characteristics often lead to image distortions,which pose challenges for object detection tasks.To address this issue,we propose Yolo-CaSKA(Yolo with Contrastive Learning and Selective Kernel Attention),a novel training method that enhances object detection on fisheye camera images.The standard image and the corresponding distorted fisheye image pairs are used as positive samples,and the rest of the image pairs are used as negative samples,which are guided by contrastive learning to help the distorted images find the feature vectors of the corresponding normal images,to improve the detection accuracy.Additionally,we incorporate the Selective Kernel(SK)attention module to focus on regions prone to false detections,such as image edges and blind spots.Finally,the mAP_(50) on the augmented KITTI dataset is improved by 5.5% over the original Yolov8,while the mAP_(50) on the WoodScape dataset is improved by 2.6% compared to OmniDet.The results demonstrate the performance of our proposed model for object detection on fisheye images.展开更多
Graph similarity learning aims to calculate the similarity between pairs of graphs.Existing unsupervised graph similarity learning methods based on contrastive learning encounter challenges related to random graph aug...Graph similarity learning aims to calculate the similarity between pairs of graphs.Existing unsupervised graph similarity learning methods based on contrastive learning encounter challenges related to random graph augmentation strategies,which can harm the semantic and structural information of graphs and overlook the rich structural information present in subgraphs.To address these issues,we propose a graph similarity learning model based on learnable augmentation and multi-level contrastive learning.First,to tackle the problem of random augmentation disrupting the semantics and structure of the graph,we design a learnable augmentation method to selectively choose nodes and edges within the graph.To enhance contrastive levels,we employ a biased random walk method to generate corresponding subgraphs,enriching the contrastive hierarchy.Second,to solve the issue of previous work not considering multi-level contrastive learning,we utilize graph convolutional networks to learn node representations of augmented views and the original graph and calculate the interaction information between the attribute-augmented and structure-augmented views and the original graph.The goal is to maximize node consistency between different views and learn node matching between different graphs,resulting in node-level representations for each graph.Subgraph representations are then obtained through pooling operations,and we conduct contrastive learning utilizing both node and subgraph representations.Finally,the graph similarity score is computed according to different downstream tasks.We conducted three sets of experiments across eight datasets,and the results demonstrate that the proposed model effectively mitigates the issues of random augmentation damaging the original graph’s semantics and structure,as well as the insufficiency of contrastive levels.Additionally,the model achieves the best overall performance.展开更多
The unsupervised vehicle re-identification task aims at identifying specific vehicles in surveillance videos without utilizing annotation information.Due to the higher similarity in appearance between vehicles compare...The unsupervised vehicle re-identification task aims at identifying specific vehicles in surveillance videos without utilizing annotation information.Due to the higher similarity in appearance between vehicles compared to pedestrians,pseudo-labels generated through clustering are ineffective in mitigating the impact of noise,and the feature distance between inter-class and intra-class has not been adequately improved.To address the aforementioned issues,we design a dual contrastive learning method based on knowledge distillation.During each iteration,we utilize a teacher model to randomly partition the entire dataset into two sub-domains based on clustering pseudo-label categories.By conducting contrastive learning between the two student models,we extract more discernible vehicle identity cues to improve the problem of imbalanced data distribution.Subsequently,we propose a context-aware pseudo label refinement strategy that leverages contextual features by progressively associating granularity information from different bottleneck blocks.To produce more trustworthy pseudo-labels and lessen noise interference during the clustering process,the context-aware scores are obtained by calculating the similarity between global features and contextual ones,which are subsequently added to the pseudo-label encoding process.The proposed method has achieved excellent performance in overcoming label noise and optimizing data distribution through extensive experimental results on publicly available datasets.展开更多
Automated sleep stages classification facilitates clinical experts in conducting treatment for sleep disorders,as it is more time-efficient concerning the analysis of whole-night polysomnography(PSG).However,most of t...Automated sleep stages classification facilitates clinical experts in conducting treatment for sleep disorders,as it is more time-efficient concerning the analysis of whole-night polysomnography(PSG).However,most of the existing research only focused on public databases with channel systems incompatible with the current clinical measurements.To narrow the gap between theoretical models and real clinical practice,we propose a novel deep learning model,by combining the vision transformer with supervised contrastive learning,realizing the efficient sleep stages classification.Experimental results show that the model facilitates an easier classification of multi-channel PSG signals.The mean F1-scores of 79.2%and 76.5%on two public databases outperform the previous studies,showing the model’s great capability,and the performance of the proposed method on the children’s small database also presents a high mean accuracy of 88.6%.Our proposed model is validated not only on the public databases but the provided clinical database to strictly evaluate its clinical usage in practice.展开更多
Few-shot point cloud 3D object detection(FS3D)aims to identify and locate objects of novel classes within point clouds using knowledge acquired from annotated base classes and a minimal number of samples from the nove...Few-shot point cloud 3D object detection(FS3D)aims to identify and locate objects of novel classes within point clouds using knowledge acquired from annotated base classes and a minimal number of samples from the novel classes.Due to imbalanced training data,existing FS3D methods based on fully supervised learning can lead to overfitting toward base classes,which impairs the network’s ability to generalize knowledge learned from base classes to novel classes and also prevents the network from extracting distinctive foreground and background representations for novel class objects.To address these issues,this thesis proposes a category-agnostic contrastive learning approach,enhancing the generalization and identification abilities for almost unseen categories through the construction of pseudo-labels and positive-negative sample pairs unrelated to specific classes.Firstly,this thesis designs a proposal-wise context contrastive module(CCM).By reducing the distance between foreground point features and increasing the distance between foreground and background point features within a region proposal,CCM aids the network in extracting more discriminative foreground and background feature representations without reliance on categorical annotations.Secondly,this thesis utilizes a geometric contrastive module(GCM),which enhances the network’s geometric perception capability by employing contrastive learning on the foreground point features associated with various basic geometric components,such as edges,corners,and surfaces,thereby enabling these geometric components to exhibit more distinguishable representations.This thesis also combines category-aware contrastive learning with former modules to maintain categorical distinctiveness.Extensive experimental results on FS-SUNRGBD and FS-ScanNet datasets demonstrate the effectiveness of this method with average precision exceeding the baseline by up to 8%.展开更多
Unsupervised learning methods such as graph contrastive learning have been used for dynamic graph represen-tation learning to eliminate the dependence of labels.However,existing studies neglect positional information ...Unsupervised learning methods such as graph contrastive learning have been used for dynamic graph represen-tation learning to eliminate the dependence of labels.However,existing studies neglect positional information when learning discrete snapshots,resulting in insufficient network topology learning.At the same time,due to the lack of appropriate data augmentation methods,it is difficult to capture the evolving patterns of the network effectively.To address the above problems,a position-aware and subgraph enhanced dynamic graph contrastive learning method is proposed for discrete-time dynamic graphs.Firstly,the global snapshot is built based on the historical snapshots to express the stable pattern of the dynamic graph,and the random walk is used to obtain the position representation by learning the positional information of the nodes.Secondly,a new data augmentation method is carried out from the perspectives of short-term changes and long-term stable structures of dynamic graphs.Specifically,subgraph sampling based on snapshots and global snapshots is used to obtain two structural augmentation views,and node structures and evolving patterns are learned by combining graph neural network,gated recurrent unit,and attention mechanism.Finally,the quality of node representation is improved by combining the contrastive learning between different structural augmentation views and between the two representations of structure and position.Experimental results on four real datasets show that the performance of the proposed method is better than the existing unsupervised methods,and it is more competitive than the supervised learning method under a semi-supervised setting.展开更多
Previous deep learning-based super-resolution(SR)methods rely on the assumption that the degradation process is predefined(e.g.,bicubic downsampling).Thus,their performance would suffer from deterioration if the real ...Previous deep learning-based super-resolution(SR)methods rely on the assumption that the degradation process is predefined(e.g.,bicubic downsampling).Thus,their performance would suffer from deterioration if the real degradation is not consistent with the assumption.To deal with real-world scenarios,existing blind SR methods are committed to estimating both the degradation and the super-resolved image with an extra loss or iterative scheme.However,degradation estimation that requires more computation would result in limited SR performance due to the accumulated estimation errors.In this paper,we propose a contrastive regularization built upon contrastive learning to exploit both the information of blurry images and clear images as negative and positive samples,respectively.Contrastive regularization ensures that the restored image is pulled closer to the clear image and pushed far away from the blurry image in the representation space.Furthermore,instead of estimating the degradation,we extract global statistical prior information to capture the character of the distortion.Considering the coupling between the degradation and the low-resolution image,we embed the global prior into the distortion-specific SR network to make our method adaptive to the changes of distortions.We term our distortion-specific network with contrastive regularization as CRDNet.The extensive experiments on synthetic and realworld scenes demonstrate that our lightweight CRDNet surpasses state-of-the-art blind super-resolution approaches.展开更多
In order to improve the recognition accuracy of similar weather scenarios(SWSs)in terminal area,a recognition model for SWS based on contrastive learning(SWS-CL)is proposed.Firstly,a data augmentation method is design...In order to improve the recognition accuracy of similar weather scenarios(SWSs)in terminal area,a recognition model for SWS based on contrastive learning(SWS-CL)is proposed.Firstly,a data augmentation method is designed to improve the number and quality of weather scenarios samples according to the characteristics of convective weather images.Secondly,in the pre-trained recognition model of SWS-CL,a loss function is formulated to minimize the distance between the anchor and positive samples,and maximize the distance between the anchor and the negative samples in the latent space.Finally,the pre-trained SWS-CL model is fine-tuned with labeled samples to improve the recognition accuracy of SWS.The comparative experiments on the weather images of Guangzhou terminal area show that the proposed data augmentation method can effectively improve the quality of weather image dataset,and the proposed SWS-CL model can achieve satisfactory recognition accuracy.It is also verified that the fine-tuned SWS-CL model has obvious advantages in datasets with sparse labels.展开更多
This paper presents an end-to-end deep learning method to solve geometry problems via feature learning and contrastive learning of multimodal data.A key challenge in solving geometry problems using deep learning is to...This paper presents an end-to-end deep learning method to solve geometry problems via feature learning and contrastive learning of multimodal data.A key challenge in solving geometry problems using deep learning is to automatically adapt to the task of understanding single-modal and multimodal problems.Existing methods either focus on single-modal ormultimodal problems,and they cannot fit each other.A general geometry problem solver shouldobviouslybe able toprocess variousmodalproblems at the same time.Inthispaper,a shared feature-learning model of multimodal data is adopted to learn the unified feature representation of text and image,which can solve the heterogeneity issue between multimodal geometry problems.A contrastive learning model of multimodal data enhances the semantic relevance betweenmultimodal features and maps them into a unified semantic space,which can effectively adapt to both single-modal and multimodal downstream tasks.Based on the feature extraction and fusion of multimodal data,a proposed geometry problem solver uses relation extraction,theorem reasoning,and problem solving to present solutions in a readable way.Experimental results show the effectiveness of the method.展开更多
Some reconstruction-based anomaly detection models in multivariate time series have brought impressive performance advancements but suffer from weak generalization ability and a lack of anomaly identification.These li...Some reconstruction-based anomaly detection models in multivariate time series have brought impressive performance advancements but suffer from weak generalization ability and a lack of anomaly identification.These limitations can result in the misjudgment of models,leading to a degradation in overall detection performance.This paper proposes a novel transformer-like anomaly detection model adopting a contrastive learning module and a memory block(CLME)to overcome the above limitations.The contrastive learning module tailored for time series data can learn the contextual relationships to generate temporal fine-grained representations.The memory block can record normal patterns of these representations through the utilization of attention-based addressing and reintegration mechanisms.These two modules together effectively alleviate the problem of generalization.Furthermore,this paper introduces a fusion anomaly detection strategy that comprehensively takes into account the residual and feature spaces.Such a strategy can enlarge the discrepancies between normal and abnormal data,which is more conducive to anomaly identification.The proposed CLME model not only efficiently enhances the generalization performance but also improves the ability of anomaly detection.To validate the efficacy of the proposed approach,extensive experiments are conducted on well-established benchmark datasets,including SWaT,PSM,WADI,and MSL.The results demonstrate outstanding performance,with F1 scores of 90.58%,94.83%,91.58%,and 91.75%,respectively.These findings affirm the superiority of the CLME model over existing stateof-the-art anomaly detection methodologies in terms of its ability to detect anomalies within complex datasets accurately.展开更多
Smart manufacturing suffers from the heterogeneity of local data distribution across parties,mutual information silos and lack of privacy protection in the process of industry chain collaboration.To address these prob...Smart manufacturing suffers from the heterogeneity of local data distribution across parties,mutual information silos and lack of privacy protection in the process of industry chain collaboration.To address these problems,we propose a federated domain adaptation algorithm based on knowledge distillation and contrastive learning.Knowledge distillation is used to extract transferable integration knowledge from the different source domains and the quality of the extracted integration knowledge is used to assign reasonable weights to each source domain.A more rational weighted average aggregation is used in the aggregation phase of the center server to optimize the global model,while the local model of the source domain is trained with the help of contrastive learning to constrain the local model optimum towards the global model optimum,mitigating the inherent heterogeneity between local data.Our experiments are conducted on the largest domain adaptation dataset,and the results show that compared with other traditional federated domain adaptation algorithms,the algorithm we proposed trains a more accurate model,requires fewer communication rounds,makes more effective use of imbalanced data in the industrial area,and protects data privacy.展开更多
Multimodal sentiment analysis is an essential area of research in artificial intelligence that combines multiple modes,such as text and image,to accurately assess sentiment.However,conventional approaches that rely on...Multimodal sentiment analysis is an essential area of research in artificial intelligence that combines multiple modes,such as text and image,to accurately assess sentiment.However,conventional approaches that rely on unimodal pre-trained models for feature extraction from each modality often overlook the intrinsic connections of semantic information between modalities.This limitation is attributed to their training on unimodal data,and necessitates the use of complex fusion mechanisms for sentiment analysis.In this study,we present a novel approach that combines a vision-language pre-trained model with a proposed multimodal contrastive learning method.Our approach harnesses the power of transfer learning by utilizing a vision-language pre-trained model to extract both visual and textual representations in a unified framework.We employ a Transformer architecture to integrate these representations,thereby enabling the capture of rich semantic infor-mation in image-text pairs.To further enhance the representation learning of these pairs,we introduce our proposed multimodal contrastive learning method,which leads to improved performance in sentiment analysis tasks.Our approach is evaluated through extensive experiments on two publicly accessible datasets,where we demonstrate its effectiveness.We achieve a significant improvement in sentiment analysis accuracy,indicating the supe-riority of our approach over existing techniques.These results highlight the potential of multimodal sentiment analysis and underscore the importance of considering the intrinsic semantic connections between modalities for accurate sentiment assessment.展开更多
Bundle recommendation aims to provide users with convenient one-stop solutions by recommending bundles of related items that cater to their diverse needs. However, previous research has neglected the interaction betwe...Bundle recommendation aims to provide users with convenient one-stop solutions by recommending bundles of related items that cater to their diverse needs. However, previous research has neglected the interaction between bundle and item views and relied on simplistic methods for predicting user-bundle relationships. To address this limitation, we propose Hybrid Contrastive Learning for Bundle Recommendation (HCLBR). Our approach integrates unsupervised and supervised contrastive learning to enrich user and bundle representations, promoting diversity. By leveraging interconnected views of user-item and user-bundle nodes, HCLBR enhances representation learning for robust recommendations. Evaluation on four public datasets demonstrates the superior performance of HCLBR over state-of-the-art baselines. Our findings highlight the significance of leveraging contrastive learning and interconnected views in bundle recommendation, providing valuable insights for marketing strategies and recommendation system design.展开更多
基金supported by the Research Grant Fund from Kwangwoon University in 2023,the National Natural Science Foundation of China under Grant(62311540155)the Taishan Scholars Project Special Funds(tsqn202312035)the open research foundation of State Key Laboratory of Integrated Chips and Systems.
文摘Wearable wristband systems leverage deep learning to revolutionize hand gesture recognition in daily activities.Unlike existing approaches that often focus on static gestures and require extensive labeled data,the proposed wearable wristband with selfsupervised contrastive learning excels at dynamic motion tracking and adapts rapidly across multiple scenarios.It features a four-channel sensing array composed of an ionic hydrogel with hierarchical microcone structures and ultrathin flexible electrodes,resulting in high-sensitivity capacitance output.Through wireless transmission from a Wi-Fi module,the proposed algorithm learns latent features from the unlabeled signals of random wrist movements.Remarkably,only few-shot labeled data are sufficient for fine-tuning the model,enabling rapid adaptation to various tasks.The system achieves a high accuracy of 94.9%in different scenarios,including the prediction of eight-direction commands,and air-writing of all numbers and letters.The proposed method facilitates smooth transitions between multiple tasks without the need for modifying the structure or undergoing extensive task-specific training.Its utility has been further extended to enhance human–machine interaction over digital platforms,such as game controls,calculators,and three-language login systems,offering users a natural and intuitive way of communication.
文摘The rapid integration of Internet of Things(IoT)technologies is reshaping the global energy landscape by deploying smart meters that enable high-resolution consumption monitoring,two-way communication,and advanced metering infrastructure services.However,this digital transformation also exposes power system to evolving threats,ranging from cyber intrusions and electricity theft to device malfunctions,and the unpredictable nature of these anomalies,coupled with the scarcity of labeled fault data,makes realtime detection exceptionally challenging.To address these difficulties,a real-time decision support framework is presented for smart meter anomality detection that leverages rolling time windows and two self-supervised contrastive learning modules.The first module synthesizes diverse negative samples to overcome the lack of labeled anomalies,while the second captures intrinsic temporal patterns for enhanced contextual discrimination.The end-to-end framework continuously updates its model with rolling updated meter data to deliver timely identification of emerging abnormal behaviors in evolving grids.Extensive evaluations on eight publicly available smart meter datasets over seven diverse abnormal patterns testing demonstrate the effectiveness of the proposed full framework,achieving average recall and F1 score of more than 0.85.
基金granted by Qin Xin Talents Cultivation Program(No.QXTCP C202115),Beijing Information Science&Technology Universitythe Beijing Advanced Innovation Center for Future Blockchain and Privacy Computing Fund(No.GJJ-23),National Social Science Foundation,China(No.21BTQ079).
文摘Sarcasm detection is a complex and challenging task,particularly in the context of Chinese social media,where it exhibits strong contextual dependencies and cultural specificity.To address the limitations of existing methods in capturing the implicit semantics and contextual associations in sarcastic expressions,this paper proposes an event-aware model for Chinese sarcasm detection,leveraging a multi-head attention(MHA)mechanism and contrastive learning(CL)strategies.The proposed model employs a dual-path Bidirectional Encoder Representations from Transformers(BERT)encoder to process comment text and event context separately and integrates an MHA mechanism to facilitate deep interactions between the two,thereby capturing multidimensional semantic associations.Additionally,a CL strategy is introduced to enhance feature representation capabilities,further improving the model’s performance in handling class imbalance and complex contextual scenarios.The model achieves state-of-the-art performance on the Chinese sarcasm dataset,with significant improvements in accuracy(79.55%),F1-score(84.22%),and an area under the curve(AUC,84.35%).
基金supported by the Natural Science Foundation of Xinjiang Uygur Autonomous Region(Grant No.2022D01B 187)。
文摘Federated learning(FL)is a distributed machine learning paradigm for edge cloud computing.FL can facilitate data-driven decision-making in tactical scenarios,effectively addressing both data volume and infrastructure challenges in edge environments.However,the diversity of clients in edge cloud computing presents significant challenges for FL.Personalized federated learning(pFL)received considerable attention in recent years.One example of pFL involves exploiting the global and local information in the local model.Current pFL algorithms experience limitations such as slow convergence speed,catastrophic forgetting,and poor performance in complex tasks,which still have significant shortcomings compared to the centralized learning.To achieve high pFL performance,we propose FedCLCC:Federated Contrastive Learning and Conditional Computing.The core of FedCLCC is the use of contrastive learning and conditional computing.Contrastive learning determines the feature representation similarity to adjust the local model.Conditional computing separates the global and local information and feeds it to their corresponding heads for global and local handling.Our comprehensive experiments demonstrate that FedCLCC outperforms other state-of-the-art FL algorithms.
基金supported by the National Key R&D Program of China(Grant No.2022YFB3104601)the Big Data Computing Center of Southeast University.
文摘Social media has significantly accelerated the rapid dissemination of information,but it also boosts propagation of fake news,posing serious challenges to public awareness and social stability.In real-world contexts,the volume of trustable information far exceeds that of rumors,resulting in a class imbalance that leads models to prioritize the majority class during training.This focus diminishes the model’s ability to recognize minority class samples.Furthermore,models may experience overfitting when encountering these minority samples,further compromising their generalization capabilities.Unlike node-level classification tasks,fake news detection in social networks operates on graph-level samples,where traditional interpolation and oversampling methods struggle to effectively generate high-quality graph-level samples.This challenge complicates the identification of new instances of false information.To address this issue,this paper introduces the FHGraph(Fake News Hunting Graph)framework,which employs a generative data augmentation approach and a latent diffusion model to create graph structures that align with news communication patterns.Using the few-sample learning capabilities of large language models(LLMs),the framework generates diverse texts for minority class nodes.FHGraph comprises a hierarchical multiview graph contrastive learning module,in which two horizontal views and three vertical levels are utilized for self-supervised learning,resulting in more optimized representations.Experimental results show that FHGraph significantly outperforms state-of-the-art(SOTA)graph-level class imbalance methods and SOTA graph-level contrastive learning methods.Specifically,FHGraph has achieved a 2%increase in F1 Micro and a 2.5%increase in F1 Macro in the PHEME dataset,as well as a 3.5%improvement in F1 Micro and a 4.3%improvement in F1 Macro on RumorEval dataset.
基金funded by the China Chongqing Municipal Science and Technology Bureau,grant numbers CSTB2024TIAD-CYKJCXX0009,CSTB2024NSCQ-LZX0043,CSTB2022NSCQ-MSX0288Chongqing Municipal Commission of Housing and Urban-Rural Development,grant number CKZ2024-87+3 种基金the Chongqing University of Technology Graduate Education High-Quality Development Project,grant number gzlsz202401the Chongqing University of Technology—Chongqing LINGLUE Technology Co.,Ltd.Electronic Information(Artificial Intelligence)Graduate Joint Training Basethe Postgraduate Education and Teaching Reform Research Project in Chongqing,grant number yjg213116the Chongqing University of Technology-CISDI Chongqing Information Technology Co.,Ltd.Computer Technology Graduate Joint Training Base.
文摘Although conventional object detection methods achieve high accuracy through extensively annotated datasets,acquiring such large-scale labeled data remains challenging and cost-prohibitive in numerous real-world applications.Few-shot object detection presents a new research idea that aims to localize and classify objects in images using only limited annotated examples.However,the inherent challenge in few-shot object detection lies in the insufficient sample diversity to fully characterize the sample feature distribution,which consequently impacts model performance.Inspired by contrastive learning principles,we propose an Implicit Feature Contrastive Learning(IFCL)module to address this limitation and augment feature diversity for more robust representational learning.This module generates augmented support sample features in a mixed feature space and implicitly contrasts them with query Region of Interest(RoI)features.This approach facilitates more comprehensive learning of both intra-class feature similarity and inter-class feature diversity,thereby enhancing the model’s object classification and localization capabilities.Extensive experiments on PASCAL VOC show that our method achieves a respective improvement of 3.2%,1.8%,and 2.3%on 10-shot of three Novel Sets compared to the baseline model FPD.
基金supported by the National Key Project of the National Natural Science Foundation of China under Grant No.U23A20305.
文摘Identifying influential users in social networks is of great significance in areas such as public opinion monitoring and commercial promotion.Existing identification methods based on Graph Neural Networks(GNNs)often lead to yield inaccurate features of influential users due to neighborhood aggregation,and require a large substantial amount of labeled data for training,making them difficult and challenging to apply in practice.To address this issue,we propose a semi-supervised contrastive learning method for identifying influential users.First,the proposed method constructs positive and negative samples for contrastive learning based on multiple node centrality metrics related to influence;then,contrastive learning is employed to guide the encoder to generate various influence-related features for users;finally,with only a small amount of labeled data,an attention-based user classifier is trained to accurately identify influential users.Experiments conducted on three public social network datasets demonstrate that the proposed method,using only 20%of the labeled data as the training set,achieves F1 values that are 5.9%,5.8%,and 8.7%higher than those unsupervised EVC method,and it matches the performance of GNN-based methods such as DeepInf,InfGCN and OlapGN,which require 80%of labeled data as the training set.
文摘Fisheye cameras offer a significantly larger field of view compared to conventional cameras,making them valuable tools in the field of computer vision.However,their unique optical characteristics often lead to image distortions,which pose challenges for object detection tasks.To address this issue,we propose Yolo-CaSKA(Yolo with Contrastive Learning and Selective Kernel Attention),a novel training method that enhances object detection on fisheye camera images.The standard image and the corresponding distorted fisheye image pairs are used as positive samples,and the rest of the image pairs are used as negative samples,which are guided by contrastive learning to help the distorted images find the feature vectors of the corresponding normal images,to improve the detection accuracy.Additionally,we incorporate the Selective Kernel(SK)attention module to focus on regions prone to false detections,such as image edges and blind spots.Finally,the mAP_(50) on the augmented KITTI dataset is improved by 5.5% over the original Yolov8,while the mAP_(50) on the WoodScape dataset is improved by 2.6% compared to OmniDet.The results demonstrate the performance of our proposed model for object detection on fisheye images.
文摘Graph similarity learning aims to calculate the similarity between pairs of graphs.Existing unsupervised graph similarity learning methods based on contrastive learning encounter challenges related to random graph augmentation strategies,which can harm the semantic and structural information of graphs and overlook the rich structural information present in subgraphs.To address these issues,we propose a graph similarity learning model based on learnable augmentation and multi-level contrastive learning.First,to tackle the problem of random augmentation disrupting the semantics and structure of the graph,we design a learnable augmentation method to selectively choose nodes and edges within the graph.To enhance contrastive levels,we employ a biased random walk method to generate corresponding subgraphs,enriching the contrastive hierarchy.Second,to solve the issue of previous work not considering multi-level contrastive learning,we utilize graph convolutional networks to learn node representations of augmented views and the original graph and calculate the interaction information between the attribute-augmented and structure-augmented views and the original graph.The goal is to maximize node consistency between different views and learn node matching between different graphs,resulting in node-level representations for each graph.Subgraph representations are then obtained through pooling operations,and we conduct contrastive learning utilizing both node and subgraph representations.Finally,the graph similarity score is computed according to different downstream tasks.We conducted three sets of experiments across eight datasets,and the results demonstrate that the proposed model effectively mitigates the issues of random augmentation damaging the original graph’s semantics and structure,as well as the insufficiency of contrastive levels.Additionally,the model achieves the best overall performance.
基金supported by the National Natural Science Foundation of China under Grant Nos.62461037,62076117 and 62166026the Jiangxi Provincial Natural Science Foundation under Grant Nos.20224BAB212011,20232BAB202051,20232BAB212008 and 20242BAB25078the Jiangxi Provincial Key Laboratory of Virtual Reality under Grant No.2024SSY03151.
文摘The unsupervised vehicle re-identification task aims at identifying specific vehicles in surveillance videos without utilizing annotation information.Due to the higher similarity in appearance between vehicles compared to pedestrians,pseudo-labels generated through clustering are ineffective in mitigating the impact of noise,and the feature distance between inter-class and intra-class has not been adequately improved.To address the aforementioned issues,we design a dual contrastive learning method based on knowledge distillation.During each iteration,we utilize a teacher model to randomly partition the entire dataset into two sub-domains based on clustering pseudo-label categories.By conducting contrastive learning between the two student models,we extract more discernible vehicle identity cues to improve the problem of imbalanced data distribution.Subsequently,we propose a context-aware pseudo label refinement strategy that leverages contextual features by progressively associating granularity information from different bottleneck blocks.To produce more trustworthy pseudo-labels and lessen noise interference during the clustering process,the context-aware scores are obtained by calculating the similarity between global features and contextual ones,which are subsequently added to the pseudo-label encoding process.The proposed method has achieved excellent performance in overcoming label noise and optimizing data distribution through extensive experimental results on publicly available datasets.
基金the National Natural Science Foundation of China(No.52375254)the Interdisciplinary Program of Shanghai Jiao Tong University(No.21X010301670)the Open Project Program of SJTU-Pinghu Institute of Intelligent Optoelectronics(No.2022SPIOE104)。
文摘Automated sleep stages classification facilitates clinical experts in conducting treatment for sleep disorders,as it is more time-efficient concerning the analysis of whole-night polysomnography(PSG).However,most of the existing research only focused on public databases with channel systems incompatible with the current clinical measurements.To narrow the gap between theoretical models and real clinical practice,we propose a novel deep learning model,by combining the vision transformer with supervised contrastive learning,realizing the efficient sleep stages classification.Experimental results show that the model facilitates an easier classification of multi-channel PSG signals.The mean F1-scores of 79.2%and 76.5%on two public databases outperform the previous studies,showing the model’s great capability,and the performance of the proposed method on the children’s small database also presents a high mean accuracy of 88.6%.Our proposed model is validated not only on the public databases but the provided clinical database to strictly evaluate its clinical usage in practice.
文摘Few-shot point cloud 3D object detection(FS3D)aims to identify and locate objects of novel classes within point clouds using knowledge acquired from annotated base classes and a minimal number of samples from the novel classes.Due to imbalanced training data,existing FS3D methods based on fully supervised learning can lead to overfitting toward base classes,which impairs the network’s ability to generalize knowledge learned from base classes to novel classes and also prevents the network from extracting distinctive foreground and background representations for novel class objects.To address these issues,this thesis proposes a category-agnostic contrastive learning approach,enhancing the generalization and identification abilities for almost unseen categories through the construction of pseudo-labels and positive-negative sample pairs unrelated to specific classes.Firstly,this thesis designs a proposal-wise context contrastive module(CCM).By reducing the distance between foreground point features and increasing the distance between foreground and background point features within a region proposal,CCM aids the network in extracting more discriminative foreground and background feature representations without reliance on categorical annotations.Secondly,this thesis utilizes a geometric contrastive module(GCM),which enhances the network’s geometric perception capability by employing contrastive learning on the foreground point features associated with various basic geometric components,such as edges,corners,and surfaces,thereby enabling these geometric components to exhibit more distinguishable representations.This thesis also combines category-aware contrastive learning with former modules to maintain categorical distinctiveness.Extensive experimental results on FS-SUNRGBD and FS-ScanNet datasets demonstrate the effectiveness of this method with average precision exceeding the baseline by up to 8%.
文摘Unsupervised learning methods such as graph contrastive learning have been used for dynamic graph represen-tation learning to eliminate the dependence of labels.However,existing studies neglect positional information when learning discrete snapshots,resulting in insufficient network topology learning.At the same time,due to the lack of appropriate data augmentation methods,it is difficult to capture the evolving patterns of the network effectively.To address the above problems,a position-aware and subgraph enhanced dynamic graph contrastive learning method is proposed for discrete-time dynamic graphs.Firstly,the global snapshot is built based on the historical snapshots to express the stable pattern of the dynamic graph,and the random walk is used to obtain the position representation by learning the positional information of the nodes.Secondly,a new data augmentation method is carried out from the perspectives of short-term changes and long-term stable structures of dynamic graphs.Specifically,subgraph sampling based on snapshots and global snapshots is used to obtain two structural augmentation views,and node structures and evolving patterns are learned by combining graph neural network,gated recurrent unit,and attention mechanism.Finally,the quality of node representation is improved by combining the contrastive learning between different structural augmentation views and between the two representations of structure and position.Experimental results on four real datasets show that the performance of the proposed method is better than the existing unsupervised methods,and it is more competitive than the supervised learning method under a semi-supervised setting.
基金supported by the National Natural Science Foundation of China(61971165)the Key Research and Development Program of Hubei Province(2020BAB113)。
文摘Previous deep learning-based super-resolution(SR)methods rely on the assumption that the degradation process is predefined(e.g.,bicubic downsampling).Thus,their performance would suffer from deterioration if the real degradation is not consistent with the assumption.To deal with real-world scenarios,existing blind SR methods are committed to estimating both the degradation and the super-resolved image with an extra loss or iterative scheme.However,degradation estimation that requires more computation would result in limited SR performance due to the accumulated estimation errors.In this paper,we propose a contrastive regularization built upon contrastive learning to exploit both the information of blurry images and clear images as negative and positive samples,respectively.Contrastive regularization ensures that the restored image is pulled closer to the clear image and pushed far away from the blurry image in the representation space.Furthermore,instead of estimating the degradation,we extract global statistical prior information to capture the character of the distortion.Considering the coupling between the degradation and the low-resolution image,we embed the global prior into the distortion-specific SR network to make our method adaptive to the changes of distortions.We term our distortion-specific network with contrastive regularization as CRDNet.The extensive experiments on synthetic and realworld scenes demonstrate that our lightweight CRDNet surpasses state-of-the-art blind super-resolution approaches.
基金supported by the Fundamental Research Funds for the Central Universities(NOS.NS2019054,NS2020045)。
文摘In order to improve the recognition accuracy of similar weather scenarios(SWSs)in terminal area,a recognition model for SWS based on contrastive learning(SWS-CL)is proposed.Firstly,a data augmentation method is designed to improve the number and quality of weather scenarios samples according to the characteristics of convective weather images.Secondly,in the pre-trained recognition model of SWS-CL,a loss function is formulated to minimize the distance between the anchor and positive samples,and maximize the distance between the anchor and the negative samples in the latent space.Finally,the pre-trained SWS-CL model is fine-tuned with labeled samples to improve the recognition accuracy of SWS.The comparative experiments on the weather images of Guangzhou terminal area show that the proposed data augmentation method can effectively improve the quality of weather image dataset,and the proposed SWS-CL model can achieve satisfactory recognition accuracy.It is also verified that the fine-tuned SWS-CL model has obvious advantages in datasets with sparse labels.
基金supported by the NationalNatural Science Foundation of China (No.62107014,Jian P.,62177025,He B.)the Key R&D and Promotion Projects of Henan Province (No.212102210147,Jian P.)Innovative Education Program for Graduate Students at North China University of Water Resources and Electric Power,China (No.YK-2021-99,Guo F.).
文摘This paper presents an end-to-end deep learning method to solve geometry problems via feature learning and contrastive learning of multimodal data.A key challenge in solving geometry problems using deep learning is to automatically adapt to the task of understanding single-modal and multimodal problems.Existing methods either focus on single-modal ormultimodal problems,and they cannot fit each other.A general geometry problem solver shouldobviouslybe able toprocess variousmodalproblems at the same time.Inthispaper,a shared feature-learning model of multimodal data is adopted to learn the unified feature representation of text and image,which can solve the heterogeneity issue between multimodal geometry problems.A contrastive learning model of multimodal data enhances the semantic relevance betweenmultimodal features and maps them into a unified semantic space,which can effectively adapt to both single-modal and multimodal downstream tasks.Based on the feature extraction and fusion of multimodal data,a proposed geometry problem solver uses relation extraction,theorem reasoning,and problem solving to present solutions in a readable way.Experimental results show the effectiveness of the method.
基金support from the Major National Science and Technology Special Projects(2016ZX02301003-004-007)the Natural Science Foundation of Hebei Province(F2020202067)。
文摘Some reconstruction-based anomaly detection models in multivariate time series have brought impressive performance advancements but suffer from weak generalization ability and a lack of anomaly identification.These limitations can result in the misjudgment of models,leading to a degradation in overall detection performance.This paper proposes a novel transformer-like anomaly detection model adopting a contrastive learning module and a memory block(CLME)to overcome the above limitations.The contrastive learning module tailored for time series data can learn the contextual relationships to generate temporal fine-grained representations.The memory block can record normal patterns of these representations through the utilization of attention-based addressing and reintegration mechanisms.These two modules together effectively alleviate the problem of generalization.Furthermore,this paper introduces a fusion anomaly detection strategy that comprehensively takes into account the residual and feature spaces.Such a strategy can enlarge the discrepancies between normal and abnormal data,which is more conducive to anomaly identification.The proposed CLME model not only efficiently enhances the generalization performance but also improves the ability of anomaly detection.To validate the efficacy of the proposed approach,extensive experiments are conducted on well-established benchmark datasets,including SWaT,PSM,WADI,and MSL.The results demonstrate outstanding performance,with F1 scores of 90.58%,94.83%,91.58%,and 91.75%,respectively.These findings affirm the superiority of the CLME model over existing stateof-the-art anomaly detection methodologies in terms of its ability to detect anomalies within complex datasets accurately.
基金Supported by the Scientific and Technological Innovation 2030—Major Project of"New Generation Artificial Intelligence"(2020AAA0109300)。
文摘Smart manufacturing suffers from the heterogeneity of local data distribution across parties,mutual information silos and lack of privacy protection in the process of industry chain collaboration.To address these problems,we propose a federated domain adaptation algorithm based on knowledge distillation and contrastive learning.Knowledge distillation is used to extract transferable integration knowledge from the different source domains and the quality of the extracted integration knowledge is used to assign reasonable weights to each source domain.A more rational weighted average aggregation is used in the aggregation phase of the center server to optimize the global model,while the local model of the source domain is trained with the help of contrastive learning to constrain the local model optimum towards the global model optimum,mitigating the inherent heterogeneity between local data.Our experiments are conducted on the largest domain adaptation dataset,and the results show that compared with other traditional federated domain adaptation algorithms,the algorithm we proposed trains a more accurate model,requires fewer communication rounds,makes more effective use of imbalanced data in the industrial area,and protects data privacy.
基金supported by Science and Technology Research Project of Jiangxi Education Department.Project Grant No.GJJ2203306.
文摘Multimodal sentiment analysis is an essential area of research in artificial intelligence that combines multiple modes,such as text and image,to accurately assess sentiment.However,conventional approaches that rely on unimodal pre-trained models for feature extraction from each modality often overlook the intrinsic connections of semantic information between modalities.This limitation is attributed to their training on unimodal data,and necessitates the use of complex fusion mechanisms for sentiment analysis.In this study,we present a novel approach that combines a vision-language pre-trained model with a proposed multimodal contrastive learning method.Our approach harnesses the power of transfer learning by utilizing a vision-language pre-trained model to extract both visual and textual representations in a unified framework.We employ a Transformer architecture to integrate these representations,thereby enabling the capture of rich semantic infor-mation in image-text pairs.To further enhance the representation learning of these pairs,we introduce our proposed multimodal contrastive learning method,which leads to improved performance in sentiment analysis tasks.Our approach is evaluated through extensive experiments on two publicly accessible datasets,where we demonstrate its effectiveness.We achieve a significant improvement in sentiment analysis accuracy,indicating the supe-riority of our approach over existing techniques.These results highlight the potential of multimodal sentiment analysis and underscore the importance of considering the intrinsic semantic connections between modalities for accurate sentiment assessment.
文摘Bundle recommendation aims to provide users with convenient one-stop solutions by recommending bundles of related items that cater to their diverse needs. However, previous research has neglected the interaction between bundle and item views and relied on simplistic methods for predicting user-bundle relationships. To address this limitation, we propose Hybrid Contrastive Learning for Bundle Recommendation (HCLBR). Our approach integrates unsupervised and supervised contrastive learning to enrich user and bundle representations, promoting diversity. By leveraging interconnected views of user-item and user-bundle nodes, HCLBR enhances representation learning for robust recommendations. Evaluation on four public datasets demonstrates the superior performance of HCLBR over state-of-the-art baselines. Our findings highlight the significance of leveraging contrastive learning and interconnected views in bundle recommendation, providing valuable insights for marketing strategies and recommendation system design.