Remote sensing image super-resolution technology is pivotal for enhancing image quality in critical applications including environmental monitoring, urban planning, and disaster assessment. However, traditional methods exhibit deficiencies in detail recovery and noise suppression, particularly when processing complex landscapes (e.g., forests, farmlands), leading to artifacts and spectral distortions that limit practical utility. To address this, we propose an enhanced Super-Resolution Generative Adversarial Network (SRGAN) framework featuring three key innovations: (1) replacement of L1/L2 loss with a robust Charbonnier loss to suppress noise while preserving edge details via adaptive gradient balancing; (2) a multi-loss joint optimization strategy dynamically weighting Charbonnier loss (β=0.5), Visual Geometry Group (VGG) perceptual loss (α=1), and adversarial loss (γ=0.1) to synergize pixel-level accuracy and perceptual quality; (3) a multi-scale residual network (MSRN) capturing cross-scale texture features (e.g., forest canopies, mountain contours). Validated on Sentinel-2 (10 m) and SPOT-6/7 (2.5 m) datasets covering 904 km² in Motuo County, Xizang, our method outperforms the SRGAN baseline (SR4RS) with Peak Signal-to-Noise Ratio (PSNR) gains of 0.29 dB and Structural Similarity Index (SSIM) improvements of 3.08% on forest imagery. Visual comparisons confirm enhanced texture continuity despite marginal Learned Perceptual Image Patch Similarity (LPIPS) increases. The method significantly improves noise robustness and edge retention in complex geomorphology, demonstrating 18% faster response in forest fire early warning and providing high-resolution support for agricultural/urban monitoring. Future work will integrate spectral constraints and lightweight architectures.
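The loss design described in this abstract can be illustrated with a minimal sketch. It assumes the standard Charbonnier form sqrt(d² + ε²) and a fixed weighted sum using the reported α, β, and γ; the function names, the ε value, and the fixed (rather than dynamic) weighting are illustrative assumptions, not the authors' implementation.

```python
import math


def charbonnier_loss(pred, target, eps=1e-3):
    """Mean Charbonnier penalty sqrt(d^2 + eps^2) over paired pixel values.

    eps is an assumed smoothing constant; the abstract does not state one.
    """
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and equally long")
    return sum(
        math.sqrt((p - t) ** 2 + eps ** 2) for p, t in zip(pred, target)
    ) / len(pred)


def total_generator_loss(perceptual, charbonnier, adversarial,
                         alpha=1.0, beta=0.5, gamma=0.1):
    """Weighted sum of the three loss terms with the weights from the abstract.

    The weights are held fixed here for simplicity; the abstract describes
    them as dynamically weighted without giving the schedule.
    """
    return alpha * perceptual + beta * charbonnier + gamma * adversarial
```

Near zero error the Charbonnier term behaves like a smoothed L1 loss, which is one common explanation for its robustness to outliers compared with L2.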
Recommending personalized travel routes from sparse, implicit feedback poses a significant challenge, as conventional systems often struggle with information overload and fail to capture the complex, sequential nature of user preferences. To address this, we propose a Conditional Generative Adversarial Network (CGAN) that generates diverse and highly relevant itineraries. Our approach begins by constructing a conditional vector that encapsulates a user’s profile. This vector uniquely fuses embeddings from a Heterogeneous Information Network (HIN) to model complex user-place-route relationships, a Recurrent Neural Network (RNN) to capture sequential path dynamics, and Neural Collaborative Filtering (NCF) to incorporate collaborative signals from the wider user base. This comprehensive condition, further enhanced with features representing user interaction confidence and uncertainty, steers a CGAN stabilized by spectral normalization to generate high-fidelity latent route representations, effectively mitigating the data sparsity problem. Recommendations are then formulated using an Anchor-and-Expand algorithm, which selects relevant starting Points of Interest (POI) based on user history, then expands routes through latent similarity matching and geographic coherence optimization, culminating in Traveling Salesman Problem (TSP)-based route optimization for practical travel distances. Experiments on a real-world check-in dataset validate our model’s unique generative capability, achieving F1 scores ranging from 0.163 to 0.305, and near-zero pairs-F1 scores between 0.002 and 0.022. These results confirm the model’s success in generating novel travel routes by recommending new locations and sequences rather than replicating users’ past itineraries. This work provides a robust solution for personalized travel planning, capable of generating novel and compelling routes for both new and existing users by learning from collective travel intelligence.
Precipitation nowcasting is of great importance for disaster prevention and mitigation. However, precipitation is a complex spatio-temporal phenomenon influenced by various underlying physical factors. Even slight changes in the initial precipitation field can have a significant impact on the future precipitation patterns, making the nowcasting of short-term high-resolution precipitation a major challenge. Traditional deep learning methods often have difficulty capturing the long-term spatial dependence of precipitation and are usually at a low resolution. To address these issues, based upon the Simpler yet Better Video Prediction (SimVP) framework, we proposed a deep generative neural network that incorporates the Simple Parameter-Free Attention Module (SimAM) and Generative Adversarial Networks (GANs) for short-term high-resolution precipitation event forecasting. Through an adversarial training strategy, critical precipitation features were extracted from complex radar echo images. During the adversarial learning process, the dynamic competition between the generator and the discriminator could continuously enhance the model in prediction accuracy and resolution for short-term precipitation. Experimental results demonstrate that the proposed method could effectively forecast short-term precipitation events on various scales and showed the best overall performance among existing methods.
Recent years have witnessed the ever-increasing performance of Deep Neural Networks (DNNs) in computer vision tasks. However, researchers have identified a potential vulnerability: carefully crafted adversarial examples can easily mislead DNNs into incorrect behavior via the injection of imperceptible modification to the input data. In this survey, we focus on (1) adversarial attack algorithms to generate adversarial examples, (2) adversarial defense techniques to secure DNNs against adversarial examples, and (3) important problems in the realm of adversarial examples beyond attack and defense, including the theoretical explanations, trade-off issues and benign attacks in adversarial examples. Additionally, we draw a brief comparison between recently published surveys on adversarial examples, and identify the future directions for the research of adversarial examples, such as the generalization of methods and the understanding of transferability, that might be solutions to the open problems in this field.
Transformer-based models have significantly advanced binary code similarity detection (BCSD) by leveraging their semantic encoding capabilities for efficient function matching across diverse compilation settings. Although adversarial examples can strategically undermine the accuracy of BCSD models and protect critical code, existing techniques predominantly depend on inserting artificial instructions, which incur high computational costs and offer limited diversity of perturbations. To address these limitations, we propose AIMA, a novel gradient-guided assembly instruction relocation method. Our method decouples the detection model into tokenization, embedding, and encoding layers to enable efficient gradient computation. Since token IDs of instructions are discrete and nondifferentiable, we compute gradients in the continuous embedding space to evaluate the influence of each token. The most critical tokens are identified by calculating the L2 norm of their embedding gradients. We then establish a mapping between instructions and their corresponding tokens to aggregate token-level importance into instruction-level significance. To maximize adversarial impact, a sliding window algorithm selects the most influential contiguous segments for relocation, ensuring optimal perturbation with minimal length. This approach efficiently locates critical code regions without expensive search operations. The selected segments are relocated outside their original function boundaries via a jump mechanism, which preserves runtime control flow and functionality while introducing “deletion” effects in the static instruction sequence. Extensive experiments show that AIMA reduces similarity scores by up to 35.8% in state-of-the-art BCSD models. When incorporated into training data, it also enhances model robustness, achieving a 5.9% improvement in AUROC.
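The selection step described in this abstract (per-token L2 gradient norms, aggregation to instruction level, then a sliding window over contiguous instructions) might look roughly like the sketch below. It is not AIMA's code: summing token scores per instruction, using a single fixed window length, and all function names are assumptions made for illustration, and the embedding gradients are taken as already computed.

```python
import math


def token_importance(embedding_grads):
    """L2 norm of each token's embedding gradient (one vector per token)."""
    return [math.sqrt(sum(g * g for g in grad)) for grad in embedding_grads]


def instruction_importance(token_scores, token_to_instruction, num_instructions):
    """Aggregate token scores per instruction.

    Summation is an assumed aggregation rule; the abstract only says that
    token-level importance is aggregated into instruction-level significance.
    """
    scores = [0.0] * num_instructions
    for score, idx in zip(token_scores, token_to_instruction):
        scores[idx] += score
    return scores


def best_window(scores, window):
    """Return (start, total) of the contiguous window with the largest total."""
    if not 1 <= window <= len(scores):
        raise ValueError("window must be between 1 and len(scores)")
    current = sum(scores[:window])
    best_start, best_sum = 0, current
    for start in range(1, len(scores) - window + 1):
        # Slide by one: add the entering score, drop the leaving one.
        current += scores[start + window - 1] - scores[start - 1]
        if current > best_sum:
            best_start, best_sum = start, current
    return best_start, best_sum
```

The running-sum update keeps the scan linear in the number of instructions, which matches the abstract's point about avoiding expensive search.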
Tag recommendation systems can significantly improve the accuracy of information retrieval by recommending relevant tag sets that align with user preferences and resource characteristics. However, metric learning methods often suffer from high sensitivity, leading to unstable recommendation results when facing adversarial samples generated through malicious user behavior. Adversarial training is considered to be an effective method for improving the robustness of tag recommendation systems and addressing adversarial samples. However, it still faces the challenge of overfitting. Although curriculum learning-based adversarial training somewhat mitigates this issue, challenges still exist, such as the lack of a quantitative standard for attack intensity and catastrophic forgetting. To address these challenges, we propose a Self-Paced Adversarial Metric Learning (SPAML) method. First, we employ a metric learning model to capture the deep distance relationships between normal samples. Then, we incorporate a self-paced adversarial training model, which dynamically adjusts the weights of adversarial samples, allowing the model to progressively learn from simpler to more complex adversarial samples. Finally, we jointly optimize the metric learning loss and self-paced adversarial training loss in an adversarial manner, enhancing the robustness and performance of tag recommendation tasks. Extensive experiments on the MovieLens and LastFm datasets demonstrate that SPAML achieves F1@3 and NDCG@3 scores of 22% and 32.7% on the MovieLens dataset, and 19.4% and 29% on the LastFm dataset, respectively, outperforming the most competitive baselines. Specifically, F1@3 improves by 4.7% and 6.8%, and NDCG@3 improves by 5.0% and 6.9%, respectively.
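As a rough illustration of how self-paced weighting can order adversarial samples from easy to hard, the sketch below uses the generic linear soft-weighting rule from the self-paced learning literature. The abstract does not specify SPAML's weighting function, so the rule, the pace parameter `lam`, and the function names are all assumptions.

```python
def self_paced_weights(losses, lam):
    """Linear soft weights: v = max(0, 1 - loss / lam).

    Samples with loss >= lam are ignored; raising lam over training
    gradually admits harder adversarial samples. This is an assumed,
    generic rule, not necessarily the one SPAML uses.
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    return [max(0.0, 1.0 - loss / lam) for loss in losses]


def joint_loss(metric_loss, adversarial_losses, lam):
    """Metric loss plus the self-paced weighted mean adversarial loss."""
    weights = self_paced_weights(adversarial_losses, lam)
    weighted = sum(w * l for w, l in zip(weights, adversarial_losses))
    return metric_loss + weighted / len(adversarial_losses)
```

With a small `lam` only low-loss (easy) adversarial samples contribute; as `lam` grows, previously excluded hard samples receive non-zero weight.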
The emergence of adversarial examples has revealed the inadequacies in the robustness of image classification models based on Convolutional Neural Networks (CNNs). Particularly in recent years, the discovery of natural adversarial examples has posed significant challenges, as traditional defense methods against adversarial attacks have proven to be largely ineffective against these natural adversarial examples. This paper explores defenses against these natural adversarial examples from three perspectives: adversarial examples, model architecture, and dataset. First, it employs Class Activation Mapping (CAM) to visualize how models classify natural adversarial examples, identifying several typical attack patterns. Next, various common CNN models are analyzed to evaluate their susceptibility to these attacks, revealing that different architectures exhibit varying defensive capabilities. The study finds that as the depth of a network increases, its defenses against natural adversarial examples strengthen. Finally, the impact of dataset class distribution on the defense capability of models is examined, focusing on two aspects: the number of classes in the training set and the number of predicted classes. This study investigates how these factors influence the model’s ability to defend against natural adversarial examples. Results indicate that reducing the number of training classes enhances the model’s defense against natural adversarial examples. Additionally, under a fixed number of training classes, some CNN models show an optimal range of predicted classes for achieving the best defense performance against these adversarial examples.
Transfer-based Adversarial Attacks (TAAs) can deceive a victim model even without prior knowledge. This is achieved by leveraging the transferability of adversarial examples: when generated from a surrogate model, they retain their adversarial features when applied to other models. However, adversarial examples often exhibit overfitting, as they are tailored to exploit the particular architecture and feature representation of source models. Consequently, when attempting black-box transfer attacks on different target models, their effectiveness is decreased. To solve this problem, this study proposes an approach based on a Regularized Constrained Feature Layer (RCFL). The proposed method first uses regularization constraints to attenuate the low-frequency components of the initial examples. Perturbations are then added to a pre-specified layer of the source model using the back-propagation technique, in order to modify the original adversarial examples. Afterward, a regularized loss function is used to enhance the black-box transferability between different target models. The proposed method is finally tested on the ImageNet, CIFAR-100, and Stanford Car datasets with various target models. The obtained results demonstrate that it achieves a significantly higher transfer-based adversarial attack success rate compared with baseline techniques.
The Internet of Things (IoT) is integral to modern infrastructure, enabling connectivity among a wide range of devices from home automation to industrial control systems. With the exponential increase in data generated by these interconnected devices, robust anomaly detection mechanisms are essential. Anomaly detection in this dynamic environment necessitates methods that can accurately distinguish between normal and anomalous behavior by learning intricate patterns. This paper presents a novel approach utilizing generative adversarial networks (GANs) for anomaly detection in IoT systems. However, optimizing GANs involves tuning hyper-parameters such as learning rate, batch size, and optimization algorithms, which can be challenging due to the non-convex nature of GAN loss functions. To address this, we propose a five-dimensional Gray wolf optimizer (5DGWO) to optimize GAN hyper-parameters. The 5DGWO introduces two new types of wolves: gamma (γ) for improved exploitation and convergence, and theta (θ) for enhanced exploration and escaping local minima. The proposed system framework comprises four key stages: 1) preprocessing, 2) generative model training, 3) autoencoder (AE) training, and 4) predictive model training. The generative models are utilized to assist the AE training, and the final predictive models (including convolutional neural network (CNN), deep belief network (DBN), recurrent neural network (RNN), random forest (RF), and extreme gradient boosting (XGBoost)) are trained using the generated data and AE-encoded features. We evaluated the system on three benchmark datasets: NSL-KDD, UNSW-NB15, and IoT-23. Experiments conducted on diverse IoT datasets show that our method outperforms existing anomaly detection strategies and significantly reduces false positives. The 5DGWO-GAN-CNNAE exhibits superior performance in various metrics, including accuracy, recall, precision, root mean square error (RMSE), and convergence trend. The proposed 5DGWO-GAN-CNNAE achieved the lowest RMSE values across the NSL-KDD, UNSW-NB15, and IoT-23 datasets, with values of 0.24, 1.10, and 0.09, respectively. Additionally, it attained the highest accuracy, ranging from 94% to 100%. These results suggest a promising direction for future IoT security frameworks, offering a scalable and efficient solution to safeguard against evolving cyber threats.
To address the widespread data shortage problem in battery research, this paper proposes a generative adversarial network model that combines deep convolutional networks, the Wasserstein distance, and the gradient penalty to achieve data augmentation. To lower the threshold for implementing the proposed method, transfer learning is further introduced. The W-DC-GAN-GP-TL framework is thereby formed. This framework is evaluated on 3 different publicly available datasets to judge the quality of generated data. Through visual comparisons and the examination of two visualization methods (probability density function (PDF) and principal component analysis (PCA)), it is demonstrated that the generated data is hard to distinguish from the real data. The application of generated data for training a battery state model using transfer learning is further evaluated. Specifically, Bi-GRU-based and Transformer-based methods are implemented on 2 separate datasets for estimating state of health (SOH) and state of charge (SOC), respectively. The results indicate that the proposed framework demonstrates satisfactory performance in different scenarios: for the data replacement scenario, where real data are removed and replaced with generated data, the state estimator accuracy decreases only slightly; for the data enhancement scenario, the estimator accuracy is further improved. The estimation error for SOH and SOC is as low as 0.69% and 0.58% root mean square error (RMSE), respectively, after applying the proposed framework. This framework provides a reliable method for enriching battery measurement data. It is a generalized framework capable of generating a variety of time series data.
Deep neural networks are extremely vulnerable to intentionally generated adversarial examples, which are created by overlaying tiny noise on clean images. However, most existing transfer-based attack methods add perturbations to every pixel of the original image with the same weight, resulting in redundant noise in the adversarial examples, which makes them easier to detect. To address this, a novel attention-guided sparse adversarial attack strategy with gradient dropout, which can be readily incorporated into existing gradient-based methods, is introduced to minimize the intensity and the scale of perturbations while ensuring the effectiveness of adversarial examples. Specifically, in the gradient dropout phase, some relatively unimportant gradient information is randomly discarded to limit the intensity of the perturbation. In the attention-guided phase, the influence of each pixel on the model output is evaluated by using a soft mask-refined attention mechanism, and the perturbation of those pixels with smaller influence is limited to restrict the scale of the perturbation. Thorough experiments on the NeurIPS 2017 adversarial dataset and the ILSVRC 2012 validation dataset show that the proposed strategy can significantly reduce the superfluous noise present in adversarial examples while keeping their attack efficacy intact. For instance, in attacks on adversarially trained models, integrating the strategy reduces the average level of noise injected into images by 8.32%, while the average attack success rate decreases by only 0.34%. Furthermore, the strategy can substantially raise the attack success rate by introducing only a slight degree of perturbation.
Large Language Models (LLMs) have significantly advanced human-computer interaction by improving natural language understanding and generation. However, their vulnerability to adversarial prompts, carefully designed inputs that manipulate model outputs, presents substantial challenges. This paper introduces a classification-based approach to detect adversarial prompts by utilizing both prompt features and prompt response features. Eleven machine learning models were evaluated based on key metrics such as accuracy, precision, recall, and F1-score. The results show that the Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM) cascade model delivers the best performance, especially when using prompt features, achieving an accuracy of over 97% in all adversarial scenarios. Furthermore, the Support Vector Machine (SVM) model performed best with prompt response features, particularly excelling in prompt type classification tasks. Classification results revealed that certain types of adversarial attacks, such as “Word Level” and “Adversarial Prefix”, were particularly difficult to detect, as indicated by their low recall and F1-scores. These findings suggest that more subtle manipulations can evade detection mechanisms. In contrast, attacks like “Sentence Level” and “Adversarial Insertion” were easier to identify, due to the model’s effectiveness in recognizing inserted content. Natural Language Processing (NLP) techniques played a critical role by enabling the extraction of semantic and syntactic features from both prompts and their corresponding responses. These insights highlight the importance of combining traditional and deep learning approaches, along with advanced NLP techniques, to build more reliable adversarial prompt detection systems for LLMs.
Existing imaging techniques cannot simultaneously achieve high resolution and a wide field of view, and manual multi-mineral segmentation in shale lacks precision. To address these limitations, we propose a comprehensive framework based on generative adversarial network (GAN) for characterizing pore structure properties of shale, which incorporates image augmentation, super-resolution reconstruction, and multi-mineral auto-segmentation. Using real 2D and 3D shale images, the framework was assessed through correlation function, entropy, porosity, pore size distribution, and permeability. The application results show that this framework enables the enhancement of 3D low-resolution digital cores by a scale factor of 8, without paired shale images, effectively reconstructing the unresolved fine-scale pores under a low resolution, rather than merely denoising, deblurring, and edge clarification. The trained GAN-based segmentation model effectively improves manual multi-mineral segmentation results, resulting in a strong resemblance to real samples in terms of pore size distribution and permeability. This framework significantly improves the characterization of complex shale microstructures and can be expanded to other heterogeneous porous media, such as carbonate, coal, and tight sandstone reservoirs.
Effectively handling imbalanced datasets remains a fundamental challenge in computational modeling and machine learning, particularly when class overlap significantly deteriorates classification performance. Traditional oversampling methods often generate synthetic samples without considering density variations, leading to redundant or misleading instances that exacerbate class overlap in high-density regions. To address these limitations, we propose Wasserstein Generative Adversarial Network Variational Density Estimation (WGAN-VDE), a computationally efficient density-aware adversarial resampling framework that enhances minority class representation while strategically reducing class overlap. The originality of WGAN-VDE lies in its density-aware sample refinement, ensuring that synthetic samples are positioned in underrepresented regions, thereby improving class distinctiveness. By applying structured feature representation, targeted sample generation, and density-based selection mechanisms, the proposed framework ensures the generation of well-separated and diverse synthetic samples, improving class separability and reducing redundancy. The experimental evaluation on 20 benchmark datasets demonstrates that this approach outperforms 11 state-of-the-art rebalancing techniques, achieving superior results in F1-score, Accuracy, G-Mean, and AUC metrics. These results establish the proposed method as an effective and robust computational approach, suitable for diverse engineering and scientific applications involving imbalanced data classification and computational modeling.
The development of generative architectures has resulted in numerous novel deep-learning models that generate images using text inputs. However, humans naturally use speech for visualization prompts. Therefore, this paper proposes an architecture that integrates speech prompts as input to an image-generation Generative Adversarial Network (GAN) model, leveraging Speech-to-Text translation along with the CLIP+VQGAN model. The proposed method involves translating speech prompts into text, which is then used by the Contrastive Language-Image Pretraining (CLIP) + Vector Quantized Generative Adversarial Network (VQGAN) model to generate images. This paper outlines the steps required to implement such a model and describes in detail the methods used for evaluating the model. The GAN model successfully generates artwork from descriptions using speech and text prompts. Experimental outcomes of synthesized images demonstrate that the proposed methodology can produce beautiful abstract visuals containing elements from the input prompts. The model achieved a Fréchet Inception Distance (FID) score of 28.75, showcasing its capability to produce high-quality and diverse images. The proposed model can find numerous applications in educational, artistic, and design spaces due to its ability to generate images using speech and the distinct abstract artistry of the output images. This capability is demonstrated by giving the model out-of-the-box prompts to generate never-before-seen images with plausible realistic qualities.
In recent work, adversarial stickers are widely used to attack face recognition (FR) systems in the physical world. However, it is difficult to evaluate the performance of physical attacks because of the lack of volunteers in the experiment. In this paper, a simple attack method called incomplete physical adversarial attack (IPAA) is proposed to simulate physical attacks. Different from the process of physical attacks, when an IPAA is conducted, a photo of the adversarial sticker is embedded into a facial image as the input to attack FR systems, which can obtain results similar to those of physical attacks without inviting any volunteers. The results show that IPAA has a higher similarity with physical attacks than digital attacks, indicating that IPAA is able to evaluate the performance of physical attacks. IPAA is effective in quantitatively measuring the impact of the sticker location on the results of attacks.
With expeditious advancements in AI-driven facial manipulation techniques, particularly deepfake technology, there is growing concern over its potential misuse. Deepfakes pose a significant threat to society, particularly by infringing on individuals’ privacy. Amid significant endeavors to fabricate systems for identifying deepfake fabrications, existing methodologies often face hurdles in adjusting to innovative forgery techniques and demonstrate increased vulnerability to image and video clarity variations, thereby hindering their broad applicability to images and videos produced by unfamiliar technologies. In this manuscript, we endorse resilient training tactics to amplify generalization capabilities. In adversarial training, models are trained using deliberately crafted samples to deceive classification systems, thereby significantly enhancing their generalization ability. In response to this challenge, we propose an innovative hybrid adversarial training framework integrating Virtual Adversarial Training (VAT) with Two-Generated Blurred Adversarial Training. This combined framework bolsters the model’s resilience in detecting deepfakes made using unfamiliar deep learning technologies. Through such adversarial training, models are prompted to acquire more versatile attributes. Through experimental studies, we demonstrate that our model achieves higher accuracy than existing models.
Hydrogen energy is a crucial support for China’s low-carbon energy transition. With the large-scale integration of renewable energy, the combination of hydrogen and integrated energy systems has become one of the most promising directions of development. This paper proposes an optimized scheduling model for a hydrogen-coupled electro-heat-gas integrated energy system (HCEHG-IES) using generative adversarial imitation learning (GAIL). The model aims to enhance renewable-energy absorption, reduce carbon emissions, and improve grid-regulation flexibility. First, the optimal scheduling problem of HCEHG-IES under uncertainty is modeled as a Markov decision process (MDP). To overcome the limitations of conventional deep reinforcement learning algorithms (including long optimization time, slow convergence, and subjective reward design), this study augments the PPO algorithm by incorporating a discriminator network and expert data. The newly developed algorithm, termed GAIL, enables the agent to perform imitation learning from expert data. Based on this model, dynamic scheduling decisions are made in continuous state and action spaces, generating optimal energy-allocation and management schemes. Simulation results indicate that, compared with traditional reinforcement-learning algorithms, the proposed algorithm offers better economic performance. Guided by expert data, the agent avoids blind optimization, shortens the offline training time, and improves convergence performance. In the online phase, the algorithm enables flexible energy utilization, thereby promoting renewable-energy absorption and reducing carbon emissions.
This article proposes an innovative adversarial attack method,AMA(Adaptive Multimodal Attack),which introduces an adaptive feedback mechanism by dynamically adjusting the perturbation strength.Specifically,AMA adjusts...This article proposes an innovative adversarial attack method,AMA(Adaptive Multimodal Attack),which introduces an adaptive feedback mechanism by dynamically adjusting the perturbation strength.Specifically,AMA adjusts perturbation amplitude based on task complexity and optimizes the perturbation direction based on the gradient direction in real time to enhance attack efficiency.Experimental results demonstrate that AMA elevates attack success rates from approximately 78.95%to 89.56%on visual question answering and from78.82%to 84.96%on visual reasoning tasks across representative vision-language benchmarks.These findings demonstrate AMA’s superior attack efficiency and reveal the vulnerability of current visual language models to carefully crafted adversarial examples,underscoring the need to enhance their robustness.展开更多
Adversarial attacks pose a significant threat to artificial intelligence systems by exploiting vulnerabilities in deep learning models. Existing defense mechanisms often suffer from drawbacks such as the need for model retraining, significant inference-time overhead, and limited effectiveness against specific attack types. Achieving perfect defense against adversarial attacks remains elusive, emphasizing the importance of mitigation strategies. In this study, we propose a defense mechanism that applies random cropping and Gaussian filtering to input images to mitigate the impact of adversarial attacks. First, the image was randomly cropped to vary its dimensions and then placed at the center of a fixed 299×299 space, with the remaining areas filled with zero padding. Subsequently, Gaussian filtering with a 7×7 kernel and a standard deviation of two was applied using a convolution operation. Finally, the smoothed image was fed into the classification model. The proposed defense method consistently appeared in the upper-right region across all attack scenarios, demonstrating its ability to preserve classification performance on clean images while significantly mitigating adversarial attacks. This visualization confirms that the proposed method is effective and reliable for defending against adversarial perturbations. Moreover, the proposed method incurs minimal computational overhead, making it suitable for real-time applications. Furthermore, owing to its model-agnostic nature, the proposed method can be easily incorporated into various neural network architectures, serving as a fundamental module for adversarial defense strategies.
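The preprocessing pipeline described in this abstract (random crop, centering on a zero-padded 299×299 canvas, then 7×7 Gaussian smoothing with a standard deviation of two) can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names, the `min_side` lower bound on the crop size, and the assumption of float images in height × width × channel layout are all hypothetical choices made for the example.

```python
import numpy as np

def gaussian_kernel(size=7, sigma=2.0):
    """Normalized 2-D Gaussian kernel (7x7, sigma=2 as stated in the abstract)."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    k = np.outer(g, g)
    return k / k.sum()

def random_crop_and_pad(image, canvas=299, min_side=200, rng=None):
    """Randomly crop an HxWxC image and center it on a zero canvas.

    min_side is an assumed lower bound; the abstract gives no crop range.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    ch = int(rng.integers(min(min_side, h), min(h, canvas) + 1))
    cw = int(rng.integers(min(min_side, w), min(w, canvas) + 1))
    top = int(rng.integers(0, h - ch + 1))
    left = int(rng.integers(0, w - cw + 1))
    out = np.zeros((canvas, canvas, image.shape[2]), dtype=np.float64)
    oy, ox = (canvas - ch) // 2, (canvas - cw) // 2
    out[oy:oy + ch, ox:ox + cw] = image[top:top + ch, left:left + cw]
    return out

def gaussian_smooth(image, size=7, sigma=2.0):
    """Convolve each channel with the Gaussian kernel (zero-padded borders)."""
    kernel = gaussian_kernel(size, sigma)
    pad = size // 2
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)), mode="constant")
    h, w = image.shape[:2]
    out = np.zeros(image.shape, dtype=np.float64)
    # The kernel is symmetric, so correlation and convolution coincide.
    for dy in range(size):
        for dx in range(size):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def preprocess(image, rng=None):
    """Crop, pad, and smooth; the result would then go to the classifier."""
    return gaussian_smooth(random_crop_and_pad(image, rng=rng))
```

Because the steps only touch the input image, a sketch like this can sit in front of any classifier, which is consistent with the model-agnostic claim in the abstract.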
Funding: This study was supported by the Inner Mongolia Academy of Forestry Sciences Open Research Project (Grant No. KF2024MS03), the Project to Improve the Scientific Research Capacity of the Inner Mongolia Academy of Forestry Sciences (Grant No. 2024NLTS04), and the Innovation and Entrepreneurship Training Program for Undergraduates of Beijing Forestry University (Grant No. X202410022268).
Abstract: Remote sensing image super-resolution technology is pivotal for enhancing image quality in critical applications including environmental monitoring, urban planning, and disaster assessment. However, traditional methods exhibit deficiencies in detail recovery and noise suppression, particularly when processing complex landscapes (e.g., forests, farmlands), leading to artifacts and spectral distortions that limit practical utility. To address this, we propose an enhanced Super-Resolution Generative Adversarial Network (SRGAN) framework featuring three key innovations: (1) replacement of L1/L2 loss with a robust Charbonnier loss to suppress noise while preserving edge details via adaptive gradient balancing; (2) a multi-loss joint optimization strategy dynamically weighting Charbonnier loss (β=0.5), Visual Geometry Group (VGG) perceptual loss (α=1), and adversarial loss (γ=0.1) to synergize pixel-level accuracy and perceptual quality; (3) a multi-scale residual network (MSRN) capturing cross-scale texture features (e.g., forest canopies, mountain contours). Validated on Sentinel-2 (10 m) and SPOT-6/7 (2.5 m) datasets covering 904 km² in Motuo County, Xizang, our method outperforms the SRGAN baseline (SR4RS) with Peak Signal-to-Noise Ratio (PSNR) gains of 0.29 dB and Structural Similarity Index (SSIM) improvements of 3.08% on forest imagery. Visual comparisons confirm enhanced texture continuity despite marginal Learned Perceptual Image Patch Similarity (LPIPS) increases. The method significantly improves noise robustness and edge retention in complex geomorphology, demonstrating 18% faster response in forest fire early warning and providing high-resolution support for agricultural/urban monitoring. Future work will integrate spectral constraints and lightweight architectures.
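To make the loss terms in this abstract concrete, the sketch below computes a Charbonnier loss and combines it with perceptual and adversarial terms using the stated weights (α=1, β=0.5, γ=0.1). It is an illustration under assumptions: the smoothing constant `eps`, the function names, and the use of fixed rather than dynamically adjusted weights are hypothetical, and the perceptual and adversarial values are taken as precomputed scalars instead of being derived from a VGG network or a discriminator.

```python
import numpy as np

def charbonnier_loss(pred, target, eps=1e-3):
    """Mean of sqrt(diff^2 + eps^2): behaves like L2 near zero and like L1
    for large errors, which makes it less sensitive to outliers than L2.
    eps is an assumed value; the abstract does not state one."""
    diff = np.asarray(pred, dtype=np.float64) - np.asarray(target, dtype=np.float64)
    return float(np.mean(np.sqrt(diff * diff + eps * eps)))

def joint_loss(pred, target, perceptual, adversarial,
               alpha=1.0, beta=0.5, gamma=0.1):
    """Weighted sum using the weights quoted in the abstract.

    perceptual and adversarial are placeholders for values that a VGG
    feature loss and a discriminator would supply in a real training loop.
    """
    return (alpha * perceptual
            + beta * charbonnier_loss(pred, target)
            + gamma * adversarial)
```

With identical prediction and target, the Charbonnier term reduces to `eps`, so the pixel term never reaches exactly zero; this is the usual price of keeping the loss differentiable at the origin.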
Funding: Supported by the Chung-Ang University Research Grants in 2023. The work is also supported by the ELLIIT Excellence Center at Linköping–Lund in Information Technology in Sweden.
Abstract: Recommending personalized travel routes from sparse, implicit feedback poses a significant challenge, as conventional systems often struggle with information overload and fail to capture the complex, sequential nature of user preferences. To address this, we propose a Conditional Generative Adversarial Network (CGAN) that generates diverse and highly relevant itineraries. Our approach begins by constructing a conditional vector that encapsulates a user's profile. This vector fuses embeddings from a Heterogeneous Information Network (HIN) to model complex user-place-route relationships, a Recurrent Neural Network (RNN) to capture sequential path dynamics, and Neural Collaborative Filtering (NCF) to incorporate collaborative signals from the wider user base. This comprehensive condition, further enhanced with features representing user interaction confidence and uncertainty, steers a CGAN stabilized by spectral normalization to generate high-fidelity latent route representations, effectively mitigating the data sparsity problem. Recommendations are then formulated using an Anchor-and-Expand algorithm, which selects relevant starting Points of Interest (POI) based on user history, then expands routes through latent similarity matching and geographic coherence optimization, culminating in Traveling Salesman Problem (TSP)-based route optimization for practical travel distances. Experiments on a real-world check-in dataset validate our model's unique generative capability, achieving F1 scores ranging from 0.163 to 0.305 and near-zero pairs-F1 scores between 0.002 and 0.022. These results confirm the model's success in generating novel travel routes by recommending new locations and sequences rather than replicating users' past itineraries. This work provides a robust solution for personalized travel planning, capable of generating novel and compelling routes for both new and existing users by learning from collective travel intelligence.
Funding: Supported by the National Natural Science Foundation of China (No. 42306214), the Postdoctoral Innovative Talents Support Program of Shandong Province (No. SDBX2022026), the China Postdoctoral Science Foundation (No. 2023M733533), and the Special Research Assistant Project of the Chinese Academy of Sciences in 2022.
Abstract: Precipitation nowcasting is of great importance for disaster prevention and mitigation. However, precipitation is a complex spatio-temporal phenomenon influenced by various underlying physical factors. Even slight changes in the initial precipitation field can have a significant impact on future precipitation patterns, making the nowcasting of short-term high-resolution precipitation a major challenge. Traditional deep learning methods often have difficulty capturing the long-term spatial dependence of precipitation and usually operate at a low resolution. To address these issues, based upon the Simpler yet Better Video Prediction (SimVP) framework, we proposed a deep generative neural network that incorporates the Simple Parameter-Free Attention Module (SimAM) and Generative Adversarial Networks (GANs) for short-term high-resolution precipitation event forecasting. Through an adversarial training strategy, critical precipitation features were extracted from complex radar echo images. During the adversarial learning process, the dynamic competition between the generator and the discriminator could continuously enhance the model's prediction accuracy and resolution for short-term precipitation. Experimental results demonstrate that the proposed method could effectively forecast short-term precipitation events on various scales and showed the best overall performance among existing methods.
Funding: Supported by the National Natural Science Foundation of China (U1903214, 62372339, 62371350, 61876135), the Ministry of Education Industry University Cooperative Education Project (202102246004, 220800006041043, 202002142012), and the Fundamental Research Funds for the Central Universities (2042023kf1033).
Abstract: Recent years have witnessed the ever-increasing performance of Deep Neural Networks (DNNs) in computer vision tasks. However, researchers have identified a potential vulnerability: carefully crafted adversarial examples can easily mislead DNNs into incorrect behavior via the injection of imperceptible modifications to the input data. In this survey, we focus on (1) adversarial attack algorithms to generate adversarial examples, (2) adversarial defense techniques to secure DNNs against adversarial examples, and (3) important problems in the realm of adversarial examples beyond attack and defense, including theoretical explanations, trade-off issues, and benign attacks. Additionally, we draw a brief comparison between recently published surveys on adversarial examples and identify future research directions, such as the generalization of methods and the understanding of transferability, that might offer solutions to the open problems in this field.
Funding: Supported by the Key Laboratory of Cyberspace Security, Ministry of Education, China.
Abstract: Transformer-based models have significantly advanced binary code similarity detection (BCSD) by leveraging their semantic encoding capabilities for efficient function matching across diverse compilation settings. Although adversarial examples can strategically undermine the accuracy of BCSD models and protect critical code, existing techniques predominantly depend on inserting artificial instructions, which incur high computational costs and offer limited diversity of perturbations. To address these limitations, we propose AIMA, a novel gradient-guided assembly instruction relocation method. Our method decouples the detection model into tokenization, embedding, and encoding layers to enable efficient gradient computation. Since token IDs of instructions are discrete and nondifferentiable, we compute gradients in the continuous embedding space to evaluate the influence of each token. The most critical tokens are identified by calculating the L2 norm of their embedding gradients. We then establish a mapping between instructions and their corresponding tokens to aggregate token-level importance into instruction-level significance. To maximize adversarial impact, a sliding window algorithm selects the most influential contiguous segments for relocation, ensuring optimal perturbation with minimal length. This approach efficiently locates critical code regions without expensive search operations. The selected segments are relocated outside their original function boundaries via a jump mechanism, which preserves runtime control flow and functionality while introducing "deletion" effects in the static instruction sequence. Extensive experiments show that AIMA reduces similarity scores by up to 35.8% in state-of-the-art BCSD models. When incorporated into training data, it also enhances model robustness, achieving a 5.9% improvement in AUROC.
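The selection step described in this abstract (token importance from the L2 norm of embedding gradients, aggregation to instruction level, then a sliding window over contiguous instructions) can be illustrated with a small NumPy sketch. Everything here is an assumption made for the example: the function names are hypothetical, the gradients are taken as a precomputed array, and summation is used as the aggregation rule because the abstract does not say how token scores are combined.

```python
import numpy as np

def instruction_importance(token_grads, token_to_instr, n_instr):
    """Aggregate per-token gradient norms into per-instruction scores.

    token_grads:    (n_tokens, emb_dim) gradients w.r.t. token embeddings
    token_to_instr: index of the instruction each token belongs to
    Summation is an assumed aggregation rule.
    """
    token_scores = np.linalg.norm(token_grads, axis=1)  # L2 norm per token
    scores = np.zeros(n_instr)
    np.add.at(scores, token_to_instr, token_scores)     # unbuffered scatter-add
    return scores

def best_window(scores, width):
    """Return [start, end) of the contiguous window with the largest total."""
    sums = np.convolve(scores, np.ones(width), mode="valid")
    start = int(np.argmax(sums))
    return start, start + width

# Toy example: 5 tokens spread over 4 instructions.
grads = np.array([[1.0, 0.0], [3.0, 4.0], [0.0, 0.0], [0.0, 2.0], [0.0, 0.0]])
mapping = np.array([0, 1, 1, 2, 3])
scores = instruction_importance(grads, mapping, n_instr=4)
window = best_window(scores, width=2)
```

A single pass over the window sums is enough to find the best segment, which matches the abstract's point that critical regions are located without an expensive search.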
Funding: Supported by the Key Research and Development Program of Zhejiang Province (No. 2024C01071) and the Natural Science Foundation of Zhejiang Province (No. LQ15F030006).
Abstract: Tag recommendation systems can significantly improve the accuracy of information retrieval by recommending relevant tag sets that align with user preferences and resource characteristics. However, metric learning methods often suffer from high sensitivity, leading to unstable recommendation results when facing adversarial samples generated through malicious user behavior. Adversarial training is considered an effective method for improving the robustness of tag recommendation systems against adversarial samples. However, it still faces the challenge of overfitting. Although curriculum learning-based adversarial training somewhat mitigates this issue, challenges remain, such as the lack of a quantitative standard for attack intensity and catastrophic forgetting. To address these challenges, we propose a Self-Paced Adversarial Metric Learning (SPAML) method. First, we employ a metric learning model to capture the deep distance relationships between normal samples. Then, we incorporate a self-paced adversarial training model, which dynamically adjusts the weights of adversarial samples, allowing the model to progressively learn from simpler to more complex adversarial samples. Finally, we jointly optimize the metric learning loss and self-paced adversarial training loss in an adversarial manner, enhancing the robustness and performance of tag recommendation tasks. Extensive experiments on the MovieLens and LastFm datasets demonstrate that SPAML achieves F1@3 and NDCG@3 scores of 22% and 32.7% on the MovieLens dataset, and 19.4% and 29% on the LastFm dataset, respectively, outperforming the most competitive baselines. Specifically, F1@3 improves by 4.7% and 6.8%, and NDCG@3 improves by 5.0% and 6.9%, respectively.
Abstract: The emergence of adversarial examples has revealed the inadequacies in the robustness of image classification models based on Convolutional Neural Networks (CNNs). Particularly in recent years, the discovery of natural adversarial examples has posed significant challenges, as traditional defense methods against adversarial attacks have proven to be largely ineffective against them. This paper explores defenses against natural adversarial examples from three perspectives: adversarial examples, model architecture, and dataset. First, it employs Class Activation Mapping (CAM) to visualize how models classify natural adversarial examples, identifying several typical attack patterns. Next, various common CNN models are analyzed to evaluate their susceptibility to these attacks, revealing that different architectures exhibit varying defensive capabilities. The study finds that as the depth of a network increases, its defenses against natural adversarial examples strengthen. Finally, the impact of dataset class distribution on the defense capability of models is examined, focusing on two aspects: the number of classes in the training set and the number of predicted classes. This study investigates how these factors influence the model's ability to defend against natural adversarial examples. Results indicate that reducing the number of training classes enhances the model's defense against natural adversarial examples. Additionally, under a fixed number of training classes, some CNN models show an optimal range of predicted classes for achieving the best defense performance against these adversarial examples.
Funding: Supported by the Intelligent Policing Key Laboratory of Sichuan Province (No. ZNJW2022KFZD002) and the Scientific and Technological Research Program of Chongqing Municipal Education Commission (Grant Nos. KJQN202302403, KJQN202303111).
Abstract: Transfer-based Adversarial Attacks (TAAs) can deceive a victim model even without prior knowledge of it. This is achieved by leveraging the transferability of adversarial examples: when generated from a surrogate model, they retain their adversarial features when applied to other models. However, adversarial examples often exhibit overfitting, as they are tailored to exploit the particular architecture and feature representation of source models. Consequently, when attempting black-box transfer attacks on different target models, their effectiveness decreases. To solve this problem, this study proposes an approach based on a Regularized Constrained Feature Layer (RCFL). The proposed method first uses regularization constraints to attenuate the low-frequency components of the initial examples. Perturbations are then added to a pre-specified layer of the source model using back-propagation in order to modify the original adversarial examples. Afterward, a regularized loss function is used to enhance the black-box transferability between different target models. The proposed method is finally tested on the ImageNet, CIFAR-100, and Stanford Car datasets with various target models. The obtained results demonstrate that it achieves a significantly higher transfer-based adversarial attack success rate compared with baseline techniques.
Funding: The work described in this paper has been developed within the project PRESECREL (PID2021-124502OB-C43).
Abstract: The Internet of Things (IoT) is integral to modern infrastructure, enabling connectivity among a wide range of devices from home automation to industrial control systems. With the exponential increase in data generated by these interconnected devices, robust anomaly detection mechanisms are essential. Anomaly detection in this dynamic environment necessitates methods that can accurately distinguish between normal and anomalous behavior by learning intricate patterns. This paper presents a novel approach utilizing generative adversarial networks (GANs) for anomaly detection in IoT systems. However, optimizing GANs involves tuning hyper-parameters such as learning rate, batch size, and optimization algorithms, which can be challenging due to the non-convex nature of GAN loss functions. To address this, we propose a five-dimensional Gray wolf optimizer (5DGWO) to optimize GAN hyper-parameters. The 5DGWO introduces two new types of wolves: gamma (γ) for improved exploitation and convergence, and theta (θ) for enhanced exploration and escaping local minima. The proposed system framework comprises four key stages: 1) preprocessing, 2) generative model training, 3) autoencoder (AE) training, and 4) predictive model training. The generative models are utilized to assist the AE training, and the final predictive models (including convolutional neural network (CNN), deep belief network (DBN), recurrent neural network (RNN), random forest (RF), and extreme gradient boosting (XGBoost)) are trained using the generated data and AE-encoded features. We evaluated the system on three benchmark datasets: NSL-KDD, UNSW-NB15, and IoT-23. Experiments conducted on diverse IoT datasets show that our method outperforms existing anomaly detection strategies and significantly reduces false positives. The 5DGWO-GAN-CNNAE exhibits superior performance in various metrics, including accuracy, recall, precision, root mean square error (RMSE), and convergence trend. The proposed 5DGWO-GAN-CNNAE achieved the lowest RMSE values across the NSL-KDD, UNSW-NB15, and IoT-23 datasets, with values of 0.24, 1.10, and 0.09, respectively. Additionally, it attained the highest accuracy, ranging from 94% to 100%. These results suggest a promising direction for future IoT security frameworks, offering a scalable and efficient solution to safeguard against evolving cyber threats.
Funding: Funded by the Bavarian State Ministry of Science, Research and Art (Grant number: H.2-F1116.WE/52/2).
Abstract: To address the widespread data shortage problem in battery research, this paper proposes a generative adversarial network model that is combined with deep convolutional networks, the Wasserstein distance, and the gradient penalty to achieve data augmentation. To lower the threshold for implementing the proposed method, transfer learning is further introduced, forming the W-DC-GAN-GP-TL framework. This framework is evaluated on 3 different publicly available datasets to judge the quality of generated data. Through visual comparisons and the examination of two visualization methods (probability density function (PDF) and principal component analysis (PCA)), it is demonstrated that the generated data is hard to distinguish from the real data. The application of generated data for training a battery state model using transfer learning is further evaluated. Specifically, Bi-GRU-based and Transformer-based methods are implemented on 2 separate datasets for estimating state of health (SOH) and state of charge (SOC), respectively. The results indicate that the proposed framework demonstrates satisfactory performance in different scenarios: for the data replacement scenario, where real data are removed and replaced with generated data, the state estimator accuracy decreases only slightly; for the data enhancement scenario, the estimator accuracy is further improved. The estimation error of SOH and SOC is as low as 0.69% and 0.58% root mean square error (RMSE), respectively, after applying the proposed framework. This framework provides a reliable method for enriching battery measurement data. It is a generalized framework capable of generating a variety of time series data.
Funding: Fundamental Research Funds for the Central Universities, China (No. 2232021A-10); Shanghai Sailing Program, China (No. 22YF1401300); Natural Science Foundation of Shanghai, China (No. 20ZR1400400); Shanghai Pujiang Program, China (No. 22PJ1423400).
Abstract: Deep neural networks are extremely vulnerable to intentionally generated adversarial examples, which are created by overlaying tiny noise on clean images. However, most existing transfer-based attack methods add perturbations to each pixel of the original image with the same weight, resulting in redundant noise in the adversarial examples, which makes them easier to detect. In view of this, a novel attention-guided sparse adversarial attack strategy with gradient dropout, which can be readily incorporated into existing gradient-based methods, is introduced to minimize the intensity and the scale of perturbations while ensuring the effectiveness of adversarial examples. Specifically, in the gradient dropout phase, some relatively unimportant gradient information is randomly discarded to limit the intensity of the perturbation. In the attention-guided phase, the influence of each pixel on the model output is evaluated by using a soft mask-refined attention mechanism, and the perturbation of those pixels with smaller influence is limited to restrict the scale of the perturbation. Thorough experiments on the NeurIPS 2017 adversarial dataset and the ILSVRC 2012 validation dataset show that the proposed strategy can significantly diminish the superfluous noise present in adversarial examples while keeping their attack efficacy intact. For instance, in attacks on adversarially trained models, upon the integration of the strategy, the average level of noise injected into images declines by 8.32%, whereas the average attack success rate decreases by only 0.34%. Furthermore, the strategy can substantially elevate the attack success rate by introducing only a slight degree of perturbation.
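The two phases named in this abstract can be sketched as a single perturbation step. This is a loose illustration rather than the paper's method: gradient entries are dropped uniformly at random with probability `drop_prob`, and a hard top-fraction threshold on a given attention map stands in for the soft mask-refined attention mechanism. The function name, `eps`, `drop_prob`, `keep_frac`, and the sign-gradient update are all assumptions made for the example.

```python
import numpy as np

def sparse_attack_step(image, grad, attention, eps=8 / 255,
                       drop_prob=0.3, keep_frac=0.5, rng=None):
    """One sign-gradient step restricted by gradient dropout and attention.

    image, grad, attention: arrays of the same shape, image values in [0, 1].
    """
    rng = np.random.default_rng() if rng is None else rng
    # Gradient dropout: randomly discard a share of gradient entries.
    keep = rng.random(grad.shape) >= drop_prob
    grad = grad * keep
    # Attention guidance (hard-threshold simplification): perturb only
    # the pixels whose attention lies in the top keep_frac fraction.
    thresh = np.quantile(attention, 1.0 - keep_frac)
    mask = attention >= thresh
    adv = image + eps * np.sign(grad) * mask
    return np.clip(adv, 0.0, 1.0)
```

Pixels that lose their gradient to dropout, or that fall below the attention threshold, are left untouched, which is how a step like this reduces both the intensity and the spatial extent of the noise.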
Abstract: Large Language Models (LLMs) have significantly advanced human-computer interaction by improving natural language understanding and generation. However, their vulnerability to adversarial prompts (carefully designed inputs that manipulate model outputs) presents substantial challenges. This paper introduces a classification-based approach to detect adversarial prompts by utilizing both prompt features and prompt response features. Eleven machine learning models were evaluated based on key metrics such as accuracy, precision, recall, and F1-score. The results show that the Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM) cascade model delivers the best performance, especially when using prompt features, achieving an accuracy of over 97% in all adversarial scenarios. Furthermore, the Support Vector Machine (SVM) model performed best with prompt response features, particularly excelling in prompt type classification tasks. Classification results revealed that certain types of adversarial attacks, such as "Word Level" and "Adversarial Prefix", were particularly difficult to detect, as indicated by their low recall and F1-scores. These findings suggest that more subtle manipulations can evade detection mechanisms. In contrast, attacks like "Sentence Level" and "Adversarial Insertion" were easier to identify, due to the model's effectiveness in recognizing inserted content. Natural Language Processing (NLP) techniques played a critical role by enabling the extraction of semantic and syntactic features from both prompts and their corresponding responses. These insights highlight the importance of combining traditional and deep learning approaches, along with advanced NLP techniques, to build more reliable adversarial prompt detection systems for LLMs.
Funding: Supported by the National Natural Science Foundation of China (U23A20595, 52034010, 52288101), the National Key Research and Development Program of China (2022YFE0203400), the Shandong Provincial Natural Science Foundation (ZR2024ZD17), and the Fundamental Research Funds for the Central Universities (23CX10004A).
Abstract: Existing imaging techniques cannot simultaneously achieve high resolution and a wide field of view, and manual multi-mineral segmentation in shale lacks precision. To address these limitations, we propose a comprehensive framework based on generative adversarial networks (GANs) for characterizing pore structure properties of shale, which incorporates image augmentation, super-resolution reconstruction, and multi-mineral auto-segmentation. Using real 2D and 3D shale images, the framework was assessed through correlation function, entropy, porosity, pore size distribution, and permeability. The application results show that this framework enables the enhancement of 3D low-resolution digital cores by a scale factor of 8 without paired shale images, effectively reconstructing the fine-scale pores that are unresolved at low resolution, rather than merely denoising, deblurring, and clarifying edges. The trained GAN-based segmentation model effectively improves manual multi-mineral segmentation results, resulting in a strong resemblance to real samples in terms of pore size distribution and permeability. This framework significantly improves the characterization of complex shale microstructures and can be extended to other heterogeneous porous media, such as carbonate, coal, and tight sandstone reservoirs.
Funding: Supported by the Ongoing Research Funding Program (ORF-2025-488), King Saud University, Riyadh, Saudi Arabia.
Abstract: Effectively handling imbalanced datasets remains a fundamental challenge in computational modeling and machine learning, particularly when class overlap significantly deteriorates classification performance. Traditional oversampling methods often generate synthetic samples without considering density variations, leading to redundant or misleading instances that exacerbate class overlap in high-density regions. To address these limitations, we propose Wasserstein Generative Adversarial Network Variational Density Estimation (WGAN-VDE), a computationally efficient density-aware adversarial resampling framework that enhances minority class representation while strategically reducing class overlap. The originality of WGAN-VDE lies in its density-aware sample refinement, ensuring that synthetic samples are positioned in underrepresented regions, thereby improving class distinctiveness. By applying structured feature representation, targeted sample generation, and density-based selection mechanisms, the proposed framework ensures the generation of well-separated and diverse synthetic samples, improving class separability and reducing redundancy. The experimental evaluation on 20 benchmark datasets demonstrates that this approach outperforms 11 state-of-the-art rebalancing techniques, achieving superior results in F1-score, Accuracy, G-Mean, and AUC metrics. These results establish the proposed method as an effective and robust computational approach, suitable for diverse engineering and scientific applications involving imbalanced data classification and computational modeling.
Funding: Funded by the Centre for Advanced Modelling and Geospatial Information Systems (CAMGIS), Faculty of Engineering and IT, University of Technology Sydney. Also supported by the Researchers Supporting Project, King Saud University, Riyadh, Saudi Arabia, under Ongoing Research Funding (ORF-2025-14).
Abstract: The development of generative architectures has resulted in numerous novel deep-learning models that generate images using text inputs. However, humans naturally use speech for visualization prompts. Therefore, this paper proposes an architecture that integrates speech prompts as input to an image-generation Generative Adversarial Network (GAN) model, leveraging Speech-to-Text translation along with the CLIP+VQGAN model. The proposed method involves translating speech prompts into text, which is then used by the Contrastive Language-Image Pretraining (CLIP) + Vector Quantized Generative Adversarial Network (VQGAN) model to generate images. This paper outlines the steps required to implement such a model and describes in detail the methods used for evaluating the model. The GAN model successfully generates artwork from descriptions using speech and text prompts. Experimental outcomes of synthesized images demonstrate that the proposed methodology can produce beautiful abstract visuals containing elements from the input prompts. The model achieved a Frechet Inception Distance (FID) score of 28.75, showcasing its capability to produce high-quality and diverse images. The proposed model can find numerous applications in educational, artistic, and design spaces due to its ability to generate images using speech and the distinct abstract artistry of the output images. This capability is demonstrated by giving the model out-of-the-box prompts to generate never-before-seen images with plausible realistic qualities.
Abstract: In recent work, adversarial stickers are widely used to attack face recognition (FR) systems in the physical world. However, it is difficult to evaluate the performance of physical attacks because of the lack of volunteers in the experiment. In this paper, a simple attack method called incomplete physical adversarial attack (IPAA) is proposed to simulate physical attacks. Unlike a physical attack, an IPAA embeds a photo of the adversarial sticker into a facial image and uses the result as the input to attack FR systems, which yields results similar to those of physical attacks without inviting any volunteers. The results show that IPAA has a higher similarity with physical attacks than digital attacks do, indicating that IPAA is able to evaluate the performance of physical attacks. IPAA is effective in quantitatively measuring the impact of the sticker location on the results of attacks.
Funding: Supported by King Saud University, Riyadh, Saudi Arabia, through the Researchers Supporting Project under Grant RSP2025R493.
Abstract: With rapid advancements in AI-driven facial manipulation techniques, particularly deepfake technology, there is growing concern over its potential misuse. Deepfakes pose a significant threat to society, particularly by infringing on individuals' privacy. Despite significant efforts to build systems for identifying deepfakes, existing methodologies often struggle to adapt to new forgery techniques and are more vulnerable to variations in image and video clarity, which hinders their broad applicability to images and videos produced by unfamiliar technologies. In this manuscript, we advocate robust training strategies to improve generalization capabilities. In adversarial training, models are trained using samples deliberately crafted to deceive classification systems, thereby significantly enhancing their generalization ability. In response to this challenge, we propose an innovative hybrid adversarial training framework integrating Virtual Adversarial Training (VAT) with Two-Generated Blurred Adversarial Training. This combined framework bolsters the model's resilience in detecting deepfakes made using unfamiliar deep learning technologies. Through such adversarial training, models are prompted to acquire more versatile attributes. Through experimental studies, we demonstrate that our model achieves higher accuracy than existing models.
Funding: Supported by State Grid Corporation Technology Project (No. 522437250003).
Abstract: Hydrogen energy is a crucial support for China’s low-carbon energy transition. With the large-scale integration of renewable energy, the combination of hydrogen and integrated energy systems has become one of the most promising directions of development. This paper proposes an optimized scheduling model for a hydrogen-coupled electro-heat-gas integrated energy system (HCEHG-IES) using generative adversarial imitation learning (GAIL). The model aims to enhance renewable-energy absorption, reduce carbon emissions, and improve grid-regulation flexibility. First, the optimal scheduling problem of HCEHG-IES under uncertainty is modeled as a Markov decision process (MDP). To overcome the limitations of conventional deep reinforcement learning algorithms, including long optimization time, slow convergence, and subjective reward design, this study augments the PPO algorithm by incorporating a discriminator network and expert data. The resulting algorithm, termed GAIL, enables the agent to perform imitation learning from expert data. Based on this model, dynamic scheduling decisions are made in continuous state and action spaces, generating optimal energy-allocation and management schemes. Simulation results indicate that, compared with traditional reinforcement-learning algorithms, the proposed algorithm offers better economic performance. Guided by expert data, the agent avoids blind optimization, shortens the offline training time, and improves convergence performance. In the online phase, the algorithm enables flexible energy utilization, thereby promoting renewable-energy absorption and reducing carbon emissions.
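The role of the discriminator can be illustrated with a small sketch. It is not the paper's network: a linear logistic discriminator over state-action feature vectors is assumed for brevity, the class name `LinearDiscriminator` is hypothetical, and the labeling convention (expert = 1, policy = 0, reward = -log(1 - D)) is one common choice that the abstract does not confirm.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class LinearDiscriminator:
    """Hypothetical stand-in for a GAIL discriminator D(s, a)."""

    def __init__(self, dim):
        self.w = [0.0] * dim
        self.b = 0.0

    def prob_expert(self, x):
        """Probability that feature vector x came from the expert data."""
        return sigmoid(sum(wi * xi for wi, xi in zip(self.w, x)) + self.b)

    def update(self, expert_batch, policy_batch, lr=0.1):
        """One gradient step on binary cross-entropy (expert=1, policy=0)."""
        samples = [(x, 1.0) for x in expert_batch] + [(x, 0.0) for x in policy_batch]
        grad_w = [0.0] * len(self.w)
        grad_b = 0.0
        for x, y in samples:
            err = self.prob_expert(x) - y  # d(BCE)/d(logit)
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        n = len(samples)
        self.w = [wi - lr * g / n for wi, g in zip(self.w, grad_w)]
        self.b -= lr * grad_b / n

    def reward(self, x, eps=1e-8):
        """Surrogate reward passed to the policy optimizer (e.g. PPO)."""
        return -math.log(1.0 - self.prob_expert(x) + eps)
```

Because the reward comes from the discriminator rather than a hand-written function, the policy is pushed toward expert-like state-action pairs, which is how this family of methods sidesteps subjective reward design.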
Funding: Funded by the Natural Science Foundation of Jiangsu Province (Program BK20240699) and the National Natural Science Foundation of China (Program 62402228).
Abstract: This article proposes an adversarial attack method, AMA (Adaptive Multimodal Attack), which introduces an adaptive feedback mechanism by dynamically adjusting the perturbation strength. Specifically, AMA adjusts the perturbation amplitude based on task complexity and optimizes the perturbation direction in real time based on the gradient direction to enhance attack efficiency. Experimental results demonstrate that AMA raises attack success rates from approximately 78.95% to 89.56% on visual question answering and from 78.82% to 84.96% on visual reasoning tasks across representative vision-language benchmarks. These findings demonstrate AMA’s superior attack efficiency and reveal the vulnerability of current vision-language models to carefully crafted adversarial examples, underscoring the need to enhance their robustness.
Funding: Supported by the Glocal University 30 Project Fund of Gyeongsang National University in 2025.
Abstract: Adversarial attacks pose a significant threat to artificial intelligence systems by exploiting vulnerabilities in deep learning models. Existing defense mechanisms often suffer from drawbacks such as the need for model retraining, significant inference-time overhead, and limited effectiveness against specific attack types. Achieving perfect defense against adversarial attacks remains elusive, emphasizing the importance of mitigation strategies. In this study, we propose a defense mechanism that applies random cropping and Gaussian filtering to input images to mitigate the impact of adversarial attacks. First, the image was randomly cropped to vary its dimensions and then placed at the center of a fixed 299×299 space, with the remaining areas filled with zero padding. Subsequently, Gaussian filtering with a 7×7 kernel and a standard deviation of two was applied using a convolution operation. Finally, the smoothed image was fed into the classification model. The proposed defense method consistently appeared in the upper-right region across all attack scenarios, demonstrating its ability to preserve classification performance on clean images while significantly mitigating adversarial attacks. This visualization confirms that the proposed method is effective and reliable for defending against adversarial perturbations. Moreover, the proposed method incurs minimal computational overhead, making it suitable for real-time applications. Furthermore, owing to its model-agnostic nature, the proposed method can be easily incorporated into various neural network architectures, serving as a fundamental module for adversarial defense strategies.
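The three preprocessing steps above (random crop, centered zero padding, 7×7 Gaussian filtering with a standard deviation of two) can be sketched in dependency-free Python. Several details are assumptions, since the abstract does not give them: the crop-size range (`min_frac`), zero padding at the image border during filtering, single-channel input, and all function names. The 7×7 kernel is applied as two 1-D passes, which is equivalent because a Gaussian kernel is separable.

```python
import math
import random


def gaussian_kernel_1d(size=7, sigma=2.0):
    """Normalized 1-D Gaussian kernel of odd length `size`."""
    half = size // 2
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma)) for i in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def random_crop_and_pad(image, canvas=299, min_frac=0.75, rng=random):
    """Randomly crop a 2-D image and center it on a zero-filled canvas.

    `min_frac` (smallest crop side as a fraction of the largest allowed
    side) is an assumed parameter; the abstract does not state the range.
    """
    h, w = len(image), len(image[0])
    max_h, max_w = min(h, canvas), min(w, canvas)
    crop_h = rng.randint(max(1, int(min_frac * max_h)), max_h)
    crop_w = rng.randint(max(1, int(min_frac * max_w)), max_w)
    top = rng.randint(0, h - crop_h)
    left = rng.randint(0, w - crop_w)
    out = [[0.0] * canvas for _ in range(canvas)]
    off_y, off_x = (canvas - crop_h) // 2, (canvas - crop_w) // 2
    for dy in range(crop_h):
        out[off_y + dy][off_x:off_x + crop_w] = image[top + dy][left:left + crop_w]
    return out


def blur_rows(image, kernel):
    """Convolve each row with `kernel`, treating out-of-range pixels as zero."""
    half = len(kernel) // 2
    width = len(image[0])
    out = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, weight in enumerate(kernel):
                xx = x + k - half
                if 0 <= xx < width:
                    acc += weight * row[xx]
            new_row.append(acc)
        out.append(new_row)
    return out


def transpose(image):
    return [list(col) for col in zip(*image)]


def gaussian_blur(image, size=7, sigma=2.0):
    """Separable Gaussian filter: a row pass followed by a column pass."""
    kernel = gaussian_kernel_1d(size, sigma)
    rows_done = blur_rows(image, kernel)
    return transpose(blur_rows(transpose(rows_done), kernel))


def defend(image, canvas=299, rng=random):
    """Crop, pad, and smooth an image before it reaches the classifier."""
    return gaussian_blur(random_crop_and_pad(image, canvas, rng=rng))
```

Nothing here touches the classifier itself, which matches the model-agnostic claim: `defend` only transforms the input. For real workloads a vectorized array library would be the natural replacement for these nested loops.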