Abstract
In image adversarial attacks, white-box attacks against the target model usually achieve the best results, but in practice the target model's architecture is often unavailable, which makes improving the transferability of adversarial examples particularly important. To address this problem, a training method based on a generative adversarial network (GAN) is proposed to generate adversarial examples with strong transferability. The study finds that images themselves possess model-agnostic vulnerabilities, and generative methods attack precisely by exploiting this property. Unlike traditional methods, which fine-tune within the neighborhood of the original image, this method generates maximum-likelihood images from the distributions of other classes; these images are visually close to real images yet effectively mislead classifiers. During training, the generator produces adversarial examples while the discriminator judges whether their labels are correct, and the two are optimized jointly, continuously improving both the attack strength and the realism of the examples. Experiments show that the attack success rate of the generative adversarial examples on multiple models is significantly higher than that of traditional methods, with an average improvement of about 25%, demonstrating stronger cross-model generalization. These results indicate that generative adversarial attacks not only improve the practicality of black-box attacks but also reveal the pervasive vulnerability of deep models, suggesting directions for the design of future defense mechanisms.
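The abstract contrasts fine-tuning in the original image's neighborhood with generating images that have maximum likelihood under another class's distribution. As a hedged, minimal sketch (not the paper's implementation; the toy linear classifier, dimensions, and step size are illustrative assumptions), the core adversarial term of such a generator objective can be shown as a single gradient step on the input that increases the target class's likelihood under the classifier:

```python
import numpy as np

# Toy sketch of the adversarial term: maximize the target-class likelihood
# log p(y_target | x) under a classifier f, i.e. minimize its negative
# log-likelihood with respect to the input. In the full method described in
# the abstract, a generator G produces x_adv and a discriminator D additionally
# enforces realism; here only the classifier-likelihood term is illustrated.

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# toy 2-class linear classifier over 4-dimensional "images" (assumed weights)
W = rng.normal(size=(2, 4))

x = rng.normal(size=4)   # original image, nominally class 0
target = 1               # class the attack steers toward

def target_nll(x_in):
    """Negative log-likelihood of the target class under the classifier."""
    return -np.log(softmax(W @ x_in)[target])

# analytic gradient of -log softmax(Wx)[target] with respect to the input x
p = softmax(W @ x)
grad = -(W[target] - p @ W)

x_adv = x - 0.1 * grad   # one small descent step on the adversarial loss

# the step strictly increases the target class's likelihood
assert target_nll(x_adv) < target_nll(x)
```

A GAN-trained generator amortizes this optimization: instead of stepping per image, it learns a mapping whose outputs already minimize the target-class loss, with the discriminator's realism term keeping them visually close to real images.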
Authors
ZHANG Zhaoyang; SUN Fanghui; ZHANG Mingxu; SONG Wei; WANG Zhenbang; WANG Yingqi; ZHANG Keqing; WANG Shen (School of Cybersecurity, Harbin Institute of Technology, Harbin 150001, China; China Electronics Society, Beijing 100036, China; China Mobile IoT Co., Ltd., Chongqing 401336, China; State Grid Heilongjiang Electric Power Co., Ltd., Harbin 150090, China)
Source
Information Countermeasure Technology, 2025, No. 5, pp. 1-21 (21 pages)
Funding
National Defense Basic Scientific Research Program of China (JCKY2023603C043)
Heilongjiang Provincial Key Research and Development Program (2022ZX01C01)
Heilongjiang Provincial Natural Science Foundation (LH2024F023)
Keywords
generative adversarial attack
model transferability
black-box attack