Abstract
Data-driven deep learning models cannot cover all possible sample data during training and are therefore vulnerable to carefully crafted adversarial examples. Existing mainstream attack methods, based on L_p-norm perturbations of RGB pixel values, achieve high attack success rates and good transferability, but the adversarial examples they generate contain high-frequency noise that is easily perceived by the human eye. Attack methods based on diffusion models balance transferability and imperceptibility, but their optimization strategies are designed mainly from the perspective of the attacked model and lack an in-depth exploration and analysis of transferability and imperceptibility from the perspective of the surrogate model. To further explore the sources of control over transferability and imperceptibility, this paper proposes a new adversarial example generation method based on a latent diffusion model, within the framework of surrogate-model-based attacks. In this method, under a basic adversarial loss constraint, a transferable attention constraint loss and an imperceptible consistency constraint loss are designed to balance transferability and imperceptibility. Compared with existing methods on three public datasets, ImageNet-Compatible, CUB-200-2011, and Stanford Cars, the adversarial examples generated by the proposed method exhibit strong cross-model transfer attack capability while their perturbations remain hard for the human eye to perceive.
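The abstract names three loss terms but gives no formulas. As a rough illustration only, the sketch below combines a basic adversarial loss with a transferable attention constraint and an imperceptible consistency constraint on the diffusion latent; every concrete choice (cross-entropy and MSE as the loss forms, the weights lambda_att and lambda_con, and all function arguments) is a hypothetical placeholder, not the paper's definition.

```python
import torch
import torch.nn.functional as F

# Minimal sketch of the three-part objective named in the abstract.
# The concrete forms (cross-entropy for the adversarial loss, MSE on
# surrogate attention maps and on latent codes for the two constraints)
# and the weights lambda_att / lambda_con are illustrative assumptions.

def total_loss(logits, label, attn_adv, attn_clean, z_adv, z_clean,
               lambda_att=1.0, lambda_con=1.0):
    # Basic adversarial loss (untargeted): minimizing the negative
    # cross-entropy pushes the surrogate model away from the true label.
    l_adv = -F.cross_entropy(logits, label)

    # Transferable attention constraint: drive the adversarial image's
    # surrogate attention map away from the clean one, so the perturbation
    # disrupts regions that many models attend to (aids transferability).
    l_att = -F.mse_loss(attn_adv, attn_clean)

    # Imperceptible consistency constraint: keep the adversarial latent
    # close to the clean latent (aids imperceptibility).
    l_con = F.mse_loss(z_adv, z_clean)

    return l_adv + lambda_att * l_att + lambda_con * l_con
```

In an actual attack loop one would differentiate this loss with respect to the latent code and update it iteratively; the abstract does not specify the optimizer, weights, or schedule.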
Authors
KANG Kai; WANG Jiabao; XU Kun (College of Command and Control Engineering, Army Engineering University of PLA, Nanjing 210007, China)
Source
Computer Science (计算机科学)
Peking University core journal (北大核心)
2025, No. 6, pp. 381-389 (9 pages)
Funding
Natural Science Foundation of Jiangsu Province (BK20200581).
Keywords
Adversarial attacks
Diffusion model
Transferability
Imperceptibility
Attention mechanism