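Abstract: To reduce the dependence of object detection for pomelos and other fruits on large volumes of annotated data, this paper proposes a fractal-tree image generation and augmentation method for pomelo that incorporates vision-language models. The method requires only 3 to 5 unlabeled real images and, without any training, generates a large-scale annotated training dataset. First, Grounded SAM (Grounded Segment Anything Model), a text-prompted zero-shot segmentation model, extracts pomelo tree components; next, Stable Diffusion generates random backgrounds from text prompts; finally, an improved fractal tree algorithm generates pomelo trees to increase diversity and realism. The method was validated with the lightweight version of YOLO v10. On a self-built pomelo detection dataset collected in unstructured environments, with 0, 8, 16, 32, and 64 real images in the training set, the method improved the model's mean average precision over intersection-over-union thresholds from 0.50 to 0.95 (mAP50-95) by 662.3%, 24.9%, 13.7%, 8.8%, and 1.8%, respectively. With 221 real images and 512 generated images in the training set, the model reached its best performance: precision 76.9%, recall 62.7%, mAP50 70.3%, and mAP50-95 38.4%. When the method was transferred to an orange detection task, the improvements at the same data scales were 212.9%, 16.5%, 14.0%, 5.2%, and 4.1%, respectively. With 1302 real images and 512 generated images in the training set, the model likewise reached its best performance: precision 90.3%, recall 87.8%, mAP50 94.0%, and mAP50-95 54.0%. The results show that the image generation and augmentation method effectively expands training data in zero-shot and few-shot learning scenarios, improves the detection performance of the lightweight YOLO v10, and generalizes well.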
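The final step of the pipeline, growing a tree procedurally so that every fruit position is known and labels come for free, can be illustrated with a plain recursive fractal tree. The sketch below is not the paper's improved algorithm, which the abstract does not describe in detail. The branching angles, shrink factors, fruit radius, and the names `grow_tree`, `yolo_labels`, and `generate_sample` are all assumptions made for illustration. It covers geometry and labels only; pasting segmented components onto generated backgrounds is left out.

```python
import math
import random


def grow_tree(x, y, angle, length, depth, rng, fruits, segments):
    """Recursively grow branches and record a fruit at each terminal tip."""
    x_end = x + length * math.cos(angle)
    y_end = y - length * math.sin(angle)  # image y axis points down
    segments.append((x, y, x_end, y_end))
    if depth == 0:
        fruits.append((x_end, y_end))
        return
    for side in (-1, 1):
        # Assumed ranges: randomising spread and shrink gives varied trees.
        spread = math.radians(rng.uniform(15, 40))
        shrink = rng.uniform(0.6, 0.8)
        grow_tree(x_end, y_end, angle + side * spread, length * shrink,
                  depth - 1, rng, fruits, segments)


def yolo_labels(fruits, radius, width, height, class_id=0):
    """Convert fruit centres to YOLO rows (class, cx, cy, w, h), normalised."""
    rows = []
    for cx, cy in fruits:
        # Clip each box to the canvas before normalising.
        x0, y0 = max(0.0, cx - radius), max(0.0, cy - radius)
        x1, y1 = min(float(width), cx + radius), min(float(height), cy + radius)
        if x1 <= x0 or y1 <= y0:
            continue  # fruit lies entirely outside the canvas
        rows.append((class_id,
                     (x0 + x1) / 2 / width, (y0 + y1) / 2 / height,
                     (x1 - x0) / width, (y1 - y0) / height))
    return rows


def generate_sample(width=640, height=640, depth=4, seed=0):
    """Return branch segments and YOLO labels for one synthetic tree."""
    rng = random.Random(seed)
    fruits, segments = [], []
    # The trunk starts at the bottom centre and points straight up.
    grow_tree(width / 2, height, math.pi / 2, height * 0.25,
              depth, rng, fruits, segments)
    return segments, yolo_labels(fruits, radius=20, width=width, height=height)


if __name__ == "__main__":
    segments, labels = generate_sample(seed=0)
    print(len(segments), "branch segments,", len(labels), "labelled fruits")
```

Because the generator places each fruit itself, the bounding boxes need no manual annotation. Changing `seed` gives a different tree with a matching label set.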
Funding: Supported in part by the Natural Science Foundation of the Anhui Higher Education Institutions of China (Nos. 2023AH040149 and 2022AH050310), the Anhui Provincial Natural Science Foundation (No. 2208085MF168), the Science and Technology Innovation Program of Maanshan, China (No. 2021a120009), and the National Natural Science Foundation of China (Nos. 52205548, 62206006, and 62306007).
Abstract: Deep neural networks are widely used in computer vision tasks, but they are vulnerable to adversarial samples, which degrade recognition accuracy. Although traditional algorithms for crafting adversarial samples attack classification models effectively, their attack performance degrades against object detection models, which have more complex structures. To address this issue, we first analyze the multi-scale feature extraction mechanism of object detection models and then propose a novel adversarial sample generation algorithm for attacking detection models, built from an object feature-wise attention module and a perturbation extraction module. In the first module, we compute the noise distribution in the object region from the multi-scale feature map, which reduces the range of the perturbation and improves the stealthiness of the adversarial samples. In the second module, we feed the noise distribution into generative adversarial networks to generate adversarial perturbations with strong attack transferability. As a result, the proposed approach is better able to confuse the judgment of detection models. Experiments on the DroneVehicle dataset show that our method is computationally efficient and attacks detection models effectively, as measured by both qualitative and quantitative analysis.
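The idea of reducing the range of the perturbation to the object region can be shown with a simple mask. The sketch below is not the paper's attention module or its generative network. It assumes the object regions are already given as pixel boxes, uses an arbitrary noise grid in place of a learned perturbation, and bounds the change per pixel by a hypothetical budget `eps`. The function names are illustrative only.

```python
def box_mask(height, width, boxes):
    """Return a height x width grid of 0/1 flags marking pixels inside any box.

    Each box is (x0, y0, x1, y1) in pixels, with x1 and y1 exclusive.
    """
    mask = [[0] * width for _ in range(height)]
    for x0, y0, x1, y1 in boxes:
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                mask[y][x] = 1
    return mask


def apply_masked_perturbation(image, noise, mask, eps):
    """Add noise only inside the mask, clipped to [-eps, eps] per pixel.

    Pixel values are kept in the valid range [0, 255].
    """
    out = []
    for img_row, noise_row, mask_row in zip(image, noise, mask):
        row = []
        for pixel, delta, inside in zip(img_row, noise_row, mask_row):
            delta = max(-eps, min(eps, delta)) if inside else 0
            row.append(max(0, min(255, pixel + delta)))
        out.append(row)
    return out


if __name__ == "__main__":
    # A flat grey 4 x 4 image with one 2 x 2 object box in the middle.
    image = [[100] * 4 for _ in range(4)]
    noise = [[50] * 4 for _ in range(4)]
    mask = box_mask(4, 4, [(1, 1, 3, 3)])
    for row in apply_masked_perturbation(image, noise, mask, eps=8):
        print(row)
```

Pixels outside the boxes are left untouched, which is what keeps the perturbation less visible than noise spread over the whole image.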