Abstract
To address large variations in target scale, target occlusion, and large parameter counts in object detection for unmanned aerial vehicle (UAV) aerial images, the context feature enhancement-you only look once version 11 (CFE-YOLOv11) model was proposed. First, a lightweight downsampling convolution (LDC) module was designed, which enhanced feature information through dual-path downsampling, used channel shuffling to improve cross-channel information interaction, and reduced the number of model parameters. Second, a convolutional three-scale kernel-adaptive dual-path separated and enhancement attention (C3k2-SEA) module was designed to improve the feature extraction capability of the model. In addition, a multi-branch spatial and channel attention (MSCA) module was proposed to enhance the model's ability to extract features of targets at different scales. Finally, a dual loss optimization (DLO) module was used to optimize detection performance through a differentiated gradient gain allocation mechanism. The results showed that, compared with the YOLOv11 model, the CFE-YOLOv11 model improved mean average precision (mAP) at an intersection over union (IoU) threshold of 0.50 by 2.1 and 2.0 percentage points on the visual drones (VisDrone) and remote sensing object detection (RSOD) datasets, respectively, while the number of parameters was reduced by 15.4%. The CFE-YOLOv11 model not only improved detection accuracy on aerial images but also reduced missed and false detections, providing an efficient solution for accurate detection of targets in aerial images.
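For readers unfamiliar with the metric, the IoU threshold of 0.50 quoted in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function names `iou` and `is_true_positive` and the `(x1, y1, x2, y2)` corner format for boxes are assumptions made for illustration only. Under the usual convention, a predicted box counts as a correct detection when its IoU with a ground-truth box reaches the threshold.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are assumed to be (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union = sum of the two areas minus the overlap counted twice.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.50):
    """Hypothetical helper: does the prediction match at the given IoU threshold?"""
    return iou(pred_box, gt_box) >= threshold
```

Mean average precision at this threshold is then obtained by averaging per-class average precision, where matches are decided by a rule of this kind.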
Authors
顾成杰
彭俊铭
朱东郡
张俊军
郑亚兵
GU Chengjie; PENG Junming; ZHU Dongjun; ZHANG Junjun; ZHENG Yabing (School of Public Safety and Emergency Management, Anhui University of Science and Technology, Hefei 231100, China; School of Computer Science and Engineering, Anhui University of Science and Technology, Huainan 232001, China)
Source
《湖北民族大学学报(自然科学版)》
2025, Issue 4, pp. 497-503 (7 pages)
Journal of Hubei Minzu University: Natural Science Edition
Funding
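National Natural Science Foundation of China, Major Scientific Research Instrument Development Project (52227901)
National Key Research and Development Program of China (2022YFB2901305)
Ministry of Science and Technology, National Key Research and Development Program (Yangtze River Delta Science and Technology Innovation Community Joint Research Special Project) (2023CSJGG1103)
Scientific Research Project of Higher Education Institutions of Anhui Province (2023AH051197)
Anhui University of Science and Technology Research Start-up Fund for Introduced Talents (2023yjrc33)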
Keywords
unmanned aerial vehicle
aerial images
object detection
feature extraction
attention mechanism