Abstract
【Objective】To address the challenge of detecting multi-scale objects against complex backgrounds in unmanned aerial vehicle (UAV) aerial images, this paper proposes a lightweight detection model based on an improved YOLOv10 that enhances UAV detection performance for multi-scale objects in complex backgrounds.
【Methods】Building on YOLOv10, the paper first introduces the Multichannel Inverted Residual (MCIR) block, which combines a multichannel processing strategy with an Inverted Residual (IR) structure to strengthen feature extraction and fusion for multi-scale objects in complex backgrounds. Second, the Convolutional Block Attention Module (CBAM) is incorporated into the Cross Stage Partial (CSP) Bottleneck with 2 Convolutions (C2f) module to form C2f-CM, which improves the localization of target features in complex backgrounds. Third, the backbone and neck networks of YOLOv10 are made lightweight using MCIR and C2f-CM: the backbone uses fewer downsampling operations to retain more feature information, and the neck optimizes its upsampling and feature concatenation operations to reduce the number of layers and the network complexity, further lowering computational overhead. Finally, the original loss function is replaced with Focal-EIoU, which matches predicted boxes to ground-truth boxes more accurately and effectively mitigates sample imbalance in the dataset.
【Results】Experimental results show that the improved Lightweight Enhanced Detection (LED)-YOLOv10 model raises mAP50 (mean average precision at an IoU threshold of 0.5) by 9.8 percentage points over the original YOLOv10, reaching 44.5%. The parameter count and model size are reduced by 65.68% and 51.60%, to 0.927 M parameters and 2.700 MB, respectively. Ablation experiments further validate the effectiveness of the MCIR, C2f-CM, and Focal-EIoU improvements, showing that they raise detection accuracy while substantially reducing model complexity. Comparative experiments show that LED-YOLOv10 achieves the best overall performance in multi-scale scenarios with complex backgrounds, with higher detection accuracy and a smaller model size than the other object detection algorithms compared, and its detection speed meets the requirements of real-time detection. A comprehensive performance evaluation of the lightweight model on the Jetson Nano embedded device further indicates that LED-YOLOv10 is better suited to deployment on resource-constrained embedded platforms.
【Conclusion】The proposed LED-YOLOv10 model significantly improves the accuracy of multi-scale object detection against complex backgrounds in UAV aerial images while greatly reducing the model's parameter count and size. The experimental results and detection outputs validate the advantages of the proposed method for this task, providing an efficient and lightweight solution for UAV object detection.
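The abstract names Focal-EIoU as the replacement loss but does not give its formula. The following is a minimal sketch for a single pair of boxes, assuming the commonly cited formulation: the EIoU loss is 1 - IoU plus a center-distance penalty and separate width and height penalties, each normalized by the smallest enclosing box, and the result is weighted by IoU raised to a power gamma. The function name `focal_eiou_loss`, the corner box format, and the default `gamma=0.5` are illustrative assumptions, not details taken from the paper.

```python
def focal_eiou_loss(pred, target, gamma=0.5, eps=1e-9):
    """Focal-EIoU loss for one box pair given as (x1, y1, x2, y2).

    Sketch under assumptions: gamma and the exact weighting are not
    specified in the abstract and may differ from the authors' setup.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Widths and heights of the predicted and ground-truth boxes.
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared distance between box centers, normalized by the
    # squared diagonal of the enclosing box.
    dx = (px1 + px2) / 2.0 - (tx1 + tx2) / 2.0
    dy = (py1 + py2) / 2.0 - (ty1 + ty2) / 2.0
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Width and height differences, normalized by the enclosing box.
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)

    eiou = 1.0 - iou + center_term + width_term + height_term

    # Focal weighting: boxes with higher IoU contribute more to the loss.
    return (iou ** gamma) * eiou


if __name__ == "__main__":
    print(focal_eiou_loss((0, 0, 10, 10), (5, 5, 15, 15)))
```

With this weighting, a pair of boxes that do not overlap at all has IoU equal to zero and therefore contributes zero loss, so practical implementations may treat that case differently. In a training pipeline the same computation would normally be vectorized over tensors of boxes.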
Authors
HUANG Yi (黄毅)
ZHOU Chun (周纯)
LIU Xinjun (刘欣军)
CHEN Qing (陈庆)
Affiliations: Department of Information Engineering, Guangzhou Modern Information Engineering College, Guangzhou 510000, China; Shanghai Huaxun Network System Co., Ltd., Shanghai 201103, China; School of Computer and Electronic Information, Guangxi University, Nanning 530004, China
Source
Study on Optical Communications (《光通信研究》)
PKU Core Journal (北大核心)
2025, No. 5, pp. 41-48 (8 pages)
Funding
National Natural Science Foundation of China (Grant No. 62003104).