UAV-RGBT Multispectral Object Detection Dataset and Algorithm Benchmark (cited by: 1)

Abstract: Unmanned aerial vehicle (UAV)-based multispectral object detection using both visible (RGB) and thermal infrared (T) images enables all-day, all-weather target reconnaissance and has important military and civilian applications. However, because data acquisition and processing are complex, few UAV-view RGB-T multispectral object detection datasets are publicly available, which to some extent limits research on and application of such algorithms. Meanwhile, UAV operating scenarios are complex and variable: flight altitude, speed, focal length, and background change rapidly, so captured targets exhibit diverse scales, uneven dense/sparse distributions, and class imbalance, all of which make accurate detection challenging. Furthermore, time-critical applications such as target reconnaissance and traffic monitoring require real-time detection at high accuracy, so algorithm design must balance accuracy against speed. To address these issues, this paper constructs UAV-RGBT, a large-scale UAV-view RGB-T multispectral image dataset that spans seasons and day-night cycles and covers multiple categories and scales. UAV-RGBT comprises 20 categories, 5117 RGB-T image pairs, and more than 110,000 annotations, and is intended to advance research on UAV-view multispectral object detection algorithms. In addition, based on the YOLOv8n model, this paper proposes the UAV-based dual-branch multispectral object detection (UAV-DMDet) model, which promotes deep fusion of multispectral features through a multi-modal cross-attention fusion module and a multi-modal feature decomposition and combination module, and achieves a good balance among parameter count, detection speed, and detection accuracy. Experimental results show that, on the UAV-RGBT dataset, UAV-DMDet improves mAP@0.5 over the single-modality YOLOv8n model by 3.61% and 11.03% for the RGB and T modalities, respectively, and improves mAP@0.5:0.95 by 0.84% and 6.76%, respectively. On the DroneVehicle dataset, it outperforms the mainstream algorithm I2MDet by 2.66% in mAP@0.5 and by 12.36% in mAP@0.5:0.95. In terms of speed, with 640×640 input images, UAV-DMDet reaches 31 frames/s at FP32 precision on a single GeForce RTX 3090 GPU and 58 frames/s at FP16 precision on a Huawei Ascend 710 processor, making it suitable for real-time UAV-view RGB-T multispectral object detection.

Authors: WANG Jin-zhong; DAI Shun; ZHANG Xiu-wei; TIAN Xue-tao; XING Yin-hui; WANG Fang; YIN Han-lin; ZHANG Yan-ning (School of Computer Science, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China; Xi'an ASN Technology Group Co., Ltd., Xi'an, Shaanxi 710065, China; Shenzhen Research Institute, Northwestern Polytechnical University, Shenzhen, Guangdong 518063, China)

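Affiliations: School of Computer Science, Northwestern Polytechnical University; Xi'an ASN Technology Group Co., Ltd.; Shenzhen Research Institute, Northwestern Polytechnical University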

Source: Acta Electronica Sinica (Peking University Core Journal), 2025, No. 3, pp. 686-704 (19 pages)

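Funding: National Natural Science Foundation of China (No. 61971356); Natural Science Basic Research Program of Shaanxi Province (No. 2024JC-DXWT-07, No. 2024JCYBQN-0719); Key Research and Development Program of Shaanxi Province (No. 2023-YBGY-012); Guangdong Basic and Applied Basic Research Foundation (No. 2024A1515030186).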

Keywords: unmanned aerial vehicle (UAV); visible and thermal infrared (RGB-T); multispectral object detection; dataset; multi-modal feature fusion; YOLOv8

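Classification numbers: TP389.1 [Automation and Computer Technology: Computer System Architecture]; TP391.4 [Automation and Computer Technology: Computer Application Technology]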

