Remote sensing imagery, because it is captured from high altitude, poses inherent challenges: objects appear at multiple scales, targets occupy small areas, and backgrounds are intricate. These traits often lead to higher missed-detection and false-detection rates when object recognition algorithms are applied to remote sensing imagery. They also cause inaccurate target localization and hinder precise target categorization. This paper addresses these challenges by proposing the YOLO-MFD model (YOLO-MFD: Remote Sensing Image Object Detection with Multi-scale Fusion Dynamic Head). Before presenting the method, we examine the prevalent issues in remote sensing imagery analysis, in particular the difficulty existing object recognition algorithms have in comprehensively capturing critical image features across varying scales and complex backgrounds. To resolve these issues, we introduce three improvements. First, we propose a lightweight multi-scale module called CEF. By merging multi-scale feature information, this module improves the model's ability to capture important image features and addresses the missed detections and false alarms that are common in remote sensing imagery. Second, an additional layer of small-target detection heads is added, and a residual link is established with the higher-level feature extraction module in the backbone. This allows the model to incorporate shallower information, improving the accuracy of target localization in remote sensing images. Finally, a dynamic head attention mechanism is introduced, which gives the model greater flexibility and accuracy in recognizing targets of different shapes and sizes and thereby improves detection precision. Experimental results show that YOLO-MFD improves on the original YOLOv8 model by 6.3%, 3.5%, and 2.5% in Precision, mAP@0.5, and mAP@0.5:0.95, respectively. These results illustrate the clear advantages of the method.
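The metrics reported above depend on an IoU threshold for deciding when a prediction counts as a correct detection. The following is a minimal sketch of that matching step, not the evaluation code used in the paper: the function names, the (x1, y1, x2, y2) box format, and the greedy score-ordered matching are assumptions made for illustration. It computes precision and recall at one threshold; mAP additionally averages the area under the precision-recall curve over classes, and mAP@0.5:0.95 conventionally averages over the ten thresholds listed at the end.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def precision_recall(preds, gts, thr=0.5):
    """Precision and recall for one class in one image.

    preds: list of (score, box); gts: list of boxes.
    Predictions are visited in descending score order, and each one is
    matched to the best still-unmatched ground-truth box with IoU >= thr.
    """
    matched = set()
    tp = 0
    for _, box in sorted(preds, key=lambda p: p[0], reverse=True):
        best_j, best_iou = -1, thr
        for j, gt in enumerate(gts):
            if j in matched:
                continue
            value = iou(box, gt)
            if value >= best_iou:
                best_j, best_iou = j, value
        if best_j >= 0:
            matched.add(best_j)
            tp += 1
    precision = tp / len(preds) if preds else 0.0
    recall = tp / len(gts) if gts else 0.0
    return precision, recall


# IoU thresholds conventionally averaged for mAP@0.5:0.95.
IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]
```

A prediction whose IoU with the ground truth is exactly 0.5 counts as a hit at threshold 0.5 but as a false detection at 0.55, which is why mAP@0.5:0.95 is the stricter of the two figures.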
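Existing fire and smoke detection methods rely mainly on on-site patrols by staff, which are inefficient and offer poor real-time performance. This paper therefore proposes YOLOv5s-MRD (YOLOv5s-MPDIoU-RevCol-Dyhead), an efficient fire and smoke detection algorithm for complex scenes based on YOLOv5s. First, the bounding-box loss function is improved with the MPDIoU (Maximized Position-Dependent Intersection over Union) method so that it handles both overlapping and non-overlapping bounding box regression (BBR), improving the accuracy and efficiency of BBR. Second, the backbone of YOLOv5s is rebuilt following the idea of the reversible column network RevCol (Reversible Column), giving it a multi-column architecture with reversible connections between different levels of the model, so that feature information is preserved as far as possible and the network's feature extraction capability improves. Finally, the Dynamic head detection head is introduced to unify scale awareness, spatial awareness, and task awareness, significantly improving the accuracy and effectiveness of the detection head without additional computational overhead. Experimental results show that on the DFS (Data of Fire and Smoke) dataset, compared with the original YOLOv5s algorithm, the proposed algorithm improves mean average precision (mAP@0.5) by 9.3%, prediction accuracy by 6.6%, and recall by 13.8%. The proposed algorithm can therefore meet the requirements of current fire and smoke detection applications.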
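The abstract does not give the MPDIoU formula. As a rough illustration, the sketch below assumes the commonly cited corner-distance formulation: IoU minus the squared distances between the two top-left corners and the two bottom-right corners, each normalized by the squared diagonal of the input image. The function name, the (x1, y1, x2, y2) box format, and the img_w and img_h parameters are assumptions for this example, not details taken from the paper.

```python
def mpdiou_loss(pred, gt, img_w, img_h):
    """Sketch of an MPDIoU-style loss for boxes given as (x1, y1, x2, y2).

    Assumed formulation: loss = 1 - (IoU - d1/norm - d2/norm), where d1 and
    d2 are the squared distances between matching corners and norm is the
    squared diagonal of the input image.
    """
    ix1, iy1 = max(pred[0], gt[0]), max(pred[1], gt[1])
    ix2, iy2 = min(pred[2], gt[2]), min(pred[3], gt[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_g = (gt[2] - gt[0]) * (gt[3] - gt[1])
    union = area_p + area_g - inter
    iou = inter / union if union > 0 else 0.0

    d1 = (pred[0] - gt[0]) ** 2 + (pred[1] - gt[1]) ** 2  # top-left corners
    d2 = (pred[2] - gt[2]) ** 2 + (pred[3] - gt[3]) ** 2  # bottom-right corners
    norm = img_w ** 2 + img_h ** 2
    return 1.0 - (iou - d1 / norm - d2 / norm)
```

With plain IoU, every pair of non-overlapping boxes gives the same loss of 1, so the regression receives no signal about how far apart they are. Under the formulation assumed here, the corner-distance terms keep growing as the boxes move apart, which is consistent with the abstract's claim that the loss suits both overlapping and non-overlapping boxes.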
Tea, a globally cultivated crop renowned for its unique flavor profile and health-promoting properties, ranks among the most favored functional beverages worldwide. However, diseases severely jeopardize the production and quality of tea leaves, leading to significant economic losses. While early and accurate identification coupled with the removal of infected leaves can mitigate widespread infection, manual leaf removal remains time-consuming and expensive. Using robots for pruning can significantly enhance efficiency and reduce costs. However, the accuracy of object detection directly impacts the overall efficiency of pruning robots. In complex tea plantation environments, cluttered image backgrounds, overlapping and occluded leaves, and small, densely packed harmful leaves all introduce interference. Existing algorithms perform poorly in detecting small and densely packed targets. To address these challenges, this paper collects a dataset of 1108 images of harmful tea leaves and proposes the YOLO-DBD model. The model efficiently identifies harmful tea leaves in various poses against complex backgrounds, providing crucial guidance for the posture and obstacle avoidance of a robotic arm during pruning. The proposed improvements comprise the Cross Stage Partial with Deformable Convolutional Networks v2 (C2f-DCN) module, Bi-Level Routing Attention (BRA), Dynamic Head (DyHead), and the Focal Complete Intersection over Union (Focal-CIoU) loss function, which enhance the model's feature extraction, computation allocation, and perception capabilities. Compared with the baseline model YOLOv8s, mean Average Precision at IoU 0.5 (mAP0.5) increased by 6%, and floating-point operations (FLOPs) decreased by 3.3 G.
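The abstract names a Focal-CIoU loss but does not define it. The sketch below covers only the standard Complete IoU (CIoU) term, which adds a center-distance penalty and an aspect-ratio penalty to 1 - IoU. The focal reweighting is not specified in the abstract and is deliberately left out. The function name and the (x1, y1, x2, y2) box format are assumptions, and boxes are assumed to have positive width and height.

```python
import math


def ciou_loss(pred, gt):
    """Standard CIoU loss for non-degenerate boxes given as (x1, y1, x2, y2)."""
    wp, hp = pred[2] - pred[0], pred[3] - pred[1]
    wg, hg = gt[2] - gt[0], gt[3] - gt[1]

    # Plain IoU.
    iw = max(0.0, min(pred[2], gt[2]) - max(pred[0], gt[0]))
    ih = max(0.0, min(pred[3], gt[3]) - max(pred[1], gt[1]))
    inter = iw * ih
    iou = inter / (wp * hp + wg * hg - inter)

    # Squared distance between box centers.
    rho2 = (((pred[0] + pred[2]) - (gt[0] + gt[2])) / 2) ** 2 \
         + (((pred[1] + pred[3]) - (gt[1] + gt[3])) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both.
    cw = max(pred[2], gt[2]) - min(pred[0], gt[0])
    ch = max(pred[3], gt[3]) - min(pred[1], gt[1])
    c2 = cw ** 2 + ch ** 2

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(wg / hg) - math.atan(wp / hp)) ** 2
    alpha = v / ((1 - iou) + v) if v > 0 else 0.0

    return 1 - iou + rho2 / c2 + alpha * v
```

For two boxes of the same shape the aspect-ratio term vanishes, so the loss reduces to 1 - IoU plus the normalized center distance.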
Funding: the Scientific Research Fund of Hunan Provincial Education Department (23A0423).