Detecting oriented targets in remote sensing images amidst complex and heterogeneous backgrounds remains a formidable challenge in the field of object detection. Current frameworks for oriented detection modules are constrained by intrinsic limitations, including excessive computational and memory overheads, discrepancies between predefined anchors and ground truth bounding boxes, intricate training processes, and feature alignment inconsistencies. To overcome these challenges, we present ASL-OOD (Angle-based SIOU Loss for Oriented Object Detection), a novel, efficient, and robust one-stage framework tailored for oriented object detection. The ASL-OOD framework comprises three core components: the Transformer-based Backbone (TB), the Transformer-based Neck (TN), and the Angle-SIOU (Scylla Intersection over Union) based Decoupled Head (ASDH). By leveraging the Swin Transformer, the TB and TN modules offer several key advantages, such as the capacity to model long-range dependencies, preserve high-resolution feature representations, seamlessly integrate multi-scale features, and enhance parameter efficiency. These improvements empower the model to accurately detect objects across varying scales. The ASDH module further enhances detection performance by incorporating angle-aware optimization based on SIOU, ensuring precise angular consistency and bounding box coherence. This approach effectively harmonizes shape loss and distance loss during the optimization process, thereby significantly boosting detection accuracy. Comprehensive evaluations and ablation studies on standard benchmark datasets, with an mAP (mean Average Precision) of 80.16 percent on DOTA, 91.07 percent on HRSC2016, 85.45 percent on MAR20, and 39.7 percent on UAVDT, demonstrate the clear superiority of ASL-OOD over state-of-the-art oriented object detection models. These findings underscore the model's efficacy as an advanced solution for challenging remote sensing object detection tasks.
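The abstract above does not state the Angle-SIOU formula, so the following is only a minimal sketch of the SIoU loss as it is commonly formulated for axis-aligned boxes, showing how an angle cost can modulate the distance cost alongside a shape cost. The function names, the `(cx, cy, w, h)` box layout, and the exponent `theta=4.0` are assumptions made for illustration; this is not the authors' ASDH implementation and it ignores box rotation.

```python
import math


def siou_angle_cost(dx, dy, eps=1e-9):
    """Angle cost sin(2 * alpha), where alpha is the angle between the
    line joining the two box centers and the x-axis."""
    sigma = math.hypot(dx, dy)  # distance between the two centers
    sin_alpha = min(1.0, abs(dy) / (sigma + eps))
    return math.sin(2.0 * math.asin(sin_alpha))


def siou_loss(pred, target, theta=4.0, eps=1e-9):
    """SIoU-style loss for two axis-aligned boxes given as (cx, cy, w, h).

    theta is the shape-cost exponent (an assumed, commonly quoted value).
    """
    px, py, pw, ph = pred
    tx, ty, tw, th = target

    # Plain IoU of the two boxes.
    ix = max(0.0, min(px + pw / 2, tx + tw / 2) - max(px - pw / 2, tx - tw / 2))
    iy = max(0.0, min(py + ph / 2, ty + th / 2) - max(py - ph / 2, ty - th / 2))
    inter = ix * iy
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px + pw / 2, tx + tw / 2) - min(px - pw / 2, tx - tw / 2)
    ch = max(py + ph / 2, ty + th / 2) - min(py - ph / 2, ty - th / 2)

    # Distance cost, sharpened or relaxed by the angle cost through gamma.
    dx, dy = tx - px, ty - py
    gamma = 2.0 - siou_angle_cost(dx, dy, eps)
    rho_x = (dx / (cw + eps)) ** 2
    rho_y = (dy / (ch + eps)) ** 2
    distance_cost = (1 - math.exp(-gamma * rho_x)) + (1 - math.exp(-gamma * rho_y))

    # Shape cost: relative mismatch in width and height.
    omega_w = abs(pw - tw) / max(pw, tw)
    omega_h = abs(ph - th) / max(ph, th)
    shape_cost = (1 - math.exp(-omega_w)) ** theta + (1 - math.exp(-omega_h)) ** theta

    return 1.0 - iou + (distance_cost + shape_cost) / 2.0
```

In this sketch the angle cost is zero when the centers are aligned horizontally or vertically and peaks when the center line sits at 45 degrees, which is what lets the distance term behave differently depending on the direction of the offset.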
Transmission towers play a crucial role in overhead transmission line systems and are the key target of transmission line inspections. With the help of remote sensing technology, transmission towers can be effectively detected over wide areas at reasonable cost and in a relatively short time. However, it is difficult to identify the type of transmission tower in optical remote sensing images because long-distance, high-altitude imaging degrades detail. This paper proposes a transmission tower detection method for optical remote sensing images that uses an oriented object detector together with joint detection of objects and their shadows. To enrich the information, the transmission towers and their shadows are jointly detected through a CenterNet detector with an orientation prediction branch. To improve the detection accuracy of difficult objects, attention and deformable convolutional network modules are introduced to the backbone and orientation prediction branches, respectively. Considering the orientation and the aspect ratio of the objects and shadows, a focal loss function with an aspect ratio is employed to further improve the accuracy. Object and shadow joint detection is realized separately through one-box and multi-box detection strategies. A transmission tower dataset, RSITT, labeled with horizontal and oriented boxes, is established. Experiments conducted on the RSITT dataset demonstrate that the detection accuracy and recall rate of the proposed joint detection algorithm reach 73.2% and 95.2%, respectively.
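The abstract does not specify how the aspect ratio enters the focal loss, so the sketch below shows only the standard binary focal loss that such variants typically build on. The function name and the defaults `alpha=0.25` and `gamma=2.0` are illustrative assumptions, and no aspect-ratio weighting is included.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-12):
    """Standard binary focal loss for one prediction.

    p is the predicted probability of the positive class, y is 0 or 1.
    The aspect-ratio term mentioned in the abstract is NOT modeled here.
    """
    p_t = p if y == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    # (1 - p_t) ** gamma down-weights examples that are already easy.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(max(p_t, eps))
```

With `gamma=0` and `alpha=0.5` this reduces to half the ordinary cross-entropy, which is a convenient sanity check.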
In oriented bounding box (OBB) object detection for high-resolution remote sensing images, small targets are often missed or wrongly detected because they are too small and appear in different orientations. Existing OBB object detection methods for remote sensing images, although making good progress, mainly focus on directional modeling and give less consideration to object size and to missed detections. In this study, a method based on an improved YOLOv8 is proposed for detecting oriented objects in remote sensing images, with the aim of improving detection precision. Firstly, the ResCBAMG module was designed to better extract channel and spatial correlation information. Secondly, a top-down feature fusion layer network structure was proposed in conjunction with the Efficient Channel Attention (ECA) module, which helped to capture local cross-channel interaction information appropriately. Finally, a ResCBAMG module was introduced between the different C2f modules and detection heads of the bottom-up feature fusion layer. This structure helped the model to focus better on the target area, and the precision and robustness of oriented target detection were also improved. Experimental results on the DOTA-v1.5 dataset showed that the Precision, mAP@0.5, and mAP@0.5:0.95 metrics of the improved model are better than those of the original model. The improvement is effective for detecting small targets and handling complex scenes.
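As a rough illustration of the Efficient Channel Attention idea named above, the sketch below pools each channel to a single value, runs a small 1D convolution across neighbouring channels, and rescales each channel by a sigmoid gate. The nested-list tensor layout, the function names, and the fixed `conv_weights` (learned in a real network) are assumptions for illustration; how the paper wires ECA into its fusion layers is not described in the abstract.

```python
import math


def eca_kernel_size(channels, gamma=2, b=1):
    """Adaptive odd kernel size derived from the channel count."""
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 == 1 else t + 1


def eca_attention(feature_map, conv_weights):
    """Apply ECA-style gating.

    feature_map: list of C channels, each an H x W list of rows.
    conv_weights: odd-length 1D kernel shared by all channels.
    """
    # Global average pooling: one descriptor per channel.
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
              for ch in feature_map]

    pad = len(conv_weights) // 2
    gates = []
    for c in range(len(pooled)):
        acc = 0.0
        # 1D convolution over neighbouring channels with zero padding.
        for j, w in enumerate(conv_weights):
            idx = c + j - pad
            if 0 <= idx < len(pooled):
                acc += w * pooled[idx]
        gates.append(1.0 / (1.0 + math.exp(-acc)))  # sigmoid gate

    # Rescale every value of a channel by that channel's gate.
    return [[[v * g for v in row] for row in ch]
            for ch, g in zip(feature_map, gates)]
```

The appeal of this design is that the gate for each channel depends only on a few neighbouring channels, so the attention adds very few parameters.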
Rail positioning is a critical step for detecting rail defects downstream. However, existing orientation-based detectors struggle to effectively manage rails with arbitrary inclinations and high aspect ratios, particularly in turnout sections. To address these challenges, a fuzzy boundary guidance and oriented Gaussian function-based anchor-free network termed the rail positioning network (RP-Net) is proposed for rail positioning in turnout sections. First, an oriented Gaussian function-based label generation strategy is introduced. This strategy produces smoother and more accurate label values by accounting for the specific aspect ratios and orientations of the rails. Second, a fuzzy boundary learning module is developed to enhance the network's ability to model the rail boundary regions effectively. Furthermore, a boundary guidance module is developed to direct the network in fusing the features obtained from the downsampled network output with the boundary region features, which have been enhanced to contain more refined positional and structural information. A local channel attention mechanism is integrated into this module to identify critical channels. Finally, experiments conducted on the tracking dataset show that the proposed RP-Net achieves high positioning accuracy and demonstrates strong adaptability in complex scenarios.
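The label generation strategy is described only at a high level, so the following is a sketch of one plausible oriented Gaussian label: the pixel offset is rotated into the box frame and each axis gets its own spread, scaled by the box width and height. The function name, the `(cx, cy, w, h, theta)` box layout, and `sigma_ratio` are hypothetical and not taken from RP-Net.

```python
import math


def oriented_gaussian_label(x, y, box, sigma_ratio=1.0 / 6.0):
    """Soft label in (0, 1] for pixel (x, y) given an oriented box.

    box is (cx, cy, w, h, theta) with theta in radians; sigma_ratio is an
    assumed factor that ties the Gaussian spread to the box size.
    """
    cx, cy, w, h, theta = box
    dx, dy = x - cx, y - cy

    # Rotate the offset into the box frame (u along width, v along height).
    u = dx * math.cos(theta) + dy * math.sin(theta)
    v = -dx * math.sin(theta) + dy * math.cos(theta)

    # Anisotropic spread: elongated boxes get elongated label blobs.
    sigma_u = w * sigma_ratio
    sigma_v = h * sigma_ratio
    return math.exp(-0.5 * ((u / sigma_u) ** 2 + (v / sigma_v) ** 2))
```

For a long, thin box the label decays slowly along the rail direction and quickly across it, which is the behaviour a high-aspect-ratio target needs.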
Because remote sensing sensors capture a bird's eye view, the orientation of an object is a key factor that has to be considered in object detection. To obtain rotated bounding boxes, existing studies either rely on rotated anchoring schemes or add complex rotating ROI transfer layers, leading to increased computational demand and reduced detection speed. In this study, we propose a novel internal-external optimized convolutional neural network for arbitrarily oriented object detection in optical remote sensing images. For the internal optimization, we designed an anchor-based single-shot head detector that adopts the coarse-to-fine detection concept of two-stage object detection networks. The refined rotating anchors are generated by the coarse detection head module and fed into the refining detection head module through an embedded deformable convolutional layer. For the external optimization, we propose an IOU balanced loss that addresses the regression challenges related to arbitrarily oriented bounding boxes. Experimental results on the DOTA and HRSC2016 benchmark datasets show that our proposed method outperforms selected methods.
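To make the notion of a rotated bounding box concrete, here is a small sketch that converts a `(cx, cy, w, h, theta)` box into its four corner points. The parameterization, with `theta` in radians measured counter-clockwise from the x-axis, and the function name are assumptions; detectors and datasets differ in their angle conventions, and the abstract does not state which one is used.

```python
import math


def rotated_box_corners(cx, cy, w, h, theta):
    """Return the four corners of a rotated box as (x, y) tuples.

    Corners are listed in order around the box, starting from the corner
    that is bottom-left when theta is 0.
    """
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    corners = []
    for sx, sy in ((-1, -1), (1, -1), (1, 1), (-1, 1)):
        # Corner offset in the box's own frame.
        lx, ly = sx * w / 2.0, sy * h / 2.0
        # Rotate by theta, then translate to the box center.
        corners.append((cx + lx * cos_t - ly * sin_t,
                        cy + lx * sin_t + ly * cos_t))
    return corners
```

A polygon in this form can be passed to a polygon-intersection routine when a rotated IoU is needed.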
Funding (ASL-OOD): Supported by the Key Research and Development Program of Shaanxi Province (2024GX-YBXM-010).
Funding (transmission tower detection): Supported by the National Key R&D Program of China (2020YFB0905900).
Funding (RP-Net): Major Scientific Research Projects of China Railway Group (No. K2019G046) and the National Key Research and Development Program of China (No. 2020YFB1600700).
Funding (internal-external optimized network): This work is supported by the National Natural Science Foundation of China (grant numbers 41890820, 41771452, 41771454, and 41901340).