To address small-object detection in unmanned aerial vehicle (UAV) aerial images with complex backgrounds, a general detection method for multi-scale small objects based on the Faster region-based convolutional neural network (Faster R-CNN) is proposed. Bird’s nests on high-voltage towers are taken as the research object. First, we use an improved ResNet101 convolutional neural network to extract object features, and then use multi-scale sliding windows to obtain object region proposals on convolutional feature maps of different resolutions. Finally, a deconvolution operation is added to further enhance the selected higher-resolution feature map, which is then taken as the feature mapping layer from which the region proposals are passed to the object detection sub-network. Detection results for bird’s nests in UAV aerial images show that the proposed method can precisely detect small objects in aerial images.
Detecting moving objects against a stationary background is an important problem in visual surveillance systems. However, the traditional background subtraction method fails when the background is not completely stationary and involves certain dynamic changes. In this paper, following the basic steps of the background subtraction method, a novel non-parametric moving object detection method is proposed based on an improved ant colony algorithm and a Markov random field. Concretely, the contributions are as follows: 1) A new non-parametric strategy is used to model the background, based on an improved kernel density estimation; this approach uses an adaptive bandwidth, and the fused features combine colours, gradients and positions. 2) A Markov random field method based on this adaptive background model, constrained by the spatial context, is proposed to extract objects. 3) The posterior function is maximized efficiently by using an improved ant colony system algorithm. Extensive experiments show that the proposed method performs better than many existing state-of-the-art methods.
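The background-modelling step in the abstract above can be illustrated with a minimal Python sketch that scores a pixel value against stored background samples using a Gaussian kernel density estimate. This is a sketch under stated assumptions, not the authors' method: the bandwidth is fixed rather than adaptive, a single grayscale intensity stands in for the fused colour, gradient and position features, and the names `background_probability` and `is_foreground` and the threshold value are hypothetical. The Markov random field and ant colony stages are not shown.

```python
import math


def background_probability(value, samples, bandwidth):
    """Gaussian kernel density estimate of `value` given background samples.

    `samples` holds past intensities observed at one pixel (assumed background).
    """
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))
    total = sum(math.exp(-0.5 * ((value - s) / bandwidth) ** 2) for s in samples)
    return norm * total / len(samples)


def is_foreground(value, samples, bandwidth=5.0, threshold=1e-3):
    """Flag the pixel as foreground when its background density is low.

    Both `bandwidth` and `threshold` are illustrative values, not tuned ones.
    """
    return background_probability(value, samples, bandwidth) < threshold


if __name__ == "__main__":
    history = [98, 99, 100, 101, 102]
    print(is_foreground(100, history))  # value close to the stored samples
    print(is_foreground(200, history))  # value far from the stored samples
```

A pixel whose intensity lies close to the stored samples receives a high density and is kept as background, while a value far from all samples falls below the threshold.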
Geospatial object detection in complex environments is a challenging problem in remote sensing. In this paper, we derive a multiple-kernel extension of the Relevance Vector Machine (RVM) technique. The proposed method learns an optimal kernel combination and the associated classifier simultaneously. Two feature types are extracted from images, forming basis kernels. These basis kernels are then combined by weighting, and the resulting composite kernel exploits interest points and appearance information of objects simultaneously. The weights and the detection model are finally learnt by a new algorithm. Experimental results show that the proposed method improves detection accuracy to above 88%, yields a good interpretation of the selected subset of features, and is sparser than traditional single-kernel RVMs.
The article deals with experimental studies of the indistinct radiation structure of the atmosphere. The basis for extracting information about the presence of a dot-size thermal object in the atmosphere is substantiated. A technique is offered for the indistinct generalization of experimentally observed regularities in the space-time irregularity of the radiation structure in the infrared wave range. An approach to detecting a dot-size thermal object in the atmosphere with the help of a threshold method, under thermodynamic and turbulent process conditions, is substantiated on the basis of the solution of the inverse problem in an indistinct formulation.
Most image-based object detection methods employ horizontal bounding boxes (HBBs) to capture objects in tunnel images. However, these bounding boxes often fail to effectively enclose objects oriented in arbitrary directions, resulting in reduced accuracy and suboptimal detection performance. Moreover, HBBs cannot provide directional information for rotated objects. This study proposes a rotated detection method for identifying apparent defects in shield tunnels. Specifically, the oriented region-convolutional neural network (oriented R-CNN) is used to detect rotated objects in tunnel images. To enhance feature extraction, a novel hybrid backbone combining CNN-based networks with Swin Transformers is proposed. A feature fusion strategy is employed to integrate features extracted from both networks. Additionally, a neck network based on the bidirectional feature pyramid network (Bi-FPN) is designed to combine multi-scale object features. A bolt hole dataset is curated to evaluate the efficacy of the proposed method. In addition, a dedicated pre-processing approach is developed for large-sized images to accommodate the rotated, dense, and small-scale characteristics of objects in tunnel images. Experimental results demonstrate that the proposed method achieves an improvement of more than 4% in mAP50-95 compared to other rotated detectors and a 6.6%-12.7% improvement over mainstream horizontal detectors. Furthermore, the proposed method outperforms mainstream methods by 6.5%-14.7% in detecting leakage bolt holes, underscoring its significant engineering applicability.
Autonomous driving is a promising route to safe, efficient, and low-carbon transportation in the future. Real-time, accurate target detection is an essential precondition for generating proper subsequent decision and control signals. However, given the complexity of practical scenarios, accurate recognition of occluded targets is a major challenge for target detection in autonomous driving with limited computational capability. To reveal the overlaps and differences among detection methods for various occluded objects that share the same available sensors, this paper presents a review of detection methods for occluded objects in complex real-driving scenarios. Considering the rapid development of autonomous driving technologies, the research analyzed in this study is limited to the most recent five years. The study of occluded object detection is divided into three parts, namely occluded vehicles, pedestrians and traffic signs. This paper provides a detailed summary of the target detection methods used in these three parts according to the differences in detection methods and ideas, followed by a comparison of the advantages and disadvantages of different detection methods for the same object. Finally, the shortcomings and limitations of the existing detection methods are summarized, and the challenges and future development prospects in this field are discussed.
Object detection has been studied for many years. Convolutional neural networks have brought great progress in the accuracy and speed of object detection. However, because small objects have low resolution and fuzzy feature representations, one of the current challenges is how to detect small objects in images effectively. Existing detectors for small objects take one of two approaches: one uses high-resolution images as input, and the other increases the depth of the CNN; both, however, increase computational cost and running time. In this paper, based on the RefineDet network framework, we propose a network structure, RF2Det, which introduces the Receptive Field Block to address small object detection while balancing speed and accuracy. At the same time, we propose a Medium-level Feature Pyramid Network, which combines appropriate high-level context features with low-level features, so that the network can use both low-level and high-level features for multi-scale target detection and the accuracy of small target detection based on low-level features is improved. Extensive experiments on the MS COCO dataset demonstrate that, compared with other state-of-the-art methods, the proposed method shows a significant performance improvement in the detection of small objects.
Anchor-based detectors are widely used in object detection. To improve the accuracy of object detection, many anchor boxes are densely placed on the input image, yet most of them are invalid. Although anchor-free methods can reduce the number of useless anchor boxes, the invalid ones still make up a high proportion. On this basis, this paper proposes a multiscale center point object detection method based on a parallel network to further reduce the number of useless anchor boxes. This study adopts a parallel network architecture of hourglass-104 and darknet-53, in which the former outputs heatmaps that generate the center points used to locate object features on the attribute feature map output by darknet-53. Combining a feature pyramid and the CIoU loss function, the algorithm is trained and tested on the MS COCO dataset, increasing the detection rate of target location and the accuracy of small object detection. While comparable to state-of-the-art two-stage detectors in overall object detection accuracy, the algorithm is superior in speed.
An approach to the detection of moving objects in video sequences, with application to video surveillance, is presented. The algorithm combines two kinds of change points, which are detected from the region-based frame difference and adjusted background subtraction. An adaptive threshold technique is employed to automatically choose the threshold value used to segment the moving objects from the still background. Experimental results show that the algorithm is effective and efficient in practical situations. Furthermore, the algorithm is robust to changes in lighting conditions and can be applied to video surveillance systems.
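The combination of frame differencing and background subtraction described above can be sketched in a few lines of Python. The sketch is illustrative only and makes several assumptions that the abstract does not confirm: differences are taken per pixel rather than per region, the adaptive threshold is taken to be the mean plus `k` standard deviations of the difference image, and the two change masks are merged with a logical OR. All function names are hypothetical.

```python
from statistics import mean, pstdev


def abs_diff(frame_a, frame_b):
    """Per-pixel absolute difference of two grayscale frames (nested lists)."""
    return [[abs(p - q) for p, q in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)]


def adaptive_threshold(diff, k=2.0):
    """Assumed rule: threshold = mean + k * standard deviation of the differences."""
    values = [v for row in diff for v in row]
    return mean(values) + k * pstdev(values)


def change_mask(diff, k=2.0):
    """Mark pixels whose difference exceeds the adaptive threshold."""
    threshold = adaptive_threshold(diff, k)
    return [[v > threshold for v in row] for row in diff]


def detect_moving(prev_frame, frame, background, k=2.0):
    """Merge frame-difference and background-subtraction masks (assumed: OR)."""
    frame_mask = change_mask(abs_diff(frame, prev_frame), k)
    background_mask = change_mask(abs_diff(frame, background), k)
    return [[f or b for f, b in zip(row_f, row_b)]
            for row_f, row_b in zip(frame_mask, background_mask)]
```

Because the threshold is derived from the statistics of each difference image, a uniform change in brightness shifts all differences equally and is not flagged, which is one plausible reading of the robustness to lighting changes claimed in the abstract.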
Drone-based small object detection is of great significance in practical applications such as military actions, disaster rescue, and transportation. However, the severe scale differences among objects captured by drones and the lack of detail for small-scale objects make drone-based small object detection a formidable challenge. To address these issues, we first develop a mathematical model to explore how changing receptive fields affects polynomial fitting results. Subsequently, based on the conclusions obtained, we propose a simple but effective Hybrid Receptive Field Network (HRFNet), whose modules include Hybrid Feature Augmentation (HFA), Hybrid Feature Pyramid (HFP) and Dual Scale Head (DSH). Specifically, HFA employs parallel dilated convolution kernels of different sizes to extend shallow features with different receptive fields, improving the multi-scale adaptability of the network; HFP enhances the perception of small objects by capturing contextual information across layers; and DSH reconstructs the original prediction head using a set of high-resolution and ultrahigh-resolution features. In addition, a corresponding dual-scale loss function is designed to train HRFNet. Finally, comprehensive evaluation results on public benchmarks such as VisDrone-DET and TinyPerson demonstrate the robustness of the proposed method. Notably, the proposed HRFNet achieves an mAP of 51.0 on VisDrone-DET with 29.3 M parameters, outperforming existing state-of-the-art detectors. HRFNet also performs well in complex scenarios captured by drones, achieving the best performance on the CS-Drone dataset we built.
Deep learning-based object detection has revolutionized various fields, including agriculture. This paper presents a systematic review, based on the PRISMA 2020 approach, of object detection techniques in agriculture. It explores the evolution of different methods and applications over the past three years, highlighting the shift from conventional computer vision to deep learning-based methodologies owing to their enhanced efficacy in real time. The review emphasizes the integration of advanced models, such as You Only Look Once (YOLO) v9 and v10, EfficientDet, Transformer-based models, and hybrid frameworks, that improve the precision, accuracy, and scalability of crop monitoring and disease detection. The review also highlights benchmark datasets and evaluation metrics. It addresses limitations such as domain adaptation challenges, dataset heterogeneity, and occlusion, while offering insights into prospective research avenues such as multimodal learning, explainable AI, and federated learning. The main aim of this paper is to serve as a thorough resource guide for scientists, researchers, and stakeholders implementing deep learning-based object detection methods for the development of intelligent, robust, and sustainable agricultural systems.
Pulmonary nodules represent an early manifestation of lung cancer. However, pulmonary nodules constitute only a small portion of the overall image, posing challenges for physicians in image interpretation and potentially leading to false positives or missed detections. To solve these problems, the YOLOv8 network is enhanced by adding deformable convolution and atrous spatial pyramid pooling (ASPP), along with a coordinate attention (CA) mechanism. This allows the network to focus on small targets while expanding the receptive field without losing resolution. At the same time, context information on the target is gathered, and feature expression is enhanced by attention modules in different directions. The method effectively improves positioning accuracy and achieves good results on the LUNA16 dataset. Compared with other detection algorithms, it improves the accuracy of pulmonary nodule detection to a certain extent.
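The claim that atrous convolution expands the receptive field without losing resolution follows from a simple identity: a kernel of size k with dilation rate d covers k + (k - 1)(d - 1) input positions while keeping the same number of weights. The short Python sketch below evaluates this for a 3×3 kernel. The dilation rates shown are illustrative assumptions; the abstract does not state which rates the authors' ASPP module uses, and the function name is hypothetical.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span of a dilated (atrous) kernel along one axis.

    A dilation of 1 is an ordinary convolution; larger rates insert
    (dilation - 1) gaps between neighbouring kernel taps.
    """
    return kernel_size + (kernel_size - 1) * (dilation - 1)


if __name__ == "__main__":
    # Illustrative rates only; not taken from the abstract.
    for rate in (1, 6, 12, 18):
        span = effective_kernel_size(3, rate)
        print(f"3x3 kernel, dilation {rate}: covers {span}x{span} inputs")
```

Running several such branches in parallel and concatenating their outputs is what lets a pyramid-pooling module gather context at multiple scales from one feature map.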
To maintain the reliability of power systems, routine inspections using drones equipped with advanced object detection algorithms are essential for preempting power-related issues. The increasing resolution of drone-captured images poses a challenge for traditional target detection methods, especially in identifying small objects in high-resolution images. This study presents an enhanced object detection algorithm based on the Faster Region-based Convolutional Neural Network (Faster R-CNN) framework, specifically tailored for detecting small-scale electrical components such as insulators, shock hammers, and screws on transmission lines. The algorithm features an improved backbone network for Faster R-CNN, which significantly boosts the feature extraction network’s ability to detect fine details. The Region Proposal Network is optimized using guided feature refinement (GFR), which achieves a balance between accuracy and speed. The incorporation of Generalized Intersection over Union (GIoU) and Region of Interest (ROI) Align further refines the model’s accuracy. Experimental results demonstrate a notable improvement in mean Average Precision, reaching 89.3%, an 11.1% increase compared to the standard Faster R-CNN. This highlights the effectiveness of the proposed algorithm in identifying electrical components in high-resolution aerial images.
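For readers unfamiliar with Generalized Intersection over Union, the following Python sketch computes it for two axis-aligned boxes. It follows the standard definition (IoU minus the fraction of the smallest enclosing box not covered by the union) and is not taken from the authors' implementation. The `(x1, y1, x2, y2)` box format and the function name are assumptions, and both boxes are assumed to have positive area.

```python
def giou(box_a, box_b):
    """Generalized IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection rectangle; width or height is zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box that encloses both inputs.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    enclose = enclose_w * enclose_h

    return iou - (enclose - union) / enclose
```

Unlike plain IoU, which is zero for every pair of non-overlapping boxes, GIoU becomes more negative as the boxes move apart, so a loss of the form 1 - GIoU still provides a useful gradient when a predicted box misses a small target entirely.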
Lunar impact crater detection is crucial for lunar surface studies and spacecraft landing missions, yet deep learning still struggles to detect small craters accurately, especially when relying on incomplete catalogs. In this work, we integrate Digital Elevation Model (DEM) data to construct a high-quality dataset enriched with slope information, enabling a detailed analysis of crater features and effectively improving detection performance in complex terrains and low-contrast areas. On this foundation, we propose a novel two-stage detection network, MSFNet, which leverages multi-scale adaptive feature fusion and multisize ROI pooling to enhance the recognition of craters across various scales. Experimental results demonstrate that MSFNet achieves an F1 score of 74.8% on Test Region1 and a recall rate of 87% for craters with diameters larger than 2 km. Moreover, it performs exceptionally well in detecting sub-kilometer craters, successfully identifying a large number of high-confidence, previously unlabeled targets with a low false detection rate confirmed through manual review. This approach offers an efficient and reliable deep learning solution for lunar impact crater detection.
In this paper, a two-stage light detection and ranging (LiDAR) three-dimensional (3D) object detection framework is presented, namely the point-voxel dual transformer (PV-DT3D), which is a transformer-based method. In the proposed PV-DT3D, point-voxel fusion features are used for proposal refinement. Specifically, keypoints are sampled from the entire point cloud scene and used to encode representative scene features via a proposal-aware voxel set abstraction module. Subsequently, after proposals are generated by the region proposal network (RPN), the internal encoded keypoints are fed into the dual transformer encoder-decoder architecture. In 3D object detection, the proposed PV-DT3D takes advantage of both point-wise transformer and channel-wise architecture to capture contextual information from the spatial and channel dimensions. Experiments conducted on the highly competitive KITTI 3D car detection leaderboard show that PV-DT3D achieves superior detection accuracy among state-of-the-art point-voxel-based methods.
To address the low detection accuracy and large model size of existing object detection algorithms applied to complex road scenes, an improved you only look once version 8 (YOLOv8) object detection algorithm for infrared images, F-YOLOv8, is proposed. First, a spatial-to-depth network replaces the strided convolution or pooling layers of the traditional backbone network. At the same time, it is combined with a channel attention mechanism so that the neural network focuses on the channels with large weight values, better extracting feature information from low-resolution images. Then an improved feature pyramid network, the lightweight bidirectional feature pyramid network (L-BiFPN), is proposed, which can efficiently fuse features of different scales. In addition, an intersection over union loss based on the minimum point distance (MPDIoU) is introduced for bounding box regression, which yields faster convergence and more accurate regression results. Experimental results on the FLIR dataset show that the improved algorithm can accurately detect infrared road targets in real time, with improvements of 3% and 2.2% in mean average precision at 50% IoU (mAP50) and mean average precision at 50%-95% IoU (mAP50-95), respectively, and reductions of 38.1%, 37.3% and 16.9% in the number of model parameters, the model weight, and floating-point operations (FLOPs), respectively. To further demonstrate the detection capability of the improved algorithm, it is tested on the public dataset PASCAL VOC, and the results show that F-YOLO has excellent generalized detection performance.
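The space-to-depth idea mentioned above, used in place of strided convolution or pooling, can be shown with a minimal Python sketch. It halves the spatial resolution by moving each 2×2 block of pixels into the channel dimension, so no pixel values are discarded. This is a generic illustration, not the F-YOLOv8 code: the nested-list `[height][width][channels]` layout, the channel ordering inside each block, and the function name are assumptions, and real frameworks may order the output channels differently.

```python
def space_to_depth(feature_map, block=2):
    """Rearrange an H x W x C map into (H/block) x (W/block) x (C*block*block).

    `feature_map` is a nested list indexed as [row][column][channel].
    """
    height, width = len(feature_map), len(feature_map[0])
    if height % block or width % block:
        raise ValueError("height and width must be divisible by block")

    output = []
    for i in range(0, height, block):
        out_row = []
        for j in range(0, width, block):
            channels = []
            # Stack the block's pixels channel-wise, row by row.
            for di in range(block):
                for dj in range(block):
                    channels.extend(feature_map[i + di][j + dj])
            out_row.append(channels)
        output.append(out_row)
    return output
```

A stride-2 convolution or pooling layer produces the same output resolution but summarizes or skips pixels along the way, which is the information loss on low-resolution inputs that the abstract aims to avoid.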
To address the low detection accuracy caused by the diverse species, significant size variations, and complex growth environments of wheat pests in natural settings, a PSA-YOLO11n algorithm is proposed to enhance detection precision. Building upon the YOLO11n framework, the proposed improvements include three key components: 1) SimCSPSPPF in Backbone: An improved Spatial Pyramid Pooling-Fast (SPPF) module, SimCSPSPPF, is integrated into the Backbone to reduce the number of channels in the hidden layers, thereby accelerating model training. 2) PEC in Neck: The standard convolution layers in the Neck are replaced with Perception Enhancement Convolutions (PEC) to improve multi-scale feature extraction capabilities, enhancing detection speed. 3) AWIoU Loss Function: The regression loss function is replaced with Adequate Wise IoU (AWIoU), addressing bounding box distortion caused by the diversity in pest species and size variations, thereby improving the precision of bounding box localization. Experimental evaluations on the IP102 dataset demonstrate that PSA-YOLO11n achieves a mean Average Precision (mAP) of 89.10%, surpassing YOLO11n by 0.8%. Comparisons with other mainstream algorithms, including Faster R-CNN, RetinaNet, YOLOv5s, YOLOv8n, YOLOv10n, and YOLO11n, confirm that PSA-YOLO11n outperforms all baselines in terms of detection performance. These results highlight the algorithm’s capability to significantly improve the detection accuracy of multi-scale wheat pests in natural environments, providing an effective solution for pest management in wheat production.
UAV-based object detection is rapidly expanding in both civilian and military applications, including security surveillance, disaster assessment, and border patrol. However, challenges such as small objects, occlusions, complex backgrounds, and variable lighting persist due to the unique perspective of UAV imagery. To address these issues, this paper introduces DAFPN-YOLO, an innovative model based on YOLOv8s (You Only Look Once version 8s). The model strikes a balance between detection accuracy and speed while reducing parameters, making it well-suited for multi-object detection tasks from drone perspectives. A key feature of DAFPN-YOLO is the enhanced Drone-AFPN (Adaptive Feature Pyramid Network), which adaptively fuses multi-scale features to optimize feature extraction and enhance spatial and small-object information. To fully leverage Drone-AFPN’s multi-scale capabilities, a dedicated 160×160 small-object detection head was added, significantly boosting detection accuracy for small targets. In the backbone, the C2f_Dual (Cross Stage Partial with Cross-Stage Feature Fusion Dual) module and the SPPELAN (Spatial Pyramid Pooling with Enhanced Local Attention Network) module were integrated. These components improve feature extraction and information aggregation while reducing parameters and computational complexity, enhancing inference efficiency. Additionally, Shape-IoU (Shape Intersection over Union) is used as the loss function for bounding box regression, enabling more precise shape-based object matching. Experimental results on the VisDrone 2019 dataset demonstrate the effectiveness of DAFPN-YOLO. Compared to YOLOv8s, the proposed model achieves a 5.4 percentage point increase in mAP@0.5, a 3.8 percentage point improvement in mAP@0.5:0.95, and a 17.2% reduction in parameter count. These results highlight DAFPN-YOLO’s advantages in UAV-based object detection, offering valuable insights for applying deep learning to UAV-specific multi-object detection tasks.
Forests are vital ecosystems that play a crucial role in sustaining life on Earth and supporting human well-being. Traditional forest mapping and monitoring methods are often costly and limited in scope, necessitating the adoption of advanced, automated approaches for improved forest conservation and management. This study explores the application of deep learning-based object detection techniques for individual tree detection in RGB satellite imagery. A dataset of 3157 images was collected and divided into training (2528), validation (495), and testing (134) sets. To enhance model robustness and generalization, data augmentation was applied to the training part of the dataset. Various YOLO-based models, including YOLOv8, YOLOv9, YOLOv10, YOLOv11, and YOLOv12, were evaluated using different hyperparameters and optimization techniques, such as stochastic gradient descent (SGD) and auto-optimization. These models were assessed in terms of detection accuracy and the number of detected trees. The highest-performing model, YOLOv12m, achieved a mean average precision (mAP@50) of 0.908, mAP@50:95 of 0.581, recall of 0.851, precision of 0.852, and an F1-score of 0.847. The results demonstrate that YOLO-based object detection offers a highly efficient, scalable, and accurate solution for individual tree detection in satellite imagery, facilitating improved forest inventory, monitoring, and ecosystem management. This study underscores the potential of AI-driven tree detection to enhance environmental sustainability and support data-driven decision-making in forestry.
Improving consumer satisfaction with the appearance and surface quality of wood-based products requires inspection methods that are both accurate and efficient. The adoption of artificial intelligence (AI) for surface evaluation has emerged as a promising solution. Since the visual appeal of wooden products directly impacts their market value and overall business success, effective quality control is crucial. However, conventional inspection techniques often fail to meet performance requirements due to limited accuracy and slow processing times. To address these shortcomings, the authors propose a real-time deep learning-based system for evaluating surface appearance quality. The method integrates object detection and classification within an area attention framework and leverages R-ELAN for advanced fine-tuning. This architecture supports precise identification and classification of multiple objects, even under ambiguous or visually complex conditions. Furthermore, the model is computationally efficient and well-suited to the moderate or domain-specific datasets commonly found in industrial inspection tasks. Experimental validation on the Zenodo dataset shows that the model achieves an average precision (AP) of 60.6%, outperforming the current state-of-the-art YOLOv12 model (55.3%), with a fast inference time of approximately 70 milliseconds. These results underscore the potential of AI-powered methods to enhance surface quality inspection in the wood manufacturing sector.
Funding: National Defense Pre-research Fund Project (No. KMGY318002531).
Abstract: To address small-object detection in unmanned aerial vehicle (UAV) aerial images with complex backgrounds, a general detection method for multi-scale small objects based on the Faster region-based convolutional neural network (Faster R-CNN) is proposed. Bird’s nests on high-voltage towers are taken as the research object. First, we use an improved ResNet101 convolutional neural network to extract object features, and then use multi-scale sliding windows to obtain object region proposals on convolutional feature maps of different resolutions. Finally, a deconvolution operation is added to further enhance the selected higher-resolution feature map, which is then taken as the feature mapping layer from which the region proposals are passed to the object detection sub-network. Detection results for bird’s nests in UAV aerial images show that the proposed method can precisely detect small objects in aerial images.
Funding: Supported in part by the National Natural Science Foundation of China under Grants 61841103, 61673164, and 61602397; in part by the Natural Science Foundation of Hunan Province under Grants 2016JJ2041 and 2019JJ50106; in part by the Key Project of the Education Department of Hunan Province under Grant 18B385; and in part by the Graduate Research Innovation Projects of Hunan Province under Grants CX2018B805 and CX2018B813.
Abstract: Detecting moving objects against a stationary background is an important problem in visual surveillance systems. However, the traditional background subtraction method fails when the background is not completely stationary and involves certain dynamic changes. In this paper, following the basic steps of the background subtraction method, a novel non-parametric moving object detection method is proposed based on an improved ant colony algorithm and a Markov random field. Concretely, the contributions are as follows: 1) A new non-parametric strategy is used to model the background, based on an improved kernel density estimation; this approach uses an adaptive bandwidth, and the fused features combine colours, gradients and positions. 2) A Markov random field method based on this adaptive background model, constrained by the spatial context, is proposed to extract objects. 3) The posterior function is maximized efficiently by using an improved ant colony system algorithm. Extensive experiments show that the proposed method performs better than many existing state-of-the-art methods.
Funding: Supported by the National Natural Science Foundation of China (No. 41001285).
Abstract: Geospatial object detection in complex environments is a challenging problem in remote sensing. In this paper, we derive a multiple-kernel extension of the Relevance Vector Machine (RVM) technique. The proposed method learns an optimal kernel combination and the associated classifier simultaneously. Two feature types are extracted from images to form basis kernels. These basis kernels are then combined by weighting, and the resulting composite kernel exploits interest points and appearance information of objects simultaneously. The weights and the detection model are finally learnt by a new algorithm. Experimental results show that the proposed method improves detection accuracy to above 88%, yields good interpretability for the selected subset of features, and is sparser than traditional single-kernel RVMs.
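The weighted combination of basis kernels can be illustrated with a small sketch. It assumes Gaussian (RBF) basis kernels and fixed, hand-chosen weights; in the abstract the weights are learnt jointly with the classifier, which is not reproduced here. All feature values and names are hypothetical.

```python
import math


def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def gram_matrix(features, gamma):
    """Basis kernel (Gram) matrix for one feature type."""
    return [[rbf_kernel(a, b, gamma) for b in features] for a in features]


def combine_kernels(grams, weights):
    """Composite kernel: element-wise weighted sum of the basis kernels."""
    n = len(grams[0])
    return [
        [sum(w * g[i][j] for w, g in zip(weights, grams)) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical features for three image windows, one list per feature type.
keypoint_feats = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
appearance_feats = [[0.2], [0.9], [0.4]]

composite = combine_kernels(
    [gram_matrix(keypoint_feats, 0.5), gram_matrix(appearance_feats, 2.0)],
    weights=[0.7, 0.3],  # assumed values; the paper learns these
)
```

Because a non-negative weighted sum of valid kernels is itself a valid kernel, the composite matrix can be passed to any kernel classifier in place of a single-kernel Gram matrix.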
Abstract: The article deals with experimental studies of the indistinct radiation structure of the atmosphere. The rationale for extracting information about the presence of point-size thermal objects in the atmosphere is substantiated. A technique for the indistinct generalization of experimentally observed regularities in the space-time irregularity of the radiation structure in the infrared wavelength range is offered. An approach to detecting point-size thermal objects in the atmosphere with the help of a threshold method, under conditions of thermodynamic and turbulent processes, is justified on the basis of solving the inverse problem in an indistinct formulation.
Funding: Support from the National Natural Science Foundation of China (Grant Nos. 52025084 and 52408420) and the Beijing Natural Science Foundation (Grant No. 8244058).
Abstract: Most image-based object detection methods employ horizontal bounding boxes (HBBs) to capture objects in tunnel images. However, these bounding boxes often fail to effectively enclose objects oriented in arbitrary directions, resulting in reduced accuracy and suboptimal detection performance. Moreover, HBBs cannot provide directional information for rotated objects. This study proposes a rotated detection method for identifying apparent defects in shield tunnels. Specifically, the oriented region-convolutional neural network (oriented R-CNN) is utilized to detect rotated objects in tunnel images. To enhance feature extraction, a novel hybrid backbone combining CNN-based networks with Swin Transformers is proposed. A feature fusion strategy is employed to integrate features extracted from both networks. Additionally, a neck network based on the bidirectional feature pyramid network (Bi-FPN) is designed to combine multi-scale object features. A bolt hole dataset is curated to evaluate the efficacy of the proposed method. In addition, a dedicated pre-processing approach is developed for large-sized images to accommodate the rotated, dense, and small-scale characteristics of objects in tunnel images. Experimental results demonstrate that the proposed method achieves an improvement of more than 4% in mAP50-95 compared to other rotated detectors and a 6.6%-12.7% improvement over mainstream horizontal detectors. Furthermore, the proposed method outperforms mainstream methods by 6.5%-14.7% in detecting leakage bolt holes, underscoring its significant engineering applicability.
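The claim that horizontal boxes enclose rotated objects poorly can be checked with a few lines of geometry. The sketch below computes the corners of an oriented box, wraps them in the tightest horizontal box, and reports how much of that horizontal box the object actually fills. The box parameterisation `(cx, cy, w, h, angle_deg)` and the example dimensions are assumptions for illustration, not values from the paper.

```python
import math


def oriented_box_corners(cx, cy, w, h, angle_deg):
    """Corner coordinates of a w-by-h box rotated by angle_deg about its centre."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    half = ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2))
    return [(cx + dx * cos_a - dy * sin_a, cy + dx * sin_a + dy * cos_a)
            for dx, dy in half]


def enclosing_hbb(corners):
    """Tightest axis-aligned box (x1, y1, x2, y2) around the corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return min(xs), min(ys), max(xs), max(ys)


def hbb_fill_ratio(w, h, angle_deg):
    """Fraction of the enclosing horizontal box covered by the oriented box."""
    x1, y1, x2, y2 = enclosing_hbb(oriented_box_corners(0.0, 0.0, w, h, angle_deg))
    return (w * h) / ((x2 - x1) * (y2 - y1))


# Hypothetical elongated object, 10 x 2 units, rotated by 45 degrees.
ratio = hbb_fill_ratio(10.0, 2.0, 45.0)
```

For this example the object fills well under a third of its horizontal box, so most of the box is background, which is the effect the abstract attributes to HBBs.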
Funding: Supported by the National Key Research and Development Program of China under Grant No. 2022YFE0102700. Dr Yuhan Huang is a recipient of the ARC Discovery Early Career Research Award (DE220100552).
Abstract: Autonomous driving is a promising route to safe, efficient, and low-carbon transportation in the future. Real-time, accurate target detection is an essential precondition for generating proper subsequent decision and control signals. However, given the complexity of practical scenarios, accurate recognition of occluded targets is a major challenge for target detection in autonomous driving with limited computational capability. To reveal the overlaps and differences among detection methods for various occluded objects that share the same available sensors, this paper presents a review of detection methods for occluded objects in complex real-driving scenarios. Considering the rapid development of autonomous driving technologies, the research analyzed in this study is limited to the most recent five years. The study of occluded object detection is divided into three parts, namely occluded vehicles, pedestrians, and traffic signs. This paper provides a detailed summary of the target detection methods used in these three parts according to the differences in detection methods and ideas, followed by a comparison of the advantages and disadvantages of different detection methods for the same object. Finally, the shortcomings and limitations of the existing detection methods are summarized, and the challenges and future development prospects in this field are discussed.
Abstract: Object detection has been studied for many years. Convolutional neural networks have made great progress in the accuracy and speed of object detection. However, because small objects have low resolution and fuzzy feature representations, one of the current challenges is how to detect small objects in images effectively. Existing detectors for small objects follow two approaches: one uses high-resolution images as input, and the other increases the depth of the CNN, but both undoubtedly increase computational cost and processing time. In this paper, based on the RefineDet network framework, we propose a network structure, RF2Det, that introduces the Receptive Field Block to solve the problem of small object detection and to balance speed and accuracy. At the same time, we propose a Medium-level Feature Pyramid Network, which combines appropriate high-level context features with low-level features, so that the network can use both low-level and high-level features for multi-scale target detection, and the accuracy of small target detection based on low-level features is improved. Extensive experiments on the MS COCO dataset demonstrate that, compared to other state-of-the-art methods, the proposed method shows a significant performance improvement in the detection of small objects.
Abstract: Anchor-based detectors are widely used in object detection. To improve detection accuracy, multiple anchor boxes are densely placed on the input image, yet most of them are invalid. Although the anchor-free method can reduce the number of useless anchor boxes, the invalid ones still occupy a high proportion. On this basis, this paper proposes a multiscale center point object detection method based on a parallel network to further reduce the number of useless anchor boxes. This study adopts a parallel network architecture of hourglass-104 and darknet-53, in which the former outputs heatmaps to generate the center points for object feature location on the output attribute feature map of darknet-53. Combining a feature pyramid and the CIoU loss function, the algorithm is trained and tested on the MS COCO dataset, increasing the detection rate of target location and the accuracy of small object detection. While matching state-of-the-art two-stage detectors in overall detection accuracy, the algorithm is superior in speed.
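The abstract does not say how center points are read off the heatmaps. A common choice in center-point detectors, assumed here, is to keep cells that are local maxima in their 3×3 neighbourhood and exceed a score threshold. The function name `heatmap_peaks`, the threshold, and the toy heatmap are hypothetical.

```python
def heatmap_peaks(heatmap, threshold=0.5):
    """Return (row, col, score) for cells that are 3x3 local maxima
    with a score of at least `threshold`."""
    rows, cols = len(heatmap), len(heatmap[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            score = heatmap[r][c]
            if score < threshold:
                continue
            neighbours = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(score >= n for n in neighbours):
                peaks.append((r, c, score))
    return peaks


# Toy single-class heatmap with two objects.
toy_heatmap = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.3, 0.1],
    [0.1, 0.3, 0.2, 0.6],
    [0.0, 0.1, 0.4, 0.7],
]
centers = heatmap_peaks(toy_heatmap)
```

Each surviving peak yields one candidate object, so no dense grid of anchor boxes has to be scored, which is the reduction in useless boxes that the abstract aims for.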
Abstract: An approach to detecting moving objects in video sequences, with application to video surveillance, is presented. The algorithm combines two kinds of change points, which are detected from the region-based frame difference and the adjusted background subtraction. An adaptive threshold technique is employed to automatically choose the threshold value used to segment the moving objects from the still background. Experimental results show that the algorithm is effective and efficient in practical situations. Furthermore, the algorithm is robust to changes in lighting conditions and can be applied in video surveillance systems.
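A rough sketch of combining the two change cues follows. Several details are assumptions because the abstract leaves them open: frames are flattened lists of grey levels, the adaptive threshold is taken as the mean plus two standard deviations of the difference image, and the two masks are merged with a logical OR. None of these choices is claimed to be the authors' rule.

```python
import statistics


def adaptive_threshold(diff):
    """Assumed rule: mean + 2 * standard deviation of the difference image."""
    return statistics.fmean(diff) + 2.0 * statistics.pstdev(diff)


def change_mask(frame_a, frame_b):
    """Pixels whose absolute difference exceeds the adaptive threshold."""
    diff = [abs(a - b) for a, b in zip(frame_a, frame_b)]
    t = adaptive_threshold(diff)
    return [d > t for d in diff]


def moving_pixels(prev_frame, frame, background):
    """Union of frame-difference and background-subtraction change points."""
    from_frame_diff = change_mask(frame, prev_frame)
    from_background = change_mask(frame, background)
    return [p or q for p, q in zip(from_frame_diff, from_background)]


# Hypothetical 4x4 frames (flattened); one bright pixel appears at index 5.
background = [10] * 16
prev_frame = [10] * 16
frame = [10] * 16
frame[5] = 200
mask = moving_pixels(prev_frame, frame, background)
```

Because the threshold is recomputed from each difference image, a uniform brightness shift raises the mean and the threshold together, which is one way to obtain the robustness to lighting changes that the abstract mentions.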
Funding: Supported by the National Natural Science Foundation of China (Nos. 62276204 and 62203343), the Fundamental Research Funds for the Central Universities (No. YJSJ24011), the Natural Science Basic Research Program of Shanxi, China (Nos. 2022JM-340 and 2023-JC-QN-0710), and the China Postdoctoral Science Foundation (Nos. 2020T130494 and 2018M633470).
Abstract: Drone-based small object detection is of great significance in practical applications such as military actions, disaster rescue, and transportation. However, the severe scale differences among objects captured by drones and the lack of detail information for small-scale objects make drone-based small object detection a formidable challenge. To address these issues, we first develop a mathematical model to explore how changing receptive fields affects polynomial fitting results. Subsequently, based on the conclusions obtained, we propose a simple but effective Hybrid Receptive Field Network (HRFNet), whose modules include Hybrid Feature Augmentation (HFA), Hybrid Feature Pyramid (HFP), and Dual Scale Head (DSH). Specifically, HFA employs parallel dilated convolution kernels of different sizes to extend shallow features with different receptive fields, improving the multi-scale adaptability of the network; HFP enhances the perception of small objects by capturing contextual information across layers; and DSH reconstructs the original prediction head using a set of high-resolution and ultrahigh-resolution features. In addition, a corresponding dual-scale loss function is designed to train HRFNet. Finally, comprehensive evaluation results on public benchmarks such as VisDrone-DET and TinyPerson demonstrate the robustness of the proposed method. Most impressively, the proposed HRFNet achieves an mAP of 51.0 on VisDrone-DET with 29.3 M parameters, outperforming existing state-of-the-art detectors. HRFNet also performs excellently in complex scenarios captured by drones, achieving the best performance on the CS-Drone dataset we built.
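The link between dilation and receptive field that HFA exploits follows from a standard formula: a kernel of size k with dilation d spans k + (k − 1)(d − 1) input positions. The short sketch below evaluates it for three parallel branches; the kernel size and dilation rates are hypothetical, since the abstract does not list them.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span, in input positions, of a dilated convolution kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


# Hypothetical parallel branches: 3x3 kernels with dilation rates 1, 2 and 3.
branch_fields = {d: effective_kernel_size(3, d) for d in (1, 2, 3)}
```

All three branches use the same nine weights per channel, yet they cover 3, 5, and 7 input positions per side, which is how parallel dilated kernels provide several receptive fields without extra parameters per branch.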
Abstract: Deep learning-based object detection has revolutionized various fields, including agriculture. This paper presents a systematic review, based on the PRISMA 2020 approach, of object detection techniques in agriculture by exploring the evolution of different methods and applications over the past three years, highlighting the shift from conventional computer vision to deep learning-based methodologies owing to their enhanced efficacy in real time. The review emphasizes the integration of advanced models, such as You Only Look Once (YOLO) v9 and v10, EfficientDet, Transformer-based models, and hybrid frameworks that improve the precision, accuracy, and scalability of crop monitoring and disease detection. The review also highlights benchmark datasets and evaluation metrics. It addresses limitations, such as domain adaptation challenges, dataset heterogeneity, and occlusion, while offering insights into prospective research avenues, such as multimodal learning, explainable AI, and federated learning. Furthermore, the main aim of this paper is to serve as a thorough resource guide for scientists, researchers, and stakeholders implementing deep learning-based object detection methods for the development of intelligent, robust, and sustainable agricultural systems.
Abstract: Pulmonary nodules represent an early manifestation of lung cancer. However, pulmonary nodules constitute only a small portion of the overall image, posing challenges for physicians in image interpretation and potentially leading to false positives or missed detections. To solve these problems, the YOLOv8 network is enhanced by adding deformable convolution and atrous spatial pyramid pooling (ASPP), along with the integration of a coordinate attention (CA) mechanism. This allows the network to focus on small targets while expanding the receptive field without losing resolution. At the same time, context information on the target is gathered and feature expression is enhanced by attention modules in different directions. The method effectively improves positioning accuracy and achieves good results on the LUNA16 dataset. Compared with other detection algorithms, it improves the accuracy of pulmonary nodule detection to a certain extent.
Funding: Supported by the Shanghai Science and Technology Innovation Action Plan High-Tech Field Project (Grant No. 22511100601) for the year 2022 and the Technology Development Fund for People’s Livelihood Research (Research on Transmission Line Deep Foundation Pit Environmental Situation Awareness System Based on Multi-Source Data).
Abstract: To maintain the reliability of power systems, routine inspections using drones equipped with advanced object detection algorithms are essential for preempting power-related issues. The increasing resolution of drone-captured images has posed a challenge for traditional target detection methods, especially in identifying small objects in high-resolution images. This study presents an enhanced object detection algorithm based on the Faster Region-based Convolutional Neural Network (Faster R-CNN) framework, specifically tailored for detecting small-scale electrical components such as insulators, shock hammers, and screws on transmission lines. The algorithm features an improved backbone network for Faster R-CNN, which significantly boosts the feature extraction network’s ability to detect fine details. The Region Proposal Network is optimized using a method of guided feature refinement (GFR), which achieves a balance between accuracy and speed. The incorporation of Generalized Intersection over Union (GIoU) and Region of Interest (ROI) Align further refines the model’s accuracy. Experimental results demonstrate a notable improvement in mean Average Precision, reaching 89.3%, an 11.1% increase compared to the standard Faster R-CNN. This highlights the effectiveness of the proposed algorithm in identifying electrical components in high-resolution aerial images.
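Generalized Intersection over Union extends IoU with a penalty based on the smallest box C enclosing both boxes: GIoU = IoU − (|C| − |A ∪ B|) / |C|. A minimal sketch follows, assuming boxes are given as `(x1, y1, x2, y2)` with positive area; how the paper plugs GIoU into its loss is not shown here.

```python
def box_area(box):
    """Area of an (x1, y1, x2, y2) box; zero if the box is degenerate."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def giou(a, b):
    """Generalized IoU of two axis-aligned boxes with positive area."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = box_area(a) + box_area(b) - inter
    # Smallest axis-aligned box that encloses both inputs.
    enclose = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
    return inter / union - (enclose - union) / enclose


# Two disjoint unit boxes: plain IoU is 0, GIoU is negative.
separated = giou((0.0, 0.0, 1.0, 1.0), (2.0, 0.0, 3.0, 1.0))
```

Unlike plain IoU, the value keeps changing as disjoint boxes move closer together, so a loss such as 1 − GIoU still provides a training signal for small components whose predicted boxes miss the target entirely.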
Funding: National Natural Science Foundation of China (12103020, 12363009); Natural Science Foundation of Jiangxi Province (20224BAB211011); Open Project Program of State Key Laboratory of Lunar and Planetary Sciences (Macao University of Science and Technology) (Macao FDCT grant No. 002/2024/SKL); Youth Talent Project of Science and Technology Plan of Ganzhou (2022CXRC9191, 2023CYZ26970).
Abstract: Lunar impact crater detection is crucial for lunar surface studies and spacecraft landing missions, yet deep learning still struggles to detect small craters accurately, especially when relying on incomplete catalogs. In this work, we integrate Digital Elevation Model (DEM) data to construct a high-quality dataset enriched with slope information, enabling a detailed analysis of crater features and effectively improving detection performance in complex terrains and low-contrast areas. On this foundation, we propose a novel two-stage detection network, MSFNet, which leverages multi-scale adaptive feature fusion and multi-size ROI pooling to enhance the recognition of craters across various scales. Experimental results demonstrate that MSFNet achieves an F1 score of 74.8% on Test Region1 and a recall rate of 87% for craters with diameters larger than 2 km. Moreover, it shows exceptional performance in detecting sub-kilometer craters by successfully identifying a large number of high-confidence, previously unlabeled targets with a low false detection rate confirmed through manual review. This approach offers an efficient and reliable deep learning solution for lunar impact crater detection.
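Slope information can be derived from a DEM with finite differences. The sketch below is one common way to do it and is an assumption, since the abstract does not describe the authors' procedure: central differences on interior cells, with the slope angle given by atan(√((∂z/∂x)² + (∂z/∂y)²)). The function name and the toy elevation grid are hypothetical.

```python
import math


def slope_degrees(dem, cell_size):
    """Slope angle (degrees) for interior DEM cells via central differences.

    `dem` is a list of rows of elevations; `cell_size` is the ground distance
    between neighbouring cells, in the same unit as the elevations.
    """
    rows, cols = len(dem), len(dem[0])
    slopes = []
    for r in range(1, rows - 1):
        row = []
        for c in range(1, cols - 1):
            dz_dx = (dem[r][c + 1] - dem[r][c - 1]) / (2.0 * cell_size)
            dz_dy = (dem[r + 1][c] - dem[r - 1][c]) / (2.0 * cell_size)
            row.append(math.degrees(math.atan(math.hypot(dz_dx, dz_dy))))
        slopes.append(row)
    return slopes


# Toy DEM: a plane rising one unit of height per cell in the x direction.
toy_dem = [[0.0, 1.0, 2.0], [0.0, 1.0, 2.0], [0.0, 1.0, 2.0]]
toy_slope = slope_degrees(toy_dem, cell_size=1.0)
```

Crater rims show up as rings of high slope even where image contrast is low, which is consistent with the improvement in low-contrast areas reported in the abstract.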
Funding: Supported by the Natural Science Foundation of China (No. 62103298) and the South African National Research Foundation (Nos. 132797 and 137951).
Abstract: In this paper, a two-stage light detection and ranging (LiDAR) three-dimensional (3D) object detection framework is presented, namely the point-voxel dual transformer (PV-DT3D), which is a transformer-based method. In the proposed PV-DT3D, point-voxel fusion features are used for proposal refinement. Specifically, keypoints are sampled from the entire point cloud scene and used to encode representative scene features via a proposal-aware voxel set abstraction module. Subsequently, following the generation of proposals by the region proposal network (RPN), the internal encoded keypoints are fed into the dual transformer encoder-decoder architecture. In 3D object detection, the proposed PV-DT3D takes advantage of both point-wise transformer and channel-wise architecture to capture contextual information from the spatial and channel dimensions. Experiments conducted on the highly competitive KITTI 3D car detection leaderboard show that PV-DT3D achieves superior detection accuracy among state-of-the-art point-voxel-based methods.
Funding: Supported by the National Natural Science Foundation of China (No. 62103298).
Abstract: To address the low detection accuracy and large model size of existing object detection algorithms applied to complex road scenes, an improved you only look once version 8 (YOLOv8) object detection algorithm for infrared images, F-YOLOv8, is proposed. First, a spatial-to-depth network replaces the strided convolution or pooling layers of the traditional backbone network. It is combined with a channel attention mechanism so that the neural network focuses on the channels with large weight values to better extract feature information from low-resolution images. Then, an improved feature pyramid network, the lightweight bidirectional feature pyramid network (L-BiFPN), is proposed, which can efficiently fuse features of different scales. In addition, an intersection over union loss based on the minimum point distance (MPDIoU) is introduced for bounding box regression, which yields faster convergence and more accurate regression results. Experimental results on the FLIR dataset show that the improved algorithm can accurately detect infrared road targets in real time, with 3% and 2.2% improvements in mean average precision at 50% IoU (mAP50) and mean average precision at 50%-95% IoU (mAP50-95), respectively, and 38.1%, 37.3%, and 16.9% reductions in the number of model parameters, the model weight, and floating-point operations (FLOPs), respectively. To further demonstrate the detection capability of the improved algorithm, it is tested on the public PASCAL VOC dataset, and the results show that F-YOLOv8 has excellent generalized detection performance.
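The idea of replacing strided convolution or pooling with a spatial-to-depth layer can be shown on a single-channel grid. The sketch assumes the layer follows the usual space-to-depth rearrangement with a block size of 2: resolution halves, but every input value is kept in one of four output channels instead of being skipped or pooled away. The function name and the toy grid are hypothetical.

```python
def space_to_depth(feature_map, block=2):
    """Rearrange an H x W grid into block*block channels of size H/block x W/block."""
    rows, cols = len(feature_map), len(feature_map[0])
    if rows % block or cols % block:
        raise ValueError("height and width must be divisible by the block size")
    channels = []
    for dy in range(block):
        for dx in range(block):
            channels.append([
                [feature_map[r][c] for c in range(dx, cols, block)]
                for r in range(dy, rows, block)
            ])
    return channels


# Toy 4x4 feature map holding the values 0..15.
toy_map = [[4 * r + c for c in range(4)] for r in range(4)]
toy_channels = space_to_depth(toy_map)
```

A stride-2 convolution or 2×2 pooling would discard or merge three quarters of these positions; here they are moved into the channel dimension, which is why the approach is attractive for low-resolution infrared images.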
Abstract: To address the challenges of low detection accuracy caused by the diverse species, significant size variations, and complex growth environments of wheat pests in natural settings, a PSA-YOLO11n algorithm is proposed to enhance detection precision. Building upon the YOLO11n framework, the proposed improvements include three key components: 1) SimCSPSPPF in Backbone: An improved Spatial Pyramid Pooling-Fast (SPPF) module, SimCSPSPPF, is integrated into the Backbone to reduce the number of channels in the hidden layers, thereby accelerating model training. 2) PEC in Neck: The standard convolution layers in the Neck are replaced with Perception Enhancement Convolutions (PEC) to improve multi-scale feature extraction capabilities, enhancing detection speed. 3) AWIoU Loss Function: The regression loss function is replaced with Adequate Wise IoU (AWIoU), addressing issues of bounding box distortion caused by the diversity in pest species and size variations, thereby improving the precision of bounding box localization. Experimental evaluations on the IP102 dataset demonstrate that PSA-YOLO11n achieves a mean Average Precision (mAP) of 89.10%, surpassing YOLO11n by 0.8%. Comparisons with other mainstream algorithms, including Faster R-CNN, RetinaNet, YOLOv5s, YOLOv8n, YOLOv10n, and YOLO11n, confirm that PSA-YOLO11n outperforms all baselines in terms of detection performance. These results highlight the algorithm’s capability to significantly improve the detection accuracy of multi-scale wheat pests in natural environments, providing an effective solution for pest management in wheat production.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62101275 and 62101274).
Abstract: UAV-based object detection is rapidly expanding in both civilian and military applications, including security surveillance, disaster assessment, and border patrol. However, challenges such as small objects, occlusions, complex backgrounds, and variable lighting persist due to the unique perspective of UAV imagery. To address these issues, this paper introduces DAFPN-YOLO, an innovative model based on YOLOv8s (You Only Look Once version 8s). The model strikes a balance between detection accuracy and speed while reducing parameters, making it well-suited for multi-object detection tasks from drone perspectives. A key feature of DAFPN-YOLO is the enhanced Drone-AFPN (Adaptive Feature Pyramid Network), which adaptively fuses multi-scale features to optimize feature extraction and enhance spatial and small-object information. To leverage Drone-AFPN’s multi-scale capabilities fully, a dedicated 160×160 small-object detection head was added, significantly boosting detection accuracy for small targets. In the backbone, the C2f_Dual (Cross Stage Partial with Cross-Stage Feature Fusion Dual) module and the SPPELAN (Spatial Pyramid Pooling with Enhanced Local Attention Network) module were integrated. These components improve feature extraction and information aggregation while reducing parameters and computational complexity, enhancing inference efficiency. Additionally, Shape-IoU (Shape Intersection over Union) is used as the loss function for bounding box regression, enabling more precise shape-based object matching. Experimental results on the VisDrone 2019 dataset demonstrate the effectiveness of DAFPN-YOLO. Compared to YOLOv8s, the proposed model achieves a 5.4 percentage point increase in mAP@0.5, a 3.8 percentage point improvement in mAP@0.5:0.95, and a 17.2% reduction in parameter count. These results highlight DAFPN-YOLO’s advantages in UAV-based object detection, offering valuable insights for applying deep learning to UAV-specific multi-object detection tasks.
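The 160×160 detection head can be related to the input image with simple arithmetic. The sketch assumes a 640×640 input, which the abstract does not state, and the set of strides is likewise an assumption chosen to match that head size.

```python
def head_grid_sizes(input_size, strides):
    """Grid size (cells per side) of each detection head for a square input."""
    return {stride: input_size // stride for stride in strides}


# Assumed 640x640 input; strides 8, 16 and 32 plus an extra stride-4 head.
grids = head_grid_sizes(640, (4, 8, 16, 32))
```

Under these assumptions the extra head works at stride 4, so each cell covers a 4×4 pixel patch rather than 8×8, which gives objects only a few pixels wide their own cells.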
Funding: Funding from the Horizon Europe Framework Programme (HORIZON), call Teaming for Excellence (HORIZON-WIDERA-2022-ACCESS-01-two-stage), Creation of the centre of excellence in smart forestry “Forest 4.0”, No. 101059985; funded by the European Union under the project FOREST 4.0, “Ekscelencijos centras tvariai miško bioekonomikai vystyti”, No. 10-042-P-0002.
Abstract: Forests are vital ecosystems that play a crucial role in sustaining life on Earth and supporting human well-being. Traditional forest mapping and monitoring methods are often costly and limited in scope, necessitating the adoption of advanced, automated approaches for improved forest conservation and management. This study explores the application of deep learning-based object detection techniques for individual tree detection in RGB satellite imagery. A dataset of 3157 images was collected and divided into training (2528), validation (495), and testing (134) sets. To enhance model robustness and generalization, data augmentation was applied to the training part of the dataset. Various YOLO-based models, including YOLOv8, YOLOv9, YOLOv10, YOLOv11, and YOLOv12, were evaluated using different hyperparameters and optimization techniques, such as stochastic gradient descent (SGD) and auto-optimization. These models were assessed in terms of detection accuracy and the number of detected trees. The highest-performing model, YOLOv12m, achieved a mean average precision (mAP@50) of 0.908, mAP@50:95 of 0.581, recall of 0.851, precision of 0.852, and an F1-score of 0.847. The results demonstrate that YOLO-based object detection offers a highly efficient, scalable, and accurate solution for individual tree detection in satellite imagery, facilitating improved forest inventory, monitoring, and ecosystem management. This study underscores the potential of AI-driven tree detection to enhance environmental sustainability and support data-driven decision-making in forestry.
Abstract: Improving consumer satisfaction with the appearance and surface quality of wood-based products requires inspection methods that are both accurate and efficient. The adoption of artificial intelligence (AI) for surface evaluation has emerged as a promising solution. Since the visual appeal of wooden products directly impacts their market value and overall business success, effective quality control is crucial. However, conventional inspection techniques often fail to meet performance requirements due to limited accuracy and slow processing times. To address these shortcomings, the authors propose a real-time deep learning-based system for evaluating surface appearance quality. The method integrates object detection and classification within an area attention framework and leverages R-ELAN for advanced fine-tuning. This architecture supports precise identification and classification of multiple objects, even under ambiguous or visually complex conditions. Furthermore, the model is computationally efficient and well-suited to moderate or domain-specific datasets commonly found in industrial inspection tasks. Experimental validation on the Zenodo dataset shows that the model achieves an average precision (AP) of 60.6%, outperforming the current state-of-the-art YOLOv12 model (55.3%), with a fast inference time of approximately 70 milliseconds. These results underscore the potential of AI-powered methods to enhance surface quality inspection in the wood manufacturing sector.