Deep learning-based object detection has revolutionized various fields, including agriculture. This paper presents a systematic review based on the PRISMA 2020 approach for object detection techniques in agriculture by exploring the evolution of different methods and applications over the past three years, highlighting the shift from conventional computer vision to deep learning-based methodologies owing to their enhanced efficacy in real time. The review emphasizes the integration of advanced models, such as You Only Look Once (YOLO) v9 and v10, EfficientDet, Transformer-based models, and hybrid frameworks that improve the precision, accuracy, and scalability for crop monitoring and disease detection. The review also highlights benchmark datasets and evaluation metrics. It addresses limitations, like domain adaptation challenges, dataset heterogeneity, and occlusion, while offering insights into prospective research avenues, such as multimodal learning, explainable AI, and federated learning. Furthermore, the main aim of this paper is to serve as a thorough resource guide for scientists, researchers, and stakeholders for implementing deep learning-based object detection methods for the development of intelligent, robust, and sustainable agricultural systems.
Dear Editor, This letter focuses on the fact that, in object detection tasks, small objects with few pixels disappear in feature maps with large receptive fields as the network deepens. Therefore, the detection of dense small objects is challenging.
Aiming at the problems of low detection accuracy and large model size of existing object detection algorithms applied to complex road scenes, an improved You Only Look Once version 8 (YOLOv8) object detection algorithm for infrared images, F-YOLOv8, is proposed. First, a space-to-depth network replaces the strided convolution or pooling layers of the traditional backbone network. At the same time, it is combined with a channel attention mechanism so that the neural network focuses on the channels with large weight values to better extract feature information from low-resolution images. Then, an improved feature pyramid network, the lightweight bidirectional feature pyramid network (L-BiFPN), is proposed, which can efficiently fuse features of different scales. In addition, an intersection over union loss function based on the minimum point distance (MPDIoU) is introduced for bounding box regression, which yields faster convergence and more accurate regression results. Experimental results on the FLIR dataset show that the improved algorithm can accurately detect infrared road targets in real time, with 3% and 2.2% enhancement in mean average precision at 50% IoU (mAP50) and mean average precision at 50% to 95% IoU (mAP50-95), respectively, and 38.1%, 37.3%, and 16.9% reduction in the number of model parameters, the model weight, and floating-point operations (FLOPs), respectively. To further demonstrate the detection capability of the improved algorithm, it is tested on the public dataset PASCAL VOC, and the results show that F-YOLOv8 has excellent generalized detection performance.
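The MPDIoU term named in this abstract can be illustrated with a minimal sketch. It assumes the commonly cited formulation, in which IoU is penalized by the squared distances between the top-left corners and between the bottom-right corners of the two boxes, each normalized by the squared image diagonal. The function names and the (x1, y1, x2, y2) box layout are illustrative choices and are not taken from the paper.

```python
def iou(box_a, box_b):
    """Plain intersection over union for boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def mpdiou(pred, target, img_w, img_h):
    """IoU minus normalized squared corner distances (assumed formulation)."""
    d1_sq = (pred[0] - target[0]) ** 2 + (pred[1] - target[1]) ** 2  # top-left corners
    d2_sq = (pred[2] - target[2]) ** 2 + (pred[3] - target[3]) ** 2  # bottom-right corners
    norm = img_w ** 2 + img_h ** 2  # squared image diagonal
    return iou(pred, target) - d1_sq / norm - d2_sq / norm


def mpdiou_loss(pred, target, img_w, img_h):
    """Regression loss: zero when the predicted box matches the target exactly."""
    return 1.0 - mpdiou(pred, target, img_w, img_h)
```

Unlike plain IoU, the corner-distance terms remain informative when the two boxes do not overlap, which is one plausible reason for the faster convergence the abstract reports.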
At present, salient object detection (SOD) has achieved considerable progress. However, the methods that perform well still face the issue of inadequate detection accuracy; for example, missed and false detections sometimes occur. Effectively optimizing features to capture key information and better integrating different levels of features to enhance their complementarity are two significant challenges in the domain of SOD. In response to these challenges, this study proposes a novel SOD method based on multi-strategy feature optimization. We propose the multi-size feature extraction module (MSFEM), which uses the attention mechanism, multi-level feature fusion, and the residual block to obtain finer features. This module provides robust support for the subsequent accurate detection of the salient object. In addition, we use two rounds of feature fusion and the feedback mechanism to optimize the features obtained by the MSFEM to improve detection accuracy. The first round of feature fusion is applied to integrate the features extracted by the MSFEM to obtain more refined features. Subsequently, the feedback mechanism and the second round of feature fusion are applied to refine the features, thereby providing a stronger foundation for accurately detecting salient objects. To improve the fusion effect, we propose the feature enhancement module (FEM) and the feature optimization module (FOM). The FEM integrates the upper and lower features with the optimized features obtained by the FOM to enhance feature complementarity. The FOM uses different receptive fields, the attention mechanism, and the residual block to more effectively capture key information. Experimental results demonstrate that our method outperforms 10 state-of-the-art SOD methods.
Temperature forecasts from February to December 2022 were used, taken from the intelligent grids of the Central Meteorological Observatory and the meteorological bureau of the autonomous region and from the numerical forecast model of the European Center (EC model). Based on the national intelligent grid forecast, the intelligent grid forecast of the regional bureau, the EC model, and other data, temperature was predicted. Drawing on a grid point forecast synthesis algorithm based on the highest accuracy rate in the most recent three days, temperature grid point correction was conducted in two forms, stations and grids. To reduce the deviation caused by seasonal systematic temperature differences, a temperature prediction model was established using the rolling forecast errors of 5, 10, 15, 20, 25, and 30 d as the basis data. The verification and evaluation of the objective correction results show that the accuracy rate of temperature forecasts by the intelligent grid of the regional bureau, the national intelligent grid, and the EC model could be increased by 10%, 8%, and 12%, respectively.
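The rolling-error correction described in this abstract can be sketched roughly as follows. The abstract gives only the window lengths and the three-day accuracy criterion, so everything else here is an assumption: errors are taken as forecast minus observation, the correction subtracts the mean error over a window, and the window is chosen by hindcasting over the last three days. All function names are hypothetical.

```python
from statistics import mean

WINDOWS = (5, 10, 15, 20, 25, 30)  # rolling window lengths in days, as listed in the abstract


def correct_forecast(raw_forecast, errors, window):
    """Subtract the mean error (forecast - observed) of the last `window` days."""
    return raw_forecast - mean(errors[-window:])


def best_window(errors, windows=WINDOWS, holdout=3):
    """Pick the window whose bias estimate best explained the last `holdout` days.

    Assumed selection rule: for each held-out day, estimate the bias from the
    days before it and record the absolute residual; keep the window with the
    smallest mean residual.
    """
    scores = {}
    for w in windows:
        if len(errors) < w + holdout:
            continue  # not enough history for this window
        residuals = []
        for k in range(holdout, 0, -1):
            history = errors[:len(errors) - k]
            residuals.append(abs(errors[len(errors) - k] - mean(history[-w:])))
        scores[w] = mean(residuals)
    if not scores:
        raise ValueError("not enough error history for any window")
    return min(scores, key=scores.get)
```

A typical call would be `correct_forecast(raw, errors, best_window(errors))`, applied per station or per grid point.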
The YOLO (You Only Look Once) series, a leading single-stage object detection framework, has gained significant prominence in medical-image analysis due to its real-time efficiency and robust performance. Recent iterations of YOLO have further enhanced its accuracy and reliability in critical clinical tasks such as tumor detection, lesion segmentation, and microscopic image analysis, thereby accelerating the development of clinical decision support systems. This paper systematically reviews advances in YOLO-based medical object detection from 2018 to 2024. It compares YOLO's performance with other models (e.g., Faster R-CNN, RetinaNet) in medical contexts, summarizes standard evaluation metrics (e.g., mean Average Precision (mAP), sensitivity), and analyzes hardware deployment strategies using public datasets such as LUNA16, BraTS, and CheXpert. The review highlights the impressive performance of YOLO models, particularly from YOLOv5 to YOLOv8, in achieving high precision (up to 99.17%), sensitivity (up to 97.5%), and mAP exceeding 95% in tasks such as lung nodule, breast cancer, and polyp detection. These results demonstrate the significant potential of YOLO models for early disease detection and real-time clinical applications, indicating their ability to enhance clinical workflows. However, the study also identifies key challenges, including high small-object miss rates, limited generalization in low-contrast images, scarcity of annotated data, and model interpretability issues. Finally, potential future research directions are proposed to address these challenges and further advance the application of YOLO models in healthcare.
Object detection plays a critical role in drone imagery analysis, especially in remote sensing applications where accurate and efficient detection of small objects is essential. Despite significant advancements in drone imagery detection, most models still struggle with small object detection due to challenges such as object size and complex backgrounds. To address these issues, we propose a robust detection model based on You Only Look Once (YOLO) that balances accuracy and efficiency. The model contains several major innovations: a feature selection pyramid network, the Inner-Shape Intersection over Union (ISIoU) loss function, and a small object detection head. To overcome the limitations of traditional fusion methods in handling multi-level features, we introduce a Feature Selection Pyramid Network integrated into the Neck component, which preserves shallow feature details critical for detecting small objects. Additionally, recognizing that deep network structures often neglect or degrade small object features, we design a specialized small object detection head in the shallow layers to enhance detection accuracy for these challenging targets. To effectively model both local and global dependencies, we introduce a Conv-Former module that simulates Transformer mechanisms using a convolutional structure, thereby improving feature enhancement. Furthermore, we employ ISIoU to address object imbalance and scale variation. This approach accelerates model convergence and improves regression accuracy. Experimental results show that, compared to the baseline model, the proposed method significantly improves small object detection performance on the VisDrone2019 dataset, with mAP@50 increasing by 4.9% and mAP@50-95 rising by 6.7%. This model also outperforms other state-of-the-art algorithms, demonstrating its reliability and effectiveness in both small object detection and remote sensing image fusion tasks.
Unlike traditional video cameras, event cameras capture asynchronous event streams in which each event encodes the pixel location, the trigger timestamp, and the polarity of the brightness change. In this paper, we introduce a novel hypergraph-based framework for moving object classification. Specifically, we capture moving objects with an event camera to perceive and collect asynchronous event streams at a high temporal resolution. Rather than stacking events into frames, we encode the asynchronous event data into a hypergraph, fully mining the high-order correlation of the event data, and design a mixed convolutional hypergraph neural network for training to achieve more efficient and accurate motion target recognition. The experimental results show that our method performs well in moving object classification (e.g., gait identification).
The Internet of Things (IoT) integrates diverse devices into the Internet infrastructure, including sensors, meters, and wearable devices. Designing efficient IoT networks with these heterogeneous devices requires the selection of appropriate routing protocols, which is crucial for maintaining high Quality of Service (QoS). The Internet Engineering Task Force's Routing Over Low Power and Lossy Networks (IETF ROLL) working group developed the IPv6 Routing Protocol for Low Power and Lossy Networks (RPL) to meet these needs. While the initial RPL standard focused on single-metric route selection, ongoing research explores enhancing RPL by incorporating multiple routing metrics and developing new Objective Functions (OFs). This paper introduces a novel OF, the Reliable and Secure Objective Function (RSOF), designed to enhance the reliability and trustworthiness of parent selection at both the node and link levels within IoT and RPL routing protocols. The RSOF employs an adaptive parent node selection mechanism that incorporates multiple metrics, including Residual Energy (RE), Expected Transmission Count (ETX), Extended RPL Node Trustworthiness (ERNT), and a novel metric that measures node failure rate (NFR). In this mechanism, nodes with a high NFR are excluded from the parent selection process to improve network reliability and stability. The proposed RSOF was evaluated using random and grid topologies in the Cooja Simulator, with tests conducted across small, medium, and large-scale networks to examine the impact of varying node densities. The simulation results indicate a significant improvement in network performance, particularly in terms of average latency, packet acknowledgment ratio (PAR), packet delivery ratio (PDR), and Control Message Overhead (CMO), compared to the standard Minimum Rank with Hysteresis Objective Function (MRHOF).
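The parent selection idea in this abstract, excluding nodes with a high NFR and then ranking the rest on several metrics, can be sketched as below. The abstract does not give the actual RSOF formula, so the threshold, the weights, the metric normalization, and all names here are hypothetical and serve only to show the filter-then-rank structure.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    node_id: str
    residual_energy: float  # RE, assumed normalized to [0, 1], higher is better
    etx: float              # ETX, at least 1.0, lower is better
    trust: float            # ERNT, assumed normalized to [0, 1], higher is better
    failure_rate: float     # NFR, assumed in [0, 1], lower is better


NFR_THRESHOLD = 0.3  # hypothetical cut-off; not a value from the paper


def score(c: Candidate) -> float:
    """Hypothetical weighted score; the real RSOF weighting is not given."""
    return 0.4 * c.residual_energy + 0.3 * c.trust + 0.3 * (1.0 / c.etx)


def select_parent(candidates: List[Candidate]) -> Optional[Candidate]:
    """Drop high-NFR nodes first, then pick the best-scoring survivor."""
    eligible = [c for c in candidates if c.failure_rate <= NFR_THRESHOLD]
    if not eligible:
        return None  # caller keeps its current parent or waits for new candidates
    return max(eligible, key=score)
```

Filtering before scoring means a node with excellent energy and link quality is still rejected if it fails often, which matches the stated goal of improving reliability and stability.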
The Intelligent Transportation System (ITS), as a vital means to alleviate traffic congestion and reduce traffic accidents, demonstrates immense potential in improving traffic safety and efficiency through the integration of Internet of Things (IoT) technologies. The enhancement of its performance largely depends on breakthrough advancements in object detection technology. However, current object detection technology still faces numerous challenges, such as accuracy, robustness, and data privacy issues. These challenges are particularly critical in the application of ITS and require in-depth analysis and exploration of future improvement directions. This study provides a comprehensive review of the development of object detection technology and analyzes its specific applications in ITS, aiming to thoroughly explore the use and advancement of object detection technologies in IoT-based intelligent transportation systems. To achieve this objective, we adopted the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) approach to search, screen, and assess the eligibility of relevant literature, ultimately including 88 studies. Through an analysis of these studies, we summarized the characteristics, advantages, and limitations of object detection technology across the traditional methods stage and the deep learning-based methods stage. Additionally, we examined its applications in ITS from three perspectives: vehicle detection, pedestrian detection, and traffic sign detection. We also identified the major challenges currently faced by these technologies and proposed future directions for addressing these issues. This review offers researchers a comprehensive perspective, identifying potential improvement directions for object detection technology in ITS, including accuracy, robustness, real-time performance, data annotation cost, and data privacy. In doing so, it provides significant guidance for the further development of IoT-based intelligent transportation systems.
To address the challenges of low accuracy and insufficient real-time performance in dynamic object detection for UAV surveillance, this paper proposes a novel tracking framework that integrates a lightweight improved YOLOv5s model with adaptive motion compensation. A UAV-view dynamic feature enhancement strategy is introduced, and a lightweight detection network combining attention mechanisms and multi-scale fusion is constructed. The robustness of tracking under motion blur scenarios is also optimized. Experimental results demonstrate that the proposed method achieves a mAP@0.5 of 68.2% on the VisDrone dataset and reaches an inference speed of 32 FPS on the NVIDIA Jetson TX2 platform. This significantly improves the balance between accuracy and efficiency in complex scenes, offering reliable technical support for real-time applications such as emergency response.
UAV-based object detection is rapidly expanding in both civilian and military applications, including security surveillance, disaster assessment, and border patrol. However, challenges such as small objects, occlusions, complex backgrounds, and variable lighting persist due to the unique perspective of UAV imagery. To address these issues, this paper introduces DAFPN-YOLO, an innovative model based on YOLOv8s (You Only Look Once version 8s). The model strikes a balance between detection accuracy and speed while reducing parameters, making it well-suited for multi-object detection tasks from drone perspectives. A key feature of DAFPN-YOLO is the enhanced Drone-AFPN (Adaptive Feature Pyramid Network), which adaptively fuses multi-scale features to optimize feature extraction and enhance spatial and small-object information. To leverage Drone-AFPN's multi-scale capabilities fully, a dedicated 160×160 small-object detection head was added, significantly boosting detection accuracy for small targets. In the backbone, the C2f_Dual (Cross Stage Partial with Cross-Stage Feature Fusion Dual) module and the SPPELAN (Spatial Pyramid Pooling with Enhanced Local Attention Network) module were integrated. These components improve feature extraction and information aggregation while reducing parameters and computational complexity, enhancing inference efficiency. Additionally, Shape-IoU (Shape Intersection over Union) is used as the loss function for bounding box regression, enabling more precise shape-based object matching. Experimental results on the VisDrone 2019 dataset demonstrate the effectiveness of DAFPN-YOLO. Compared to YOLOv8s, the proposed model achieves a 5.4 percentage point increase in mAP@0.5, a 3.8 percentage point improvement in mAP@0.5:0.95, and a 17.2% reduction in parameter count. These results highlight DAFPN-YOLO's advantages in UAV-based object detection, offering valuable insights for applying deep learning to UAV-specific multi-object detection tasks.
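A short sketch can make the 160×160 head size concrete. It assumes a 640×640 input and detection strides of 4, 8, 16, and 32; both are common YOLOv8-style settings and are not stated in the abstract, which gives only the 160×160 figure. The function name is illustrative.

```python
def head_grid_sizes(input_size, strides=(4, 8, 16, 32)):
    """Map each detection stride to the side length of its feature-map grid."""
    return {stride: input_size // stride for stride in strides}


# With the assumed 640-pixel input, the extra stride-4 head yields a 160x160 grid,
# so each cell covers a 4x4 pixel patch instead of 8x8 at the usual finest level.
sizes = head_grid_sizes(640)
```

Under these assumptions the added head has four times as many cells as the stride-8 head, which is consistent with the claim that it helps on small targets.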
Real-time detection of surface defects on cables is crucial for ensuring the safe operation of power systems. However, existing methods struggle with small target sizes, complex backgrounds, low-quality image acquisition, and interference from contamination. To address these challenges, this paper proposes the Real-time Cable Defect Detection Network (RC2DNet), which achieves an optimal balance between detection accuracy and computational efficiency. Unlike conventional approaches, RC2DNet introduces a small object feature extraction module that enhances the semantic representation of small targets through feature pyramids, multi-level feature fusion, and an adaptive weighting mechanism. Additionally, a boundary feature enhancement module is designed, incorporating boundary-aware convolution, a novel boundary attention mechanism, and an improved loss function to significantly enhance boundary localization accuracy. Experimental results demonstrate that RC2DNet outperforms state-of-the-art methods in precision, recall, F1-score, mean Intersection over Union (mIoU), and frame rate, enabling real-time and highly accurate cable defect detection in complex backgrounds.
With the rapid development of flexible electronics, tactile systems for object recognition are becoming increasingly delicate. This paper presents the design of a tactile glove for object recognition, integrating 243 palm pressure units and 126 finger joint strain units implemented with piezoresistive Velostat film. The palm pressure and joint bending strain data from the glove were collected using a two-dimensional resistance array scanning circuit and further converted into tactile images with a resolution of 32×32. To verify the effect of tactile data types on recognition precision, three datasets of tactile images were built from palm pressure data, joint bending strain data, and a combination of both. An improved residual convolutional neural network (CNN) model, SP-ResNet, was developed by light-weighting ResNet-18 to classify these tactile images. Experimental results show that the data collection method combining palm pressure and joint bending strain yields a 4.33% improvement in recognition precision compared to the best results obtained using only palm pressure or only joint bending strain. The presented tactile glove with SP-ResNet achieves a recognition precision of 95.50% for 16 objects at a lower computational cost. The presented tactile system can serve as a sensing platform for intelligent prosthetics and robot grippers.
Modern manufacturing processes have become more reliant on automation because of the accelerated transition from Industry 3.0 to Industry 4.0. Manual inspection of products on assembly lines remains inefficient, error-prone, and inconsistent, emphasizing the need for a reliable and automated inspection system. Leveraging both object detection and image segmentation approaches, this research proposes a vision-based solution for the detection of various kinds of tools in a toolkit using deep learning (DL) models. Two Intel RealSense D455f depth cameras were arranged in a top-down configuration to capture both RGB and depth images of the toolkits. After applying multiple constraints and enhancing the images through preprocessing and augmentation, a dataset consisting of 3300 annotated RGB-D photos was generated. Several DL models were selected through a comprehensive assessment of mean Average Precision (mAP), precision-recall equilibrium, inference latency (target ≥30 FPS), and computational burden, resulting in a preference for YOLO and Region-based Convolutional Neural Network (R-CNN) variants over ViT-based models due to the latter's increased latency and resource requirements. YOLOv5, YOLOv8, YOLOv11, Faster R-CNN, and Mask R-CNN were trained on the annotated dataset and evaluated using key performance metrics (Recall, Accuracy, F1-score, and Precision). YOLOv11 demonstrated balanced excellence with 93.0% precision, 89.9% recall, and a 90.6% F1-score in object detection, as well as 96.9% precision, 95.3% recall, and a 96.5% F1-score in instance segmentation, with an average inference time of 25 ms per frame (≈40 FPS), demonstrating real-time performance. Leveraging these results, a YOLOv11-based Windows application was successfully deployed in a real-time assembly line environment, where it accurately processed live video streams to detect and segment tools within toolkits, demonstrating its practical effectiveness in industrial automation. In addition to detection and segmentation, the application can precisely measure socket dimensions by applying edge detection techniques to YOLOv11 segmentation masks. This enables specification-level quality control directly on the assembly line and improves real-time inspection capability. The implementation is a substantial step forward for intelligent manufacturing in the Industry 4.0 paradigm, providing a scalable, efficient, and accurate approach to automated inspection and dimensional verification.
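The relation between the reported 25 ms per-frame inference time and the ≈40 FPS figure, and the check against the ≥30 FPS target, is simple arithmetic that can be written out as a small sketch. The function names are illustrative, and the sketch assumes frames are processed one at a time with no batching or pipelining.

```python
def fps_from_latency_ms(latency_ms):
    """Frames per second for a given per-frame latency in milliseconds."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


def meets_realtime(latency_ms, target_fps=30.0):
    """True when the throughput implied by the latency reaches the target rate."""
    return fps_from_latency_ms(latency_ms) >= target_fps
```

For a 30 FPS target the per-frame budget is about 33.3 ms, so a 25 ms model leaves roughly 8 ms per frame for capture, preprocessing, and display.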
To accomplish reliability analyses that account for the correlation among multiple analytical objectives, an innovative framework of Dimensional Synchronous Modeling (DSM) and correlation analysis is developed based on the stepwise modeling strategy, the cell array operation principle, and Copula theory. Under this framework, we propose a DSM-based Enhanced Kriging (DSMEK) algorithm to synchronously build the models of multiple objectives, and explore an adaptive Copula function approach to analyze the correlation among multiple objectives and to assess the synthetical reliability level. In the proposed DSMEK and adaptive Copula methods, the Kriging model is treated as the basis function of the DSMEK model, the Multi-Objective Snake Optimizer (MOSO) algorithm is used to search for the optimal hyperparameter values of the basis functions, the cell array operation principle is adopted to establish a whole model of multiple objectives, the goodness of fit is used to determine the forms of the Copula functions, and the determined Copula functions are employed to perform the correlated reliability analyses of the multiple analytical objectives. Furthermore, three examples, namely multi-objective complex function approximation, multi-failure-mode reliability analysis of an aeroengine turbine bladed disc, and brake temperature reliability analysis of an aircraft landing gear system, are performed to verify the effectiveness of the proposed methods from the viewpoints of mathematics and engineering. The results show that the DSMEK and adaptive Copula approaches hold obvious advantages in terms of modeling features and simulation performance. This work provides a useful way to model multiple analytical objectives and to perform synthetical reliability analyses of complex structures and systems with multi-output responses.
Aiming at the problem of low surface defect detection accuracy for industrial products, an object detection method based on a You Only Look Once version 5s (YOLOv5s) model improved with simplified spatial pyramid pooling fast (SimSPPF) hybrid pooling is proposed. The algorithm introduces a channel attention (CA) module, a simplified SPPF feature vector pyramid, and the efficient intersection over union (EIoU) loss function. Feature vector pyramids fuse high-dimensional and low-dimensional features, which makes the semantic information richer. The CA mechanism performs maximum pooling and average pooling operations on the feature map. Hybrid pooling improves both computational efficiency and deployment accuracy. The results show that the improved YOLOv5s model is better than the original YOLOv5s model. The mean average precision (mAP) reaches 91.8%, an increase of 17.4%, and the detection speed reaches 108 FPS, an increase of 18 FPS. The improved model is practicable, and its overall performance is better than that of other conventional models.
Transformer-based models have facilitated significant advances in object detection. However, their extensive computational consumption and suboptimal detection of dense small objects curtail their applicability in unmanned aerial vehicle (UAV) imagery. Addressing these limitations, we propose a hybrid transformer-based detector, H-DETR, and enhance it for dense small objects, leading to an accurate and efficient model. Firstly, we introduce a hybrid transformer encoder, which integrates a convolutional neural network-based cross-scale fusion module with the original encoder to handle multi-scale feature sequences more efficiently. Furthermore, we propose two novel strategies to enhance detection performance without incurring additional inference computation. Query filter is designed to cope with the dense clustering inherent in drone-captured images by counteracting similar queries with a training-aware non-maximum suppression. Adversarial denoising learning is a novel enhancement method inspired by adversarial learning, which improves the detection of numerous small targets by counteracting the effects of artificial spatial and semantic noise. Extensive experiments on the VisDrone and UAVDT datasets substantiate the effectiveness of our approach, achieving a significant improvement in accuracy with a reduction in computational complexity. Our method achieves 31.9% and 21.1% AP on the VisDrone and UAVDT datasets, respectively, and has a faster inference speed, making it a competitive model in UAV image object detection.
In recent years, there has been extensive research on object detection methods applied to optical remote sensing images utilizing convolutional neural networks. Despite these efforts, the detection of small objects in remote sensing remains a formidable challenge. A deep network structure causes a loss of object features, and some subtle features associated with small objects are nearly eliminated in deep layers. Additionally, the features of small objects are susceptible to interference from background features contained within the image, leading to a decline in detection accuracy. Moreover, the sensitivity of small objects to bounding box perturbation further increases the detection difficulty. In this paper, we introduce a novel approach, Cross-Layer Fusion and Weighted Receptive Field-based YOLO (CAW-YOLO), specifically designed for small object detection in remote sensing. To address feature loss in deep layers, we have devised a cross-layer attention fusion module. Background noise is effectively filtered through the incorporation of Bi-Level Routing Attention (BRA). To enhance the model's capacity to perceive multi-scale objects, particularly small-scale objects, we introduce a weighted multi-receptive field atrous spatial pyramid pooling module. Furthermore, we mitigate the sensitivity arising from bounding box perturbation by incorporating the joint Normalized Wasserstein Distance (NWD) and Efficient Intersection over Union (EIoU) losses. The efficacy of the proposed model in detecting small objects in remote sensing has been validated through experiments conducted on three publicly available datasets. The experimental results demonstrate the model's pronounced advantages in small object detection for remote sensing, surpassing the performance of current mainstream models.
Abstract: Deep learning-based object detection has revolutionized various fields, including agriculture. This paper presents a systematic review, based on the PRISMA 2020 approach, of object detection techniques in agriculture. It explores the evolution of methods and applications over the past three years, highlighting the shift from conventional computer vision to deep learning-based methodologies owing to their enhanced efficacy in real time. The review emphasizes the integration of advanced models, such as You Only Look Once (YOLO) v9 and v10, EfficientDet, Transformer-based models, and hybrid frameworks, that improve the precision, accuracy, and scalability of crop monitoring and disease detection. The review also highlights benchmark datasets and evaluation metrics. It addresses limitations such as domain adaptation challenges, dataset heterogeneity, and occlusion, while offering insights into prospective research avenues such as multimodal learning, explainable AI, and federated learning. The main aim of this paper is to serve as a thorough resource guide for scientists, researchers, and stakeholders implementing deep learning-based object detection methods for the development of intelligent, robust, and sustainable agricultural systems.
Funding: Supported in part by the National Science Foundation of China (52371372), the Project of Science and Technology Commission of Shanghai Municipality, China (22JC1401400, 21190780300), and the 111 Project, China (D18003).
Abstract: Dear Editor, this letter focuses on the fact that, in object detection tasks, small objects with few pixels disappear in feature maps with large receptive fields as the network deepens. The detection of dense small objects is therefore challenging.
Funding: Supported by the National Natural Science Foundation of China (No. 62103298).
Abstract: To address the low detection accuracy and large model size of existing object detection algorithms applied to complex road scenes, an improved You Only Look Once version 8 (YOLOv8) object detection algorithm for infrared images, F-YOLOv8, is proposed. First, a spatial-to-depth network replaces the strided convolution or pooling layers of the traditional backbone network. It is combined with a channel attention mechanism so that the network focuses on channels with large weight values and better extracts feature information from low-resolution images. Next, an improved feature pyramid network, the lightweight bidirectional feature pyramid network (L-BiFPN), is proposed to efficiently fuse features of different scales. In addition, an intersection over union loss based on the minimum point distance (MPDIoU) is introduced for bounding box regression, yielding faster convergence and more accurate regression results. Experimental results on the FLIR dataset show that the improved algorithm accurately detects infrared road targets in real time, with gains of 3% and 2.2% in mean average precision at 50% IoU (mAP50) and at 50%-95% IoU (mAP50-95), respectively, and reductions of 38.1%, 37.3%, and 16.9% in the number of model parameters, the model weight, and floating-point operations (FLOPs), respectively. To further demonstrate its detection capability, the improved algorithm is also tested on the public PASCAL VOC dataset, and the results show that F-YOLOv8 has excellent generalized detection performance.
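The abstract above names the MPDIoU loss without giving its formula. The following is a minimal sketch, assuming the commonly cited formulation in which the IoU is penalized by the squared distances between the two boxes' top-left corners and bottom-right corners, each normalized by the squared image diagonal. The function names and the `(x1, y1, x2, y2)` box layout are illustrative choices, not taken from the paper.

```python
def mpdiou(box_a, box_b, img_w, img_h):
    """MPDIoU between two boxes given as (x1, y1, x2, y2), x1 < x2, y1 < y2.

    Assumed formulation: IoU - d1^2 / (w^2 + h^2) - d2^2 / (w^2 + h^2),
    where d1 and d2 are the distances between the top-left and the
    bottom-right corners, and w, h are the image width and height.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Plain IoU from the overlap rectangle.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0

    # Squared corner distances, normalized by the squared image diagonal.
    d1_sq = (ax1 - bx1) ** 2 + (ay1 - by1) ** 2
    d2_sq = (ax2 - bx2) ** 2 + (ay2 - by2) ** 2
    norm = img_w ** 2 + img_h ** 2
    return iou - d1_sq / norm - d2_sq / norm


def mpdiou_loss(box_a, box_b, img_w, img_h):
    """Loss form: 1 - MPDIoU (0 for identical boxes)."""
    return 1.0 - mpdiou(box_a, box_b, img_w, img_h)
```

Unlike plain IoU, this quantity keeps changing when the boxes do not overlap, because the corner-distance terms still vary with how far apart the boxes are.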
Abstract: Salient object detection (SOD) has achieved considerable progress. However, even well-performing methods still suffer from inadequate detection accuracy, for example missed and false detections. Effectively optimizing features to capture key information and better integrating different levels of features to enhance their complementarity are two significant challenges in the domain of SOD. In response to these challenges, this study proposes a novel SOD method based on multi-strategy feature optimization. We propose the multi-size feature extraction module (MSFEM), which uses the attention mechanism, multi-level feature fusion, and the residual block to obtain finer features. This module provides robust support for the subsequent accurate detection of the salient object. In addition, we use two rounds of feature fusion and a feedback mechanism to optimize the features obtained by the MSFEM to improve detection accuracy. The first round of feature fusion integrates the features extracted by the MSFEM to obtain more refined features. Subsequently, the feedback mechanism and the second round of feature fusion are applied to refine the features, thereby providing a stronger foundation for accurately detecting salient objects. To improve the fusion effect, we propose the feature enhancement module (FEM) and the feature optimization module (FOM). The FEM integrates the upper and lower features with the optimized features obtained by the FOM to enhance feature complementarity. The FOM uses different receptive fields, the attention mechanism, and the residual block to more effectively capture key information. Experimental results demonstrate that our method outperforms 10 state-of-the-art SOD methods.
Abstract: Temperature forecasts from February to December 2022 were taken from the intelligent grids of the Central Meteorological Observatory and the meteorological bureau of the autonomous region, and from the numerical forecast model of the European Center (EC model). Temperature was predicted from the national intelligent grid forecast, the regional bureau's intelligent grid forecast, the EC model, and other data. Based on a grid-point forecast synthesis algorithm that uses the highest accuracy rate over the most recent three days, temperature grid-point correction was carried out in two forms, stations and grids. To reduce the deviation caused by seasonal systematic temperature differences, a temperature prediction model was established using the rolling forecast errors over 5, 10, 15, 20, 25, and 30 d as the basis data. Verification and evaluation of the objective correction results show that the accuracy rates of the temperature forecasts from the regional bureau's intelligent grid, the national intelligent grid, and the EC model could be increased by 10%, 8%, and 12%, respectively.
Funding: Supported by the National Natural Science Foundation of China under grant number 62066016; the Natural Science Foundation of Hunan Province of China under grant number 2024JJ7395; the Scientific Research Project of Education Department of Hunan Province of China under grant number 22B0549; the International and Regional Science and Technology Cooperation and Exchange Program of the Hunan Association for Science and Technology under grant number 025SKX-KJ-04; and the Hunan Province Undergraduate Innovation and Entrepreneurship Training Program (grant number S202410531015).
Abstract: The YOLO (You Only Look Once) series, a leading single-stage object detection framework, has gained significant prominence in medical image analysis due to its real-time efficiency and robust performance. Recent iterations of YOLO have further enhanced its accuracy and reliability in critical clinical tasks such as tumor detection, lesion segmentation, and microscopic image analysis, thereby accelerating the development of clinical decision support systems. This paper systematically reviews advances in YOLO-based medical object detection from 2018 to 2024. It compares YOLO's performance with other models (e.g., Faster R-CNN, RetinaNet) in medical contexts, summarizes standard evaluation metrics (e.g., mean Average Precision (mAP), sensitivity), and analyzes hardware deployment strategies using public datasets such as LUNA16, BraTS, and CheXpert. The review highlights the impressive performance of YOLO models, particularly from YOLOv5 to YOLOv8, in achieving high precision (up to 99.17%), sensitivity (up to 97.5%), and mAP exceeding 95% in tasks such as lung nodule, breast cancer, and polyp detection. These results demonstrate the significant potential of YOLO models for early disease detection and real-time clinical applications, indicating their ability to enhance clinical workflows. However, the study also identifies key challenges, including high small-object miss rates, limited generalization in low-contrast images, scarcity of annotated data, and model interpretability issues. Finally, potential future research directions are proposed to address these challenges and further advance the application of YOLO models in healthcare.
Abstract: Object detection plays a critical role in drone imagery analysis, especially in remote sensing applications where accurate and efficient detection of small objects is essential. Despite significant advancements in drone imagery detection, most models still struggle with small object detection due to challenges such as object size and complex backgrounds. To address these issues, we propose a robust detection model based on You Only Look Once (YOLO) that balances accuracy and efficiency. The model contains several major innovations: a feature selection pyramid network, the Inner-Shape Intersection over Union (ISIoU) loss function, and a small object detection head. To overcome the limitations of traditional fusion methods in handling multi-level features, we introduce a Feature Selection Pyramid Network integrated into the Neck component, which preserves shallow feature details critical for detecting small objects. Additionally, recognizing that deep network structures often neglect or degrade small object features, we design a specialized small object detection head in the shallow layers to enhance detection accuracy for these challenging targets. To effectively model both local and global dependencies, we introduce a Conv-Former module that simulates Transformer mechanisms using a convolutional structure, thereby improving feature enhancement. Furthermore, we employ ISIoU to address object imbalance and scale variation. This approach accelerates model convergence and improves regression accuracy. Experimental results show that, compared to the baseline model, the proposed method significantly improves small object detection performance on the VisDrone2019 dataset, with mAP@50 increasing by 4.9% and mAP@50-95 rising by 6.7%. This model also outperforms other state-of-the-art algorithms, demonstrating its reliability and effectiveness in both small object detection and remote sensing image fusion tasks.
Funding: Supported by the National Key Research and Development Program of China (No. 2021ZD0112400).
Abstract: Unlike traditional video cameras, event cameras capture asynchronous event streams in which each event encodes the pixel location, the trigger timestamp, and the polarity of the brightness change. In this paper, we introduce a novel hypergraph-based framework for moving object classification. Specifically, we capture moving objects with an event camera to perceive and collect asynchronous event streams at high temporal resolution. Rather than stacking events into frames, we encode the asynchronous event data into a hypergraph, fully mining the high-order correlations of the event data, and design a mixed convolutional hypergraph neural network for training to achieve more efficient and accurate moving target recognition. The experimental results show that our method performs well in moving object classification (e.g., gait identification).
Abstract: The Internet of Things (IoT) integrates diverse devices into the Internet infrastructure, including sensors, meters, and wearable devices. Designing efficient IoT networks with these heterogeneous devices requires the selection of appropriate routing protocols, which is crucial for maintaining high Quality of Service (QoS). The Internet Engineering Task Force’s Routing Over Low Power and Lossy Networks (IETF ROLL) working group developed the IPv6 Routing Protocol for Low Power and Lossy Networks (RPL) to meet these needs. While the initial RPL standard focused on single-metric route selection, ongoing research explores enhancing RPL by incorporating multiple routing metrics and developing new Objective Functions (OFs). This paper introduces a novel Objective Function (OF), the Reliable and Secure Objective Function (RSOF), designed to enhance the reliability and trustworthiness of parent selection at both the node and link levels within IoT and RPL routing protocols. The RSOF employs an adaptive parent node selection mechanism that incorporates multiple metrics, including Residual Energy (RE), Expected Transmission Count (ETX), Extended RPL Node Trustworthiness (ERNT), and a novel metric that measures node failure rate (NFR). In this mechanism, nodes with a high NFR are excluded from the parent selection process to improve network reliability and stability. The proposed RSOF was evaluated using random and grid topologies in the Cooja Simulator, with tests conducted across small, medium, and large-scale networks to examine the impact of varying node densities. The simulation results indicate a significant improvement in network performance, particularly in terms of average latency, packet acknowledgment ratio (PAR), packet delivery ratio (PDR), and Control Message Overhead (CMO), compared to the standard Minimum Rank with Hysteresis Objective Function (MRHOF).
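The parent selection mechanism described above (exclude candidates with a high NFR, then rank the rest on several metrics) can be illustrated with a short sketch. Everything below is hypothetical: the abstract does not state how RSOF combines RE, ETX, and ERNT, so the `Candidate` type, the `nfr_max` threshold, the equal weights, and the linear scoring rule are placeholders chosen only to show the shape of a multi-metric objective function.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class Candidate:
    """Hypothetical view of a candidate parent node."""
    node_id: str
    residual_energy: float  # RE, normalized to [0, 1]; higher is better
    etx: float              # ETX, >= 1; lower is better
    trust: float            # ERNT, normalized to [0, 1]; higher is better
    failure_rate: float     # NFR, in [0, 1]; lower is better


def select_parent(
    candidates: Sequence[Candidate],
    nfr_max: float = 0.3,  # assumed exclusion threshold, not from the paper
    weights: Tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3),  # assumed
) -> Optional[Candidate]:
    """Drop candidates whose NFR exceeds nfr_max, then pick the best score."""
    eligible = [c for c in candidates if c.failure_rate <= nfr_max]
    if not eligible:
        return None

    w_re, w_etx, w_trust = weights

    def score(c: Candidate) -> float:
        # 1 / ETX maps the link metric into (0, 1], so higher is better
        # for every term in the weighted sum.
        return w_re * c.residual_energy + w_etx * (1.0 / c.etx) + w_trust * c.trust

    return max(eligible, key=score)
```

In this sketch a node with excellent energy and link quality is still rejected if its failure rate is above the threshold, which mirrors the exclusion step the abstract describes.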
Abstract: The Intelligent Transportation System (ITS), as a vital means to alleviate traffic congestion and reduce traffic accidents, demonstrates immense potential in improving traffic safety and efficiency through the integration of Internet of Things (IoT) technologies. The enhancement of its performance largely depends on breakthrough advancements in object detection technology. However, current object detection technology still faces numerous challenges, such as accuracy, robustness, and data privacy issues. These challenges are particularly critical in the application of ITS and require in-depth analysis and exploration of future improvement directions. This study provides a comprehensive review of the development of object detection technology and analyzes its specific applications in ITS, aiming to thoroughly explore the use and advancement of object detection technologies in IoT-based intelligent transportation systems. To achieve this objective, we adopted the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) approach to search, screen, and assess the eligibility of relevant literature, ultimately including 88 studies. Through an analysis of these studies, we summarized the characteristics, advantages, and limitations of object detection technology across the traditional methods stage and the deep learning-based methods stage. Additionally, we examined its applications in ITS from three perspectives: vehicle detection, pedestrian detection, and traffic sign detection. We also identified the major challenges currently faced by these technologies and proposed future directions for addressing these issues. This review offers researchers a comprehensive perspective, identifying potential improvement directions for object detection technology in ITS, including accuracy, robustness, real-time performance, data annotation cost, and data privacy. In doing so, it provides significant guidance for the further development of IoT-based intelligent transportation systems.
Abstract: To address the challenges of low accuracy and insufficient real-time performance in dynamic object detection for UAV surveillance, this paper proposes a novel tracking framework that integrates a lightweight improved YOLOv5s model with adaptive motion compensation. A UAV-view dynamic feature enhancement strategy is introduced, and a lightweight detection network combining attention mechanisms and multi-scale fusion is constructed. The robustness of tracking under motion blur scenarios is also optimized. Experimental results demonstrate that the proposed method achieves an mAP@0.5 of 68.2% on the VisDrone dataset and reaches an inference speed of 32 FPS on the NVIDIA Jetson TX2 platform. This significantly improves the balance between accuracy and efficiency in complex scenes, offering reliable technical support for real-time applications such as emergency response.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62101275 and 62101274).
Abstract: UAV-based object detection is rapidly expanding in both civilian and military applications, including security surveillance, disaster assessment, and border patrol. However, challenges such as small objects, occlusions, complex backgrounds, and variable lighting persist due to the unique perspective of UAV imagery. To address these issues, this paper introduces DAFPN-YOLO, an innovative model based on YOLOv8s (You Only Look Once version 8s). The model strikes a balance between detection accuracy and speed while reducing parameters, making it well-suited for multi-object detection tasks from drone perspectives. A key feature of DAFPN-YOLO is the enhanced Drone-AFPN (Adaptive Feature Pyramid Network), which adaptively fuses multi-scale features to optimize feature extraction and enhance spatial and small-object information. To fully leverage Drone-AFPN's multi-scale capabilities, a dedicated 160×160 small-object detection head was added, significantly boosting detection accuracy for small targets. In the backbone, the C2f_Dual (Cross Stage Partial with Cross-Stage Feature Fusion Dual) module and the SPPELAN (Spatial Pyramid Pooling with Enhanced Local Attention Network) module were integrated. These components improve feature extraction and information aggregation while reducing parameters and computational complexity, enhancing inference efficiency. Additionally, Shape-IoU (Shape Intersection over Union) is used as the loss function for bounding box regression, enabling more precise shape-based object matching. Experimental results on the VisDrone 2019 dataset demonstrate the effectiveness of DAFPN-YOLO. Compared to YOLOv8s, the proposed model achieves a 5.4 percentage point increase in mAP@0.5, a 3.8 percentage point improvement in mAP@0.5:0.95, and a 17.2% reduction in parameter count. These results highlight DAFPN-YOLO's advantages in UAV-based object detection, offering valuable insights for applying deep learning to UAV-specific multi-object detection tasks.
Funding: Supported by the National Natural Science Foundation of China under Grant 62306128, the Basic Science Research Project of Jiangsu Provincial Department of Education under Grant 23KJD520003, and the Leading Innovation Project of Changzhou Science and Technology Bureau under Grant CQ20230072.
Abstract: Real-time detection of surface defects on cables is crucial for ensuring the safe operation of power systems. However, existing methods struggle with small target sizes, complex backgrounds, low-quality image acquisition, and interference from contamination. To address these challenges, this paper proposes the Real-time Cable Defect Detection Network (RC2DNet), which achieves an optimal balance between detection accuracy and computational efficiency. Unlike conventional approaches, RC2DNet introduces a small object feature extraction module that enhances the semantic representation of small targets through feature pyramids, multi-level feature fusion, and an adaptive weighting mechanism. Additionally, a boundary feature enhancement module is designed, incorporating boundary-aware convolution, a novel boundary attention mechanism, and an improved loss function to significantly enhance boundary localization accuracy. Experimental results demonstrate that RC2DNet outperforms state-of-the-art methods in precision, recall, F1-score, mean Intersection over Union (mIoU), and frame rate, enabling real-time and highly accurate cable defect detection in complex backgrounds.
Funding: Supported by the Key Research and Development Program of Shaanxi Province (No. 2024 GX-YBXM-178) and the Shaanxi Province Qinchuangyuan “Scientists+Engineers” Team Development (No. 2022KXJ032).
Abstract: With the rapid development of flexible electronics, tactile systems for object recognition are becoming increasingly delicate. This paper presents the design of a tactile glove for object recognition, integrating 243 palm pressure units and 126 finger joint strain units implemented with piezoresistive Velostat film. The palm pressure and joint bending strain data from the glove were collected using a two-dimensional resistance array scanning circuit and converted into tactile images with a resolution of 32×32. To verify the effect of tactile data type on recognition precision, three datasets of tactile images were built from palm pressure data, joint bending strain data, and tactile data combining both palm pressure and joint bending strain. An improved residual convolutional neural network (CNN) model, SP-ResNet, was developed by light-weighting ResNet-18 to classify these tactile images. Experimental results show that the data collection method combining palm pressure and joint bending strain yields a 4.33% improvement in recognition precision compared to the best results obtained using only palm pressure or only joint bending strain. The presented tactile glove with SP-ResNet achieves a recognition precision of 95.50% for 16 objects at a lower computation cost. The presented tactile system can serve as a sensing platform for intelligent prosthetics and robot grippers.
Abstract: Modern manufacturing processes have become more reliant on automation because of the accelerated transition from Industry 3.0 to Industry 4.0. Manual inspection of products on assembly lines remains inefficient, error-prone, and inconsistent, emphasizing the need for a reliable and automated inspection system. Leveraging both object detection and image segmentation approaches, this research proposes a vision-based solution for detecting various kinds of tools in a toolkit using deep learning (DL) models. Two Intel RealSense D455f depth cameras were arranged in a top-down configuration to capture both RGB and depth images of the toolkits. After applying multiple constraints and enhancing the images through preprocessing and augmentation, a dataset of 3300 annotated RGB-D photos was generated. Several DL models were selected through a comprehensive assessment of mean Average Precision (mAP), precision-recall equilibrium, inference latency (target ≥30 FPS), and computational burden, resulting in a preference for YOLO and Region-based Convolutional Neural Network (R-CNN) variants over ViT-based models due to the latter's increased latency and resource requirements. YOLOv5, YOLOv8, YOLOv11, Faster R-CNN, and Mask R-CNN were trained on the annotated dataset and evaluated using key performance metrics (Recall, Accuracy, F1-score, and Precision). YOLOv11 demonstrated balanced excellence, with 93.0% precision, 89.9% recall, and a 90.6% F1-score in object detection, as well as 96.9% precision, 95.3% recall, and a 96.5% F1-score in instance segmentation, at an average inference time of 25 ms per frame (≈40 FPS), demonstrating real-time performance. Leveraging these results, a YOLOv11-based Windows application was successfully deployed in a real-time assembly line environment, where it accurately processed live video streams to detect and segment tools within toolkits, demonstrating its practical effectiveness in industrial automation. In addition to detection and segmentation, the application can precisely measure socket dimensions by applying edge detection techniques to YOLOv11 segmentation masks. This enables specification-level quality control directly on the assembly line and improves real-time inspection capability. The implementation is a significant step forward for intelligent manufacturing in the Industry 4.0 paradigm, providing a scalable, efficient, and accurate approach to automated inspection and dimensional verification.
Funding: Co-supported by the National Natural Science Foundation of China (Nos. 52405293, 52375237), the China Postdoctoral Science Foundation (No. 2024M754219), and the Shaanxi Province Postdoctoral Research Project Funding, China.
Abstract: To accomplish reliability analyses that account for the correlation of multi-analytical objectives, an innovative framework of Dimensional Synchronous Modeling (DSM) and correlation analysis is developed based on the stepwise modeling strategy, the cell array operation principle, and Copula theory. Under this framework, we propose a DSM-based Enhanced Kriging (DSMEK) algorithm to synchronously model multiple objectives, and explore an adaptive Copula function approach to analyze the correlation among multiple objectives and to assess the synthetical reliability level. In the proposed DSMEK and adaptive Copula methods, the Kriging model is treated as the basis function of the DSMEK model, the Multi-Objective Snake Optimizer (MOSO) algorithm is used to search for the optimal hyperparameter values of the basis functions, the cell array operation principle is adopted to establish a whole model of multiple objectives, the goodness of fit is used to determine the forms of the Copula functions, and the determined Copula functions are employed to perform the reliability analyses of the correlation of multi-analytical objectives. Furthermore, three examples, namely multi-objective complex function approximation, aeroengine turbine bladed disc multi-failure mode reliability analyses, and aircraft landing gear system brake temperature reliability analyses, are performed to verify the effectiveness of the proposed methods from the viewpoints of mathematics and engineering. The results show that the DSMEK and adaptive Copula approaches hold obvious advantages in terms of modeling features and simulation performance. This work provides a useful way to model multi-analytical objectives and to perform synthetical reliability analyses of complex structures and systems with multi-output responses.
Funding: Supported by the Tianjin Postgraduate Research Innovation Project (No. 2022SKY286) and the National Science and the National Key Research and Development Program (No. 2022YFF0706000).
Abstract: To address the low accuracy of surface defect detection for industrial products, an object detection method is proposed that improves the You Only Look Once version 5s (YOLOv5s) model with simplified spatial pyramid pooling fast (SimSPPF) hybrid pooling. The algorithm introduces a channel attention (CA) module, a simplified SPPF feature vector pyramid, and the efficient intersection over union (EIoU) loss function. The feature vector pyramid fuses high-dimensional and low-dimensional features, which enriches the semantic information. The CA mechanism performs maximum pooling and average pooling operations on the feature map. Hybrid pooling comprehensively improves detection computing efficiency and accurate deployment ability. The results show that the improved YOLOv5s model outperforms the original YOLOv5s model. The mean average precision (mAP) reaches 91.8%, an increase of 17.4%, and the detection speed reaches 108 FPS, an increase of 18 FPS. The improved model is practicable, and its overall performance is better than that of other conventional models.
Funding: This research was funded by the Natural Science Foundation of Hebei Province (F2021506004).
Abstract: Transformer-based models have facilitated significant advances in object detection. However, their extensive computational consumption and suboptimal detection of dense small objects curtail their applicability in unmanned aerial vehicle (UAV) imagery. Addressing these limitations, we propose a hybrid transformer-based detector, H-DETR, and enhance it for dense small objects, leading to an accurate and efficient model. Firstly, we introduce a hybrid transformer encoder, which integrates a convolutional neural network-based cross-scale fusion module with the original encoder to handle multi-scale feature sequences more efficiently. Furthermore, we propose two novel strategies to enhance detection performance without incurring additional inference computation. The query filter is designed to cope with the dense clustering inherent in drone-captured images by counteracting similar queries with a training-aware non-maximum suppression. Adversarial denoising learning is a novel enhancement method inspired by adversarial learning, which improves the detection of numerous small targets by counteracting the effects of artificial spatial and semantic noise. Extensive experiments on the VisDrone and UAVDT datasets substantiate the effectiveness of our approach, achieving a significant improvement in accuracy with a reduction in computational complexity. Our method achieves 31.9% and 21.1% AP on the VisDrone and UAVDT datasets, respectively, and has a faster inference speed, making it a competitive model in UAV image object detection.
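The abstract does not describe how the training-aware non-maximum suppression in the query filter works. As background only, the sketch below shows standard greedy non-maximum suppression on axis-aligned boxes, the classical procedure for discarding near-duplicate predictions. It is not the H-DETR query filter, and the function names and the 0.5 threshold are illustrative.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_thresh=0.5):
    """Return indices of kept boxes, highest score first.

    A box is kept only if its IoU with every already-kept box is at
    most iou_thresh; otherwise it is treated as a duplicate.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(box_iou(boxes[i], boxes[k]) <= iou_thresh for k in keep):
            keep.append(i)
    return keep
```

In densely clustered scenes a fixed threshold can also suppress genuinely distinct neighboring objects, which is one reason dense small-object detectors revisit this step.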
Funding: Supported in part by the National Natural Science Foundation of China under Grant 62006071, and in part by the Science and Technology Research Project of Henan Province under Grant 232103810086.
Abstract: In recent years, there has been extensive research on object detection methods applied to optical remote sensing images utilizing convolutional neural networks. Despite these efforts, the detection of small objects in remote sensing remains a formidable challenge. The deep network structure causes a loss of object features, nearly eliminating some subtle features associated with small objects in deep layers. Additionally, the features of small objects are susceptible to interference from background features contained within the image, leading to a decline in detection accuracy. Moreover, the sensitivity of small objects to bounding box perturbation further increases the detection difficulty. In this paper, we introduce a novel approach, Cross-Layer Fusion and Weighted Receptive Field-based YOLO (CAW-YOLO), specifically designed for small object detection in remote sensing. To address feature loss in deep layers, we have devised a cross-layer attention fusion module. Background noise is effectively filtered through the incorporation of Bi-Level Routing Attention (BRA). To enhance the model’s capacity to perceive multi-scale objects, particularly small-scale objects, we introduce a weighted multi-receptive field atrous spatial pyramid pooling module. Furthermore, we mitigate the sensitivity arising from bounding box perturbation by incorporating the joint Normalized Wasserstein Distance (NWD) and Efficient Intersection over Union (EIoU) losses. The efficacy of the proposed model in detecting small objects in remote sensing has been validated through experiments conducted on three publicly available datasets. The experimental results demonstrate the model’s pronounced advantages in small object detection for remote sensing, surpassing the performance of current mainstream models.
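To make the Normalized Wasserstein Distance mentioned above concrete, here is a minimal sketch assuming its commonly cited definition: each box is modeled as a 2D Gaussian, the squared 2-Wasserstein distance between two such Gaussians reduces to a squared Euclidean distance between `(cx, cy, w/2, h/2)` vectors, and the result is mapped into (0, 1] with an exponential. The constant `c` is dataset-dependent and its default here is a placeholder. How CAW-YOLO weights NWD against EIoU is not stated in the abstract and is not modeled.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein Distance between boxes given as (cx, cy, w, h).

    Assumed formulation: exp(-sqrt(W2^2) / c), with
    W2^2 = ||(cx_a, cy_a, w_a/2, h_a/2) - (cx_b, cy_b, w_b/2, h_b/2)||^2.
    c is a dataset-dependent normalizer; 12.8 is only a placeholder.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    w2_sq = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2) ** 2
        + ((ha - hb) / 2) ** 2
    )
    return math.exp(-math.sqrt(w2_sq) / c)


def nwd_loss(box_a, box_b, c=12.8):
    """Loss form: 1 - NWD (0 for identical boxes)."""
    return 1.0 - nwd(box_a, box_b, c)
```

For two non-overlapping 4×4 boxes whose centers are a few pixels apart, IoU is exactly zero, whereas this similarity still decreases smoothly with the offset. That smoothness is the property that reduces sensitivity to small bounding box perturbations.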