585 articles found
Global-local feature optimization based RGB-IR fusion object detection on drone view (Cited by: 1)
1
Authors: Zhaodong CHEN, Hongbing JI, Yongquan ZHANG. Chinese Journal of Aeronautics, 2026, No. 1, pp. 436-453 (18 pages)
Visible and infrared (RGB-IR) fusion object detection plays an important role in security, disaster relief, and similar applications. In recent years, deep-learning-based RGB-IR fusion detection methods have developed rapidly, but they still struggle with the complex and changing scenarios captured by drones, mainly for two reasons: (A) RGB-IR fusion detectors are susceptible to inferior inputs that degrade performance and stability; (B) RGB-IR fusion detectors are susceptible to redundant features that reduce accuracy and efficiency. In this paper, an innovative RGB-IR fusion detection framework based on global-local feature optimization, named GLFDet, is proposed to improve the detection performance and efficiency for drone-captured objects. The key components of GLFDet include a Global Feature Optimization (GFO) module, a Local Feature Optimization (LFO) module, and a Channel Separation Fusion (CSF) module. Specifically, GFO calculates the information content of the input image from the frequency domain and optimizes the features holistically. Then, LFO dynamically selects high-value features and filters out low-value features before fusion, which significantly improves the efficiency of fusion. Finally, CSF fuses the RGB and IR features across the corresponding channels, which avoids rearranging the channel relationships and enhances model stability. Extensive experimental results show that the proposed method achieves the best performance on three popular RGB-IR datasets: DroneVehicle, VEDAI, and LLVIP. In addition, GLFDet is more lightweight than other comparable models, making it more appealing for edge devices such as drones. The code is available at https://github.com/lao chen330/GLFDet.
Keywords: Object detection; Deep learning; RGB-IR fusion; Drones; Global feature; Local feature
YOLO-DS:a detection model for desert shrub identification and coverage estimation in UAV remote sensing
2
Authors: Weifan Xu, Huifang Zhang, Yan Zhang, Kangshuo Liu, Jinglu Zhang, Yali Zhu, Baoerhan Dilixiati, Jifeng Ning, Jian Gao. Journal of Forestry Research, 2026, No. 1, pp. 242-255 (14 pages)
Desert shrubs are indispensable in maintaining ecological stability by reducing soil erosion, enhancing water retention, and boosting soil fertility, which are critical factors in mitigating desertification processes. To cope with the complex topography, variable climate, and difficulty of field surveys in desert regions, this paper proposes YOLO-Desert-Shrub (YOLO-DS), a detection method for identifying desert shrubs in UAV remote sensing images based on an enhanced YOLOv8n framework. The method accurately identifies shrub species, locations, and coverage. To address the issue of small individual plants dominating the dataset, the SPDconv convolution module is introduced in the Backbone and Neck layers of the YOLOv8n model, replacing conventional convolutions. This structural optimization mitigates information degradation in fine-grained data while strengthening discriminative feature capture across spatial scales within desert shrub datasets. Furthermore, a structured state-space model is integrated into the main network, and the MambaLayer is designed to dynamically extract and refine shrub-specific features from remote sensing images, effectively filtering out background noise and irrelevant interference to enhance feature representation. Benchmark evaluations show that the YOLO-DS framework attains 79.56% mAP40weight, a 2.2% absolute gain over the baseline YOLOv8n architecture, with statistically significant advantages over contemporary detectors in cross-validation trials. The predicted plant coverage exhibits strong consistency with manually measured coverage, with a coefficient of determination (R²) of 0.9148 and a Root Mean Square Error (RMSE) of 1.8266%. The proposed UAV-based remote sensing method built on YOLO-DS effectively identifies and locates desert shrubs, monitors canopy sizes and distribution, and provides technical support for automated desert shrub monitoring.
Keywords: Desert shrubs; Deep learning; Object detection; UAV remote sensing; YOLOv8; Mamba
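The agreement figures quoted in this abstract (R² and RMSE between predicted and manually measured coverage) can be illustrated with a minimal sketch. It assumes R² is computed as 1 - SS_res / SS_tot directly on the paired values, which may differ from the paper's procedure (for example, a regression fit). The function name and the sample values in the usage note are hypothetical.

```python
import math


def r2_and_rmse(measured, predicted):
    """Return (R^2, RMSE) for paired coverage values given in percent.

    Assumption: R^2 = 1 - SS_res / SS_tot on the raw pairs; the paper
    may define it differently.
    """
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(measured)
    mean_measured = sum(measured) / n
    # Residual sum of squares between measured and predicted coverage.
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    # Total sum of squares of the measured coverage around its mean.
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    if ss_tot == 0:
        raise ValueError("measured values must not all be identical")
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    return r2, rmse
```

Calling `r2_and_rmse([10, 20, 30, 40], [12, 18, 33, 39])` on these made-up values gives an R² of 0.964 and an RMSE of about 2.12 percentage points.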
A Dual-Detection Method for Cashew Ripeness and Anthrax Based on YOLOv11-NSDDil
3
Authors: Ran Liu, Yawen Chen, Dong Yang, Jingjing Yang. Computers, Materials & Continua, 2026, No. 2, pp. 1919-1941 (23 pages)
In the field of smart agriculture, accurate and efficient object detection technology is crucial for automated crop management. A particularly challenging task in this domain is small object detection, such as the identification of immature fruits or early-stage disease spots. These objects pose significant difficulties due to their small pixel coverage, limited feature information, substantial scale variations, and high susceptibility to complex background interference. These challenges frequently result in inadequate accuracy and robustness in current detection models. This study addresses two critical needs in the cashew cultivation industry, fruit maturity and anthracnose detection, by proposing an improved YOLOv11-NSDDil model. The method introduces three key technological innovations: (1) an SDDil module is designed and integrated into the backbone network. This module combines depthwise separable convolution with the SimAM attention mechanism to expand the receptive field and enhance contextual semantic capture at a low computational cost, effectively alleviating the feature deficiency problem caused by the limited pixel coverage of small objects. Simultaneously, the SD module dynamically enhances discriminative features and suppresses background noise, significantly improving the model's feature discrimination capability in complex environments; (2) a DynamicScalSeq-Zoom_cat neck network is introduced, significantly improving multi-scale feature fusion; and (3) the Minimum Point Distance Intersection over Union (MPDIoU) loss function is optimized, which enhances bounding box localization accuracy by minimizing vertex distance. Experimental results on a self-constructed cashew dataset containing 1123 images demonstrate significant performance improvements in the enhanced model: mAP50 reaches 0.825, a 4.6% increase compared to the original YOLOv11; mAP50-95 improves to 0.624, a 6.5% increase; and recall rises to 0.777, a 2.4% increase. This provides a reliable technical solution for intelligent quality inspection of agricultural products and holds broad application prospects.
Keywords: Deep learning; object detection; multi-scale fusion; small object detection; miss detection; false detection
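The "vertex distance" idea behind MPDIoU mentioned in this abstract can be sketched as follows. This is a minimal illustration of the commonly cited MPDIoU formulation (IoU minus the squared top-left and bottom-right corner distances, each normalized by the squared image diagonal), not the authors' optimized variant, whose details the abstract does not give. The function names and the `(x1, y1, x2, y2)` box layout are assumptions.

```python
def mpdiou(box_a, box_b, img_w, img_h):
    """MPDIoU between two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Plain IoU from the intersection and union areas.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distances between matching corners (top-left, bottom-right).
    d1_sq = (ax1 - bx1) ** 2 + (ay1 - by1) ** 2
    d2_sq = (ax2 - bx2) ** 2 + (ay2 - by2) ** 2

    # Normalize by the squared diagonal of the input image.
    norm = img_w ** 2 + img_h ** 2
    return iou - d1_sq / norm - d2_sq / norm


def mpdiou_loss(box_a, box_b, img_w, img_h):
    """Loss that is 0 for identical boxes and grows as corners drift apart."""
    return 1.0 - mpdiou(box_a, box_b, img_w, img_h)
```

Unlike plain IoU, the corner-distance terms still vary when two boxes do not overlap at all, which is one reason such losses are often used for small objects.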
YOLOv10-HQGNN:A Hybrid Quantum Graph Learning Framework for Real-Time Faulty Insulator Detection
4
Authors: Nghia Dinh, Vinh Truong Hoang, Viet-Tuan Le, Kiet Tran-Trung, Ha Duong Thi Hong, Bay Nguyen Van, Hau Nguyen Trung, Thien Ho Huong, Kittikhun Meethongjan. Computers, Materials & Continua, 2026, No. 3, pp. 1747-1769 (23 pages)
Ensuring the reliability of power transmission networks depends heavily on the early detection of faults in key components such as insulators, which serve both mechanical and electrical functions. Even a single defective insulator can lead to equipment breakdown, costly service interruptions, and increased maintenance demands. While unmanned aerial vehicles (UAVs) enable rapid and cost-effective collection of high-resolution imagery, accurate defect identification remains challenging due to cluttered backgrounds, variable lighting, and the diverse appearance of faults. To address these issues, we introduce a real-time inspection framework that integrates an enhanced YOLOv10 detector with a Hybrid Quantum-Enhanced Graph Neural Network (HQGNN). The YOLOv10 module, fine-tuned on domain-specific UAV datasets, improves detection precision, while the HQGNN ensures multi-object tracking and temporal consistency across video frames. This synergy enables reliable and efficient identification of faulty insulators under complex environmental conditions. Experimental results show that the proposed YOLOv10-HQGNN model surpasses existing methods across all metrics, achieving a Recall of 0.85 and an Average Precision (AP) of 0.83, with clear gains in both accuracy and throughput. These advancements support automated, proactive maintenance strategies that minimize downtime and contribute to a safer, smarter energy infrastructure.
Keywords: Object detection; GNN; QGNN; HQGNN; quantum; YOLO; power quality
A generation-based defect detection system for rail transit infrastructure
5
Authors: Xinyu Zheng, Lingfeng Zhang, Yuhao Luo, Tiange Wang. High-Speed Railway, 2026, No. 1, pp. 1-9 (9 pages)
The use of Unmanned Aerial Vehicles (UAVs) for defect detection on railway slopes is becoming increasingly widespread due to their ability to capture high-resolution images over large, inaccessible, and topographically complex areas. However, current UAV-based detection methods face several critical limitations, including constrained deployment frequency, limited availability of annotated defect data, and the lack of mature risk assessment frameworks. To address these challenges, this study introduces a novel approach that integrates diffusion models with Large Language Models (LLMs) to generate high-quality synthetic defect images tailored to railway slope scenarios. Furthermore, an improved transformer-based architecture is proposed, incorporating attention mechanisms and LLM-guided diffusion-generated imagery to enhance defect recognition performance under complex environmental conditions. Experimental evaluations conducted on a dataset of 300 field-collected images from high-risk railway slopes demonstrate that the proposed method significantly outperforms existing baselines in terms of precision, recall, and robustness, indicating strong applicability for real-world railway infrastructure monitoring and disaster prevention.
Keywords: Railway; Large language models; Computer vision; Object detection
Superpixel-Aware Transformer with Attention-Guided Boundary Refinement for Salient Object Detection
6
Authors: Burhan Baraklı, Can Yüzkollar, Tugrul Taşçı, Ibrahim Yıldırım. Computer Modeling in Engineering & Sciences, 2026, No. 1, pp. 1092-1129 (38 pages)
Salient object detection (SOD) models struggle to simultaneously preserve global structure, maintain sharp object boundaries, and sustain computational efficiency in complex scenes. In this study, we propose SPSALNet, a task-driven two-stage (macro-micro) architecture that restructures the SOD process around superpixel representations. In the proposed approach, a “split-and-enhance” principle, introduced to our knowledge for the first time in the SOD literature, hierarchically classifies superpixels and then applies targeted refinement only to ambiguous or error-prone regions. At the macro stage, the image is partitioned into content-adaptive superpixel regions, and each superpixel is represented by a high-dimensional region-level feature vector. These representations define a regional decomposition problem in which superpixels are assigned to three classes: background, object interior, and transition regions. Superpixel tokens interact with a global feature vector from a deep network backbone through a cross-attention module and are projected into an enriched embedding space that jointly encodes local topology and global context. At the micro stage, the model employs a U-Net-based refinement process that allocates computational resources only to ambiguous transition regions. The image and the distance-similarity maps derived from superpixels are processed through a dual-encoder pathway. Subsequently, channel-aware fusion blocks adaptively combine information from these two sources, producing sharper and more stable object boundaries. Experimental results show that SPSALNet achieves high accuracy with lower computational cost compared to recent competing methods. On the PASCAL-S and DUT-OMRON datasets, SPSALNet exhibits a clear performance advantage across all key metrics, and it ranks first on accuracy-oriented measures on HKU-IS. On the challenging DUT-OMRON benchmark, SPSALNet reaches an MAE of 0.034. Across all datasets, it preserves object boundaries and regional structure in a stable and competitive manner.
Keywords: Salient object detection; superpixel segmentation; transformers; attention mechanism; multi-level fusion; edge-preserving refinement; model-driven
CLF-YOLOv8:Lightweight Multi-Scale Fusion with Focal Geometric Loss for Real-Time Night Maritime Detection
7
Authors: Zhonghao Wang, Xin Liu, Changhua Yue, Haiwen Yuan. Computers, Materials & Continua, 2026, No. 2, pp. 1667-1689 (23 pages)
To address critical challenges in nighttime ship detection, namely a high small-target miss rate (over 20%), insufficient lightweighting, and limited generalization due to scarce, low-quality datasets, this study proposes a systematic solution. First, a high-quality Night-Ships dataset is constructed via CycleGAN-based day-night transfer, combined with a dual-threshold cleaning strategy (Laplacian variance sharpness filtering and brightness-color deviation screening). Second, a Cross-stage Lightweight Fusion-You Only Look Once version 8 (CLF-YOLOv8) is proposed with key improvements: the Neck network is reconstructed by replacing the Cross Stage Partial (CSP) structure with the Cross Stage Partial Multi-Scale Convolutional Block (CSP-MSCB) and integrating a Bidirectional Feature Pyramid Network (BiFPN) for weighted multi-scale fusion to enhance small-target detection; a Lightweight Shared Convolutional and Separated Batch Normalization Detection-Head (LSCSBD-Head) with shared convolutions and layer-wise Batch Normalization (BN) reduces parameters to 1.8 M (42% fewer than YOLOv8n); and the Focal Minimum Point Distance Intersection over Union (Focal-MPDIoU) loss combines Minimum Point Distance Intersection over Union (MPDIoU) geometric constraints and Focal weighting to optimize low-overlap targets. Experiments show that CLF-YOLOv8 achieves 97.6% mAP@0.5 (0.7% higher than YOLOv8n) with 1.8 M parameters, outperforming mainstream models in small-target detection, overlapping target discrimination, and adaptability to complex lighting.
Keywords: Nighttime ship detection; lightweight model; small object detection; BiFPN; LSCSBD-Head; Focal-MPDIoU; YOLOv8
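The dual-threshold cleaning step named in this abstract can be pictured with a small sketch. It shows Laplacian-variance sharpness scoring on a grayscale image stored as a list of rows, followed by a simple keep/discard decision. The second criterion here (a cap on mean brightness) is only a stand-in for the paper's brightness-color deviation screening, which the abstract does not define. Both threshold values and all function names are hypothetical.

```python
def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over interior pixels.

    `gray` is a list of equal-length rows of intensities (at least 3x3).
    Blurry images give a low variance, sharp ones a high variance.
    """
    height, width = len(gray), len(gray[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            lap = (gray[y - 1][x] + gray[y + 1][x]
                   + gray[y][x - 1] + gray[y][x + 1]
                   - 4 * gray[y][x])
            responses.append(lap)
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def keep_image(gray, min_sharpness=100.0, max_brightness=120.0):
    """Keep an image only if it is sharp enough and dark enough.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    pixels = [p for row in gray for p in row]
    mean_brightness = sum(pixels) / len(pixels)
    return (laplacian_variance(gray) >= min_sharpness
            and mean_brightness <= max_brightness)
```

In practice such thresholds would be tuned on the generated night images; a perfectly flat image scores zero sharpness and is rejected, while a high-contrast pattern passes.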
YOLO-SDW: Traffic Sign Detection Algorithm Based on YOLOv8s Skip Connection and Dynamic Convolution
8
Authors: Qing Guo, Juwei Zhang, Bingyi Ren. Computers, Materials & Continua, 2026, No. 1, pp. 1433-1452 (20 pages)
Traffic sign detection is an important part of autonomous driving, and its recognition accuracy and speed are directly related to road traffic safety. Although convolutional neural networks (CNNs) have made certain breakthroughs in this field, in complex scenes with image blur and target occlusion, traffic sign detection continues to exhibit limited accuracy, accompanied by false positives and missed detections. To address these problems, a traffic sign detection algorithm, You Only Look Once-based Skip Dynamic Way (YOLO-SDW), based on You Only Look Once version 8 small (YOLOv8s), is proposed. First, a Skip Connection Reconstruction (SCR) module is introduced to efficiently integrate fine-grained feature information and enhance the detection accuracy of the algorithm in complex scenes. Second, a C2f module based on Dynamic Snake Convolution (C2f-DySnake) is proposed to dynamically adjust the receptive field information, improve the algorithm's feature extraction ability for blurred or occluded targets, and reduce the occurrence of false detections and missed detections. Finally, the Wise Powerful IoU v2 (WPIoUv2) loss function is proposed to further improve the detection accuracy of the algorithm. Experimental results show that the mAP@0.5 of YOLO-SDW on the TT100K dataset is 89.2% and its mAP@0.5:0.95 is 68.5%, which are 4% and 3.3% higher than the YOLOv8s baseline, respectively. YOLO-SDW maintains real-time performance while achieving higher accuracy.
Keywords: Traffic sign detection; YOLOv8; object detection; deep learning
An Unsupervised Online Detection Method for Foreign Objects in Complex Environments
9
Authors: YANG Xiaoyang, YANG Yanzhu, DENG Haiping. Journal of Donghua University (English Edition), 2026, No. 1, pp. 140-151 (12 pages)
In modern industrial production, foreign object detection in complex environments is crucial to ensure product quality and production safety. Detection systems based on deep-learning image processing algorithms often face challenges with handling high-resolution images and achieving accurate detection against complex backgrounds. To address these issues, this study employs the PatchCore unsupervised anomaly detection algorithm combined with data augmentation techniques to enhance the system's generalization capability across varying lighting conditions, viewing angles, and object scales. The proposed method is evaluated in a complex industrial detection scenario involving the bogie of an electric multiple unit (EMU). A dataset covering complex backgrounds, diverse lighting conditions, and multiple viewing angles is constructed to validate the performance of the detection system in real industrial environments. Experimental results show that the proposed model achieves an average area under the receiver operating characteristic curve (AUROC) of 0.92 and an average F1 score of 0.85. Combined with data augmentation, the proposed model improves AUROC by 0.06 and F1 score by 0.03, demonstrating enhanced accuracy and robustness for foreign object detection in complex industrial settings. In addition, the effects of key factors on detection performance are systematically analyzed, providing practical guidance for parameter selection in real industrial applications.
Keywords: foreign object detection; unsupervised learning; data augmentation; complex environment; bogie; dataset
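For readers unfamiliar with the two metrics reported in this abstract, the sketch below shows how AUROC and F1 can be computed from per-image anomaly scores and binary labels (1 = foreign object present). It uses the pairwise-ranking definition of AUROC and a single fixed threshold for F1. The paper's evaluation protocol (averaging, threshold choice) is not described in the abstract, so the function names, the threshold, and the sample data are all assumptions.

```python
def auroc(scores, labels):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, l in zip(scores, labels) if l == 1]
    negatives = [s for s, l in zip(scores, labels) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def f1_at_threshold(scores, labels, threshold):
    """F1 score when samples with score >= threshold are flagged anomalous."""
    tp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 1)
    fp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 0)
    fn = sum(1 for s, l in zip(scores, labels) if s < threshold and l == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

AUROC is threshold-free, whereas F1 depends on the chosen cut-off, which is why anomaly-detection papers usually report both.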
EHDC-YOLO: Enhancing Object Detection for UAV Imagery via Multi-Scale Edge and Detail Capture
10
Authors: Zhiyong Deng, Yanchen Ye, Jiangling Guo. Computers, Materials & Continua, 2026, No. 1, pp. 1665-1682 (18 pages)
With the rapid expansion of drone applications, accurate detection of objects in aerial imagery has become crucial for intelligent transportation, urban management, and emergency rescue missions. However, existing methods face numerous challenges in practical deployment, including scale variation handling, feature degradation, and complex backgrounds. To address these issues, we propose Edge-enhanced and Detail-Capturing You Only Look Once (EHDC-YOLO), a novel framework for object detection in Unmanned Aerial Vehicle (UAV) imagery. Based on the You Only Look Once version 11 nano (YOLOv11n) baseline, EHDC-YOLO systematically introduces several architectural enhancements: (1) a Multi-Scale Edge Enhancement (MSEE) module that leverages multi-scale pooling and edge information to enhance boundary feature extraction; (2) an Enhanced Feature Pyramid Network (EFPN) that integrates P2-level features with Cross Stage Partial (CSP) structures and OmniKernel convolutions for better fine-grained representation; and (3) a Dynamic Head (DyHead) with multi-dimensional attention mechanisms for enhanced cross-scale modeling and perspective adaptability. Comprehensive experiments on the Vision Meets Drones for Detection (VisDrone-DET) 2019 dataset demonstrate that EHDC-YOLO achieves significant improvements, increasing mean Average Precision (mAP)@0.5 from 33.2% to 46.1% (an absolute improvement of 12.9 percentage points) and mAP@0.5:0.95 from 19.5% to 28.0% (an absolute improvement of 8.5 percentage points) compared with the YOLOv11n baseline, while maintaining a reasonable parameter count (2.81 M vs. the baseline's 2.58 M). Further ablation studies confirm the effectiveness of each proposed component, while visualization results highlight EHDC-YOLO's superior performance in detecting objects and handling occlusions in complex drone scenarios.
Keywords: UAV imagery; object detection; multi-scale feature fusion; edge enhancement; detail preservation; YOLO; feature pyramid network; attention mechanism
Deep Learning-Based Toolkit Inspection:Object Detection and Segmentation in Assembly Lines
11
Authors: Arvind Mukundan, Riya Karmakar, Devansh Gupta, Hsiang-Chen Wang. Computers, Materials & Continua, 2026, No. 1, pp. 1255-1277 (23 pages)
Modern manufacturing processes have become more reliant on automation because of the accelerated transition from Industry 3.0 to Industry 4.0. Manual inspection of products on assembly lines remains inefficient, error-prone, and inconsistent, underscoring the need for a reliable and automated inspection system. Leveraging both object detection and image segmentation approaches, this research proposes a vision-based solution for detecting various kinds of tools in a toolkit using deep learning (DL) models. Two Intel RealSense D455f depth cameras were arranged in a top-down configuration to capture both RGB and depth images of the toolkits. After the captured images were filtered under multiple constraints and enhanced through preprocessing and augmentation, a dataset of 3300 annotated RGB-D images was generated. Several DL models were selected through a comprehensive assessment of mean Average Precision (mAP), precision-recall balance, inference latency (target ≥ 30 FPS), and computational burden, resulting in a preference for YOLO and Region-based Convolutional Neural Network (R-CNN) variants over ViT-based models due to the latter's higher latency and resource requirements. YOLOv5, YOLOv8, YOLOv11, Faster R-CNN, and Mask R-CNN were trained on the annotated dataset and evaluated using key performance metrics (Recall, Accuracy, F1-score, and Precision). YOLOv11 demonstrated balanced performance, with 93.0% precision, 89.9% recall, and a 90.6% F1-score in object detection, as well as 96.9% precision, 95.3% recall, and a 96.5% F1-score in instance segmentation, at an average inference time of 25 ms per frame (≈40 FPS), demonstrating real-time performance. Leveraging these results, a YOLOv11-based Windows application was successfully deployed in a real-time assembly line environment, where it accurately processed live video streams to detect and segment tools within toolkits, demonstrating its practical effectiveness in industrial automation. In addition to detection and segmentation, the application can precisely measure socket dimensions by applying edge detection techniques to YOLOv11 segmentation masks. This enables specification-level quality control directly on the assembly line and improves real-time inspection capability. The implementation is a significant step forward for intelligent manufacturing in the Industry 4.0 paradigm. It provides a scalable, efficient, and accurate way to perform automated inspection and dimensional verification.
Keywords: Tool detection; image segmentation; object detection; assembly line automation; Industry 4.0; Intel RealSense; deep learning; toolkit verification; RGB-D imaging; quality assurance
Lightweight Detection of Grape Inflorescences and Fruitlets using an Improved YOLOv8 Model
12
Authors: Hu Guoyu, Lin Zhe, Wang Haining, Jiang Dexuan. 新疆大学学报(自然科学版中英文) [Journal of Xinjiang University (Natural Science Edition, Chinese and English)], 2026, No. 2, pp. 129-143 (15 pages)
Globally, grape cultivation spans vast areas and achieves substantial yields, making grapes and related industries vital economic pillars for many nations. In grape production, efficient and precise management during key growth stages is essential for enhancing both yield and quality. During the inflorescence and young-fruit stage, the targets are small, easily obscured by branches and leaves, and highly similar in color to the background, so existing detection methods perform poorly in complex natural environments, which in turn restricts the application of precision spraying technology. To address this, this paper establishes a dedicated dataset for grape inflorescences and young fruits in Xinjiang and proposes an improved lightweight detection model, YOLOv8-FCD. The model incorporates a PConv-based C2f_Faster module to reduce parameter count and computational complexity, replaces the original upsampling method with the CARAFE module to enhance feature extraction capability, and introduces the Detect_SEAM detection head to improve recognition accuracy under occlusion and small-target conditions. Experimental results show that the YOLOv8-FCD model achieves a detection precision (P) of 93.7% and a recall (R) of 87.3%, with a mean average precision (mAP) of 94.6%. Compared to the original YOLOv8n model, P improves by 8.2%, mAP increases by 2.6%, and the model size is reduced to 85.71% of the original. This model provides effective technical support for identifying grape inflorescences and young fruits in intelligent spraying for plant protection.
Keywords: image processing; deep learning; object detection; grape; YOLOv8
Improving Real-Time Animal Detection Using Group Sparsity in YOLOv8:A Solution for Animal-Toy Differentiation
13
Authors: Zia Ur Rehman, Ahmad Syed, Abu Tayab, Ghanshyam G. Tejani, Doaa Sami Khafaga, El-Sayed M. El-kenawy. Computers, Materials & Continua, 2026, No. 2, pp. 1726-1750 (25 pages)
Object detection, a major challenge in computer vision and pattern recognition, plays a significant part in many applications spanning artificial intelligence, face recognition, and autonomous driving. It involves detecting, localizing, and categorizing targets in images. A particularly important emerging task is distinguishing real animals from toy replicas in real time, mostly for smart camera systems in both urban and natural environments. However, this difficult task is affected by factors such as viewing angle, occlusion, light-intensity variations, and texture differences. To tackle these challenges, this paper proposes Group Sparse YOLOv8 (You Only Look Once version 8), a real-time object detection algorithm that improves YOLOv8 by integrating group sparsity regularization. This adjustment improves efficiency and accuracy while limiting computational cost and power consumption. The method also includes a frame selection approach and a hybrid parallel processing method that merges pipelining with dataflow strategies to improve performance. The method is validated on a custom dataset of toy and real animal images along with well-known datasets, namely ImageNet, MS COCO, and CIFAR-10/100. The combination of group sparsity with YOLOv8 shows high detection accuracy with lower latency. The approach provides a practical, resource-efficient solution for intelligent camera systems and improves real-time object detection and classification in environments where real and toy animals must be distinguished.
Keywords: YOLOv8; sparsity; group sparsity; group sparse representation (GSR); CNNs; object detection
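Group sparsity regularization, as named in this abstract, is commonly implemented as a group-lasso penalty added to the training loss. The sketch below shows that standard form: the L2 norm of each weight group, scaled by the square root of the group size and summed. How the paper forms its groups (per filter, per channel, or otherwise) and what coefficient it uses are not stated in the abstract, so the grouping, the `lam` value, and the function name are assumptions.

```python
import math


def group_sparsity_penalty(groups, lam=1e-4):
    """Group-lasso penalty: lam * sum_g sqrt(|g|) * ||w_g||_2.

    `groups` is a list of weight groups, each a flat list of floats
    (for example, one group per convolution filter; this grouping is
    an illustrative assumption).
    """
    total = 0.0
    for weights in groups:
        l2_norm = math.sqrt(sum(w * w for w in weights))
        # Scaling by sqrt(group size) keeps large groups from being
        # penalized less per weight than small ones.
        total += math.sqrt(len(weights)) * l2_norm
    return lam * total
```

Because the L2 norm of a group is not squared, the penalty tends to push entire groups to exactly zero, which is what allows whole filters to be pruned afterwards for lower latency.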
ES-YOLO: Edge and Shape Fusion-Based YOLO for Traffic Sign Detection
14
Authors: Weiguo Pan, Songjie Du, Bingxin Xu, Bin Zhang, Hongzhe Liu. Computers, Materials & Continua, 2026, No. 4, pp. 2127-2145 (19 pages)
Traffic sign detection is a critical component of driving systems. Single-stage network-based traffic sign detection algorithms, renowned for their fast detection speeds and high accuracy, have become the dominant approach in current practice. However, in complex and dynamic traffic scenes, particularly with smaller traffic sign objects, missed and false detections can reduce overall detection accuracy. To address this issue, this paper proposes a detection algorithm that integrates edge and shape information. Recognizing that traffic signs have specific shapes and distinct edge contours, this paper introduces an edge feature extraction branch within the backbone network, enabling adaptive fusion with features of the same hierarchical level. Additionally, a shape prior convolution module is designed to replace the first two convolutional modules of the backbone network, with the aim of enhancing the model's perception of objects with specific shapes and reducing its sensitivity to background noise. The algorithm was evaluated on the CCTSDB and TT100K datasets; compared to YOLOv8s, the mAP50 values increased by 3.0% and 10.4%, respectively, demonstrating the effectiveness of the proposed method in improving the accuracy of traffic sign detection.
Keywords: Traffic sign; edge information; shape prior; feature fusion; object detection
A Comprehensive Literature Review on YOLO-Based Small Object Detection:Methods,Challenges,and Future Trends
15
作者 Hui Yu Jun Liu Mingwei Lin 《Computers, Materials & Continua》 2026年第4期258-309,共52页
Small object detection has been a focus of attention since the emergence of deep learning-based object detection.Although classical object detection frameworks have made significant contributions to the development of... Small object detection has been a focus of attention since the emergence of deep learning-based object detection.Although classical object detection frameworks have made significant contributions to the development of object detection,there are still many issues to be resolved in detecting small objects due to the inherent complexity and diversity of real-world visual scenes.In particular,the YOLO(You Only Look Once)series of detection models,renowned for their real-time performance,have undergone numerous adaptations aimed at improving the detection of small targets.In this survey,we summarize the state-of-the-art YOLO-based small object detection methods.This review presents a systematic categorization of YOLO-based approaches for small-object detection,organized into four methodological avenues,namely attention-based feature enhancement,detection-head optimization,loss function,and multi-scale feature fusion strategies.We then examine the principal challenges addressed by each category.Finally,we analyze the performance of thesemethods on public benchmarks and,by comparing current approaches,identify limitations and outline directions for future research. 展开更多
Keywords: Small object detection YOLO real-time detection feature fusion deep learning
Steel Surface Defect Detection via the Multiscale Edge Enhancement Method
16
Authors: Yuanyuan Wang, Yemeng Zhu, Xiuchuan Chen, Tongtong Yin, Shiwei Su. Computers, Materials & Continua, 2026, Issue 3, pp. 1006-1032 (27 pages)
To address false and missed detections in steel surface defect detection, which arise from the variety of defect types and sizes, the similarity between defects and background features, and the similarity between different defects, this paper proposes a lightweight detection model named multiscale edge and squeeze-and-excitation attention detection network (MSESE), which is built upon the You Only Look Once version 11 nano (YOLOv11n). To address the difficulty of locating defect edges, we first propose an edge enhancement module (EEM), apply it to the process of multiscale feature extraction, and then propose a multiscale edge enhancement module (MSEEM). By obtaining defect features from different scales and enhancing their edge contours, the module uses the dual-domain selection mechanism to focus on the important areas in the image, so that the feature maps carry richer information and clearer contour features. By fusing the squeeze-and-excitation attention mechanism with the EEM, we obtain a lighter module that can enhance the representation of edge features, which is named the edge enhancement module with squeeze-and-excitation attention (EEMSE). This module was subsequently integrated into the detection head. The enhanced detection head achieves improved edge feature enhancement with reduced computational overhead, while effectively adjusting channel-wise importance and further refining feature representation. Experiments on the NEU-DET dataset show that, compared with the original YOLOv11n, the improved model achieves improvements of 4.1% and 2.2% in terms of mAP@0.5 and mAP@0.5:0.95, respectively, and the GFLOPs value decreases from 6.4 to 6.2. Furthermore, compared with the current mainstream models Mamba-YOLOT and RTDETR-R34, our method achieves 6.5% and 8.9% higher mAP@0.5, respectively, while maintaining a more compact parameter footprint. These results collectively validate the effectiveness and efficiency of our proposed approach.
Keywords: Steel defects object detection algorithms small target multiscale attention mechanism
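The channel-wise reweighting mentioned in this abstract comes from squeeze-and-excitation (SE) attention. The sketch below shows the generic SE computation only (global average pooling, a two-layer bottleneck, sigmoid gates, channel rescaling); it is not the paper's EEMSE module. Biases are omitted for brevity, and the function name, weight layout, and plain-list tensors are assumptions made for illustration.

```python
import math


def squeeze_excite(feature_maps, w_reduce, w_expand):
    """Generic SE attention on a list of C channels, each an H x W list of lists.

    w_reduce: (C // r) rows of C weights (bottleneck layer, biases omitted).
    w_expand: C rows of (C // r) weights (expansion layer, biases omitted).
    Returns the rescaled feature maps and the per-channel gates.
    """
    # Squeeze: one scalar per channel via global average pooling.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_maps
    ]
    # Excitation: fully connected -> ReLU -> fully connected -> sigmoid.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed)))
        for row in w_reduce
    ]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w_expand
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in ch]
        for ch, gate in zip(feature_maps, gates)
    ]
    return scaled, gates
```

Because the gates depend only on pooled channel statistics, the extra cost is two small matrix products per feature map, which is consistent with the abstract's emphasis on low computational overhead.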
Enhanced Multi-Scale Feature Extraction Lightweight Network for Remote Sensing Object Detection
17
Authors: Xiang Luo, Yuxuan Peng, Renghong Xie, Peng Li, Yuwen Qian. Computers, Materials & Continua, 2026, Issue 3, pp. 2097-2118 (22 pages)
Deep learning has made significant progress in the field of oriented object detection for remote sensing images. However, existing methods still face challenges when dealing with difficult tasks such as multi-scale targets, complex backgrounds, and small objects in remote sensing. Keeping models lightweight to meet resource constraints in remote sensing scenarios while improving task performance remains an active research topic. Therefore, we propose an enhanced multi-scale feature extraction lightweight network, EM-YOLO, based on the YOLOv8s architecture and specifically optimized for the large target scale variations, diverse orientations, and numerous small objects found in remote sensing images. Our innovations lie in two main aspects. First, a dynamic snake convolution (DSC) is introduced into the backbone network to enhance the model’s feature extraction capability for oriented targets. Second, an innovative focusing-diffusion module is designed in the feature fusion neck to effectively integrate multi-scale feature information. Finally, we introduce the Layer-Adaptive Sparsity for magnitude-based Pruning (LASP) method to prune the network so that it can better complete tasks in resource-constrained scenarios. Experimental results on the lightweight platform Orin demonstrate that the proposed method significantly outperforms the original YOLOv8s model in oriented remote sensing object detection tasks, and achieves comparable or superior performance to state-of-the-art methods on three authoritative remote sensing datasets (DOTA v1.0, DOTA v1.5, and HRSC2016).
Keywords: Deep learning object detection feature extraction feature fusion remote sensing
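Magnitude-based pruning, which this abstract applies for network slimming, removes the weights with the smallest absolute values. The sketch below shows only that basic idea for a single layer with a fixed sparsity; the layer-adaptive choice of sparsity that the abstract attributes to LASP is not reproduced here, and the function name and flat weight list are assumptions made for illustration.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude.

    weights: flat list of floats for one layer.
    sparsity: fraction in [0, 1]; the number pruned is rounded down.
    """
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned
```

A layer-adaptive scheme would choose a different `sparsity` for each layer instead of one global value, so that layers that are more sensitive to pruning keep more of their weights.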
AdvYOLO: An Improved Cross-Conv-Block Feature Fusion-Based YOLO Network for Transferable Adversarial Attacks on ORSIs Object Detection
18
Authors: Leyu Dai, Jindong Wang, Ming Zhou, Song Guo, Hengwei Zhang. Computers, Materials & Continua, 2026, Issue 4, pp. 767-792 (26 pages)
In recent years, with the rapid advancement of artificial intelligence, object detection algorithms have made significant strides in accuracy and computational efficiency. Notably, research and applications of Anchor-Free models have opened new avenues for real-time target detection in optical remote sensing images (ORSIs). However, in the realm of adversarial attacks, developing adversarial techniques tailored to Anchor-Free models remains challenging. Adversarial examples generated based on Anchor-Based models often exhibit poor transferability to these new model architectures. Furthermore, the growing diversity of Anchor-Free models poses additional hurdles to achieving robust transferability of adversarial attacks. This study presents an improved cross-conv-block feature fusion You Only Look Once (YOLO) architecture, meticulously engineered to facilitate the extraction of more comprehensive semantic features during the backpropagation process. To address the asymmetry between densely distributed objects in ORSIs and the corresponding detector outputs, a novel dense bounding box attack strategy is proposed. This approach leverages dense target bounding boxes loss in the calculation of adversarial loss functions. Furthermore, by integrating translation-invariant (TI) and momentum-iteration (MI) adversarial methodologies, the proposed framework significantly improves the transferability of adversarial attacks. Experimental results demonstrate that our method achieves superior adversarial attack performance, with adversarial transferability rates (ATR) of 67.53% on the NWPU VHR-10 dataset and 90.71% on the HRSC2016 dataset. Compared to ensemble adversarial attack and cascaded adversarial attack approaches, our method generates adversarial examples in an average of 0.64 s, representing an approximately 14.5% improvement in efficiency under equivalent conditions.
Keywords: Remote sensing object detection transferable adversarial attack feature fusion cross-conv-block
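The momentum-iteration (MI) component named in this abstract is commonly implemented as an iterative sign-gradient attack whose gradient is L1-normalized and accumulated with a decay factor. The sketch below illustrates that standard update on a flat list of values, assuming a caller-supplied `grad_fn` that returns the loss gradient. The translation-invariant smoothing step and the paper's dense bounding box loss are not included, and all names and defaults are illustrative.

```python
def sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def momentum_iterative_attack(x, grad_fn, eps, steps, mu=1.0):
    """Momentum-iterative sign-gradient attack within an L-infinity ball.

    x: flat list of input values.
    grad_fn: hypothetical callable returning the loss gradient at a point.
    eps: maximum per-element perturbation; steps: number of iterations.
    mu: momentum decay factor.
    """
    alpha = eps / steps              # per-step size so the budget is eps
    momentum = [0.0] * len(x)
    adv = list(x)
    for _ in range(steps):
        grad = grad_fn(adv)
        l1 = sum(abs(g) for g in grad) or 1.0   # avoid dividing by zero
        # Accumulate the L1-normalized gradient into the momentum term.
        momentum = [mu * m + g / l1 for m, g in zip(momentum, grad)]
        # Step in the sign direction of the accumulated gradient.
        adv = [a + alpha * sign(m) for a, m in zip(adv, momentum)]
        # Project back into the eps-ball around the clean input.
        adv = [min(max(a, xi - eps), xi + eps) for a, xi in zip(adv, x)]
    return adv
```

Accumulating the gradient direction across iterations stabilizes the update, which is the usual explanation for why momentum improves transfer to models other than the one used to craft the example.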
A Comparative Benchmark of Deep Learning Architectures for AI-Assisted Breast Cancer Detection in Mammography Using the MammosighTR Dataset: A Nationwide Turkish Screening Study (2016–2022)
19
Author: Nuh Azginoglu. Computer Modeling in Engineering & Sciences, 2026, Issue 1, pp. 1151-1173 (23 pages)
Breast cancer screening programs rely heavily on mammography for early detection; however, diagnostic performance is strongly affected by inter-reader variability, breast density, and the limitations of conventional computer-aided detection systems. Recent advances in deep learning have enabled more robust and scalable solutions for large-scale screening, yet a systematic comparison of modern object detection architectures on nationally representative datasets remains limited. This study presents a comprehensive quantitative comparison of prominent deep learning–based object detection architectures for Artificial Intelligence-assisted mammography analysis using the MammosighTR dataset, developed within the Turkish National Breast Cancer Screening Program. The dataset comprises 12,740 patient cases collected between 2016 and 2022, annotated with BI-RADS categories, breast density levels, and lesion localization labels. A total of 31 models were evaluated, including One-Stage, Two-Stage, and Transformer-based architectures, under a unified experimental framework at both patient and breast levels. The results demonstrate that Two-Stage architectures consistently outperform One-Stage models, achieving approximately 2%–4% higher Macro F1-Scores and more balanced precision–recall trade-offs, with Double-Head R-CNN and Dynamic R-CNN yielding the highest overall performance (Macro F1 ≈ 0.84–0.86). This advantage is primarily attributed to the region proposal mechanism and improved class balance inherent to Two-Stage designs. One-Stage detectors exhibited higher sensitivity and faster inference, reaching Recall values above 0.88, but experienced minor reductions in Precision and overall accuracy (≈ 1%–2%) compared with Two-Stage models. Among Transformer-based architectures, Deformable DEtection TRansformer demonstrated strong robustness and consistency across datasets, achieving Macro F1-Scores comparable to CNN-based detectors (≈ 0.83–0.85) while exhibiting minimal performance degradation under distributional shifts. Breast density–based analysis revealed increased misclassification rates in medium-density categories (types B and C), whereas Transformer-based architectures maintained more stable performance in high-density type D tissue. These findings quantitatively confirm that both architectural design and tissue characteristics play a decisive role in diagnostic accuracy. Overall, the study provides a reproducible benchmark and highlights the potential of hybrid approaches that combine the accuracy of Two-Stage detectors with the contextual modeling capability of Transformer architectures for clinically reliable breast cancer screening systems.
Keywords: Deep learning MAMMOGRAPHY breast cancer detection object detection BI-RADS classification
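The Macro F1-Score used throughout this abstract is the unweighted mean of per-class F1 values, so rare classes count as much as common ones. A minimal sketch of the computation follows, assuming labels are given as two equal-length lists; the function name and the toy labels in the usage note are illustrative and are not drawn from the MammosighTR dataset.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in either list."""
    classes = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores) if f1_scores else 0.0
```

For example, with `y_true = [0, 0, 0, 1]` and `y_pred = [0, 0, 1, 1]`, class 0 has F1 = 0.8 and class 1 has F1 = 2/3, so the macro average is about 0.733 even though plain accuracy is 0.75.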
A Hybrid Deep Learning Approach for Real-Time Cheating Behaviour Detection in Online Exams Using Video Captured Analysis
20
Authors: Dao Phuc Minh Huy, Gia Nhu Nguyen, Dac-Nhuong Le. Computers, Materials & Continua, 2026, Issue 3, pp. 1179-1198 (20 pages)
Online examinations have become a dominant assessment mode, increasing concerns over academic integrity. To address the critical challenge of detecting cheating behaviours, this study proposes a hybrid deep learning approach that combines visual detection and temporal behaviour classification. The methodology utilises object detection models, namely You Only Look Once (YOLOv12), Faster Region-based Convolutional Neural Network (RCNN), and Single Shot Detector (SSD) MobileNet, integrated with classification models such as Convolutional Neural Networks (CNN), Bidirectional Gated Recurrent Unit (Bi-GRU), and CNN-LSTM (Long Short-Term Memory). Two distinct datasets were used: the Online Exam Proctoring (EOP) dataset from Michigan State University and the School of Computer Science, Duy Tan University (SCS-DTU) dataset collected in a controlled classroom setting. A diverse set of cheating behaviours, including book usage, unauthorised interaction, internet access, and mobile phone use, was categorised. Comprehensive experiments evaluated the models based on accuracy, precision, recall, training time, inference speed, and memory usage. We evaluate nine detector-classifier pairings under a unified budget and score them via a calibrated harmonic mean of detection and classification accuracies, enabling deployment-oriented selection under latency and memory constraints. Macro-Precision/Recall/F1 and Receiver Operating Characteristic-Area Under the Curve (ROC-AUC) are reported for the top configurations, revealing consistent advantages of object-centric pipelines for fine-grained cheating cues. The highest overall score is achieved by YOLOv12+CNN (97.15% accuracy), while SSD-MobileNet+CNN provides the best speed-efficiency trade-off for edge devices. This research provides valuable insights into selecting and deploying appropriate deep learning models for maintaining exam integrity under varying resource constraints.
Keywords: Online exam proctoring cheating behavior detection deep learning real-time monitoring object detection human behavior recognition
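This abstract scores each detector-classifier pairing with a harmonic mean of the two accuracies, which penalizes a pairing whose stages are unbalanced. The sketch below uses the plain harmonic mean only; the abstract does not describe its calibration step, so none is applied here. The function names and the idea of passing pairings as a dictionary are illustrative assumptions.

```python
def harmonic_mean_score(det_acc, cls_acc):
    """Plain (uncalibrated) harmonic mean of two accuracies in [0, 1]."""
    if det_acc <= 0 or cls_acc <= 0:
        return 0.0
    return 2 * det_acc * cls_acc / (det_acc + cls_acc)


def rank_pairings(pairings):
    """Sort pairing names by descending harmonic-mean score.

    pairings: dict mapping a pairing name to (detection_acc, classification_acc).
    """
    return sorted(
        pairings,
        key=lambda name: harmonic_mean_score(*pairings[name]),
        reverse=True,
    )
```

With hypothetical inputs, a pairing scoring (0.9, 0.6) gets 0.72 while a balanced (0.8, 0.8) pairing gets 0.8, so the balanced pipeline ranks first although the first has the stronger detector.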