Journal Articles
10,733 articles found
1. Enhanced Multi-Scale Object Detection Algorithm for Foggy Traffic Scenarios
Authors: Honglin Wang, Zitong Shi, Cheng Zhu. Computers, Materials & Continua, 2025, Issue 2, pp. 2451-2474 (24 pages)
In foggy traffic scenarios, existing object detection algorithms face challenges such as low detection accuracy, poor robustness, occlusion, missed detections, and false detections. To address this issue, a multi-scale object detection algorithm based on an improved YOLOv8 has been proposed. Firstly, a lightweight attention mechanism, Triplet Attention, is introduced to enhance the algorithm's ability to extract multi-dimensional and multi-scale features, thereby improving the receptive capability of the feature maps. Secondly, the Diverse Branch Block (DBB) is integrated into the CSP Bottleneck with two Convolutions (C2F) module to strengthen the fusion of semantic information across different layers. Thirdly, a new decoupled detection head is proposed by redesigning the original network head based on the Diverse Branch Block module to improve detection accuracy and reduce missed and false detections. Finally, the Minimum Point Distance based Intersection-over-Union (MPDIoU) is used to replace the original YOLOv8 Complete Intersection-over-Union (CIoU) to accelerate the network's training convergence. Comparative experiments and dehazing pre-processing tests were conducted on the RTTS and VOC-Fog datasets. Compared to the baseline YOLOv8 model, the improved algorithm achieved mean Average Precision (mAP) improvements of 4.6% and 3.8%, respectively. After defogging pre-processing, the mAP increased by 5.3% and 4.4%, respectively. The experimental results demonstrate that the improved algorithm exhibits high practicality and effectiveness in foggy traffic scenarios.
Keywords: deep learning; object detection; foggy scenes; traffic detection; YOLOv8
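For context on the MPDIoU criterion this abstract mentions: it scores a predicted box by its IoU with the ground truth minus the squared distances between the two pairs of matching corner points, normalised by the image size. The snippet below is a minimal Python sketch of that published definition, not code from the paper; the function name and the exact normalisation term are illustrative assumptions.

```python
def mpdiou_loss(pred, target, img_w, img_h, eps=1e-7):
    """MPDIoU loss for axis-aligned boxes given as (x1, y1, x2, y2).

    Minimal sketch of the published MPDIoU definition: IoU minus the squared
    distances between matching corners, normalised by the image size.
    """
    # Plain IoU
    ix1, iy1 = max(pred[0], target[0]), max(pred[1], target[1])
    ix2, iy2 = min(pred[2], target[2]), min(pred[3], target[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_t = (target[2] - target[0]) * (target[3] - target[1])
    iou = inter / (area_p + area_t - inter + eps)

    # Squared distances between the top-left and bottom-right corner pairs
    d1 = (pred[0] - target[0]) ** 2 + (pred[1] - target[1]) ** 2
    d2 = (pred[2] - target[2]) ** 2 + (pred[3] - target[3]) ** 2
    norm = img_w ** 2 + img_h ** 2

    return 1.0 - (iou - d1 / norm - d2 / norm)

# A slightly shifted prediction against its ground-truth box
print(mpdiou_loss((48, 52, 148, 152), (50, 50, 150, 150), img_w=640, img_h=640))
```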
2. Multi-scale object detection by top-down and bottom-up feature pyramid network (Cited by: 14)
Authors: ZHAO Baojun, ZHAO Boya, TANG Linbo, WANG Wenzheng, WU Chen. Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2019, Issue 1, pp. 1-12 (12 pages)
While moving ahead with the object detection technology, especially deep neural networks, many related tasks, such as medical application and industrial automation, have achieved great success. However, the detection of objects with multiple aspect ratios and scales is still a key problem. This paper proposes a top-down and bottom-up feature pyramid network (TDBU-FPN), which combines multi-scale feature representation and anchor generation at multiple aspect ratios. First, in order to build the multi-scale feature map, this paper puts a number of fully convolutional layers after the backbone. Second, to link neighboring feature maps, top-down and bottom-up flows are adopted to introduce context information via top-down flow and supplement suboriginal information via bottom-up flow. The top-down flow refers to the deconvolution procedure, and the bottom-up flow refers to the pooling procedure. Third, the problem of adapting different object aspect ratios is tackled via many anchor shapes with different aspect ratios on each multi-scale feature map. The proposed method is evaluated on the pattern analysis, statistical modeling and computational learning visual object classes (PASCAL VOC) dataset and reaches an accuracy of 79%, which exhibits a 1.8% improvement with a detection speed of 23 fps.
Keywords: convolutional neural network (CNN); feature pyramid network (FPN); object detection; deconvolution
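As a rough illustration of the fusion scheme the abstract describes (deconvolution for the top-down flow, pooling for the bottom-up flow), the PyTorch sketch below links two neighbouring pyramid levels. The channel counts, kernel sizes and additive fusion are assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

class TopDownBottomUpFuse(nn.Module):
    """Fuse two neighbouring pyramid levels: a top-down path upsamples the
    deeper map with a deconvolution, and a bottom-up path downsamples the
    shallower map with pooling."""

    def __init__(self, ch):
        super().__init__()
        self.top_down = nn.ConvTranspose2d(ch, ch, kernel_size=2, stride=2)  # deconvolution
        self.bottom_up = nn.MaxPool2d(kernel_size=2, stride=2)               # pooling

    def forward(self, shallow, deep):
        # shallow: higher resolution; deep: half the spatial size
        fused_shallow = shallow + self.top_down(deep)   # context flows downward in depth
        fused_deep = deep + self.bottom_up(shallow)     # detail flows upward in depth
        return fused_shallow, fused_deep

s, d = torch.randn(1, 256, 64, 64), torch.randn(1, 256, 32, 32)
fs, fd = TopDownBottomUpFuse(256)(s, d)
print(fs.shape, fd.shape)  # torch.Size([1, 256, 64, 64]) torch.Size([1, 256, 32, 32])
```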
3. B-PesNet: Smoothly Propagating Semantics for Robust and Reliable Multi-Scale Object Detection for Secure Systems
Authors: Yunbo Rao, Hongyu Mu, Zeyu Yang, Weibin Zheng, Faxin Wang, Jiansu Pu, Shaoning Zeng. Computer Modeling in Engineering & Sciences (SCIE, EI), 2022, Issue 9, pp. 1039-1054 (16 pages)
Multi-scale object detection is a research hotspot, and it has critical applications in many secure systems. Although the object detection algorithms have constantly been progressing recently, how to perform highly accurate and reliable multi-class object detection is still a challenging task due to the influence of many factors, such as the deformation and occlusion of the object in the actual scene. The more interference factors, the more complicated the semantic information, so we need a deeper network to extract deep information. However, deep neural networks often suffer from network degradation. To prevent the occurrence of degradation on deep neural networks, we put forth a new model using a newly-designed Pre-ReLU, which inserts a ReLU layer before the convolution layer for the sake of preventing network degradation and ensuring the performance of deep networks. This structure can transfer the semantic information more smoothly from the shallow to the deep layer. However, the deep networks will encounter not only degradation, but also a decline in efficiency. Therefore, to speed up the two-stage detector, we divide the feature map into many groups so as to diminish the number of parameters. Correspondingly, calculation speed has been enhanced, achieving a balance between speed and accuracy. Through mathematical demonstration, a Balanced Loss (BL) is proposed by a balance factor to decrease the weight of the negative sample during the training phase to balance the positives and negatives. Finally, our detector demonstrates rosy results in a range of experiments and gains an mAP of 73.38 on PASCAL VOC2007, which approaches the requirement of many security systems.
Keywords: object detection; Pre-ReLU; CNN; balanced loss
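A minimal PyTorch sketch of the two ideas named in the abstract, placing the ReLU before the convolution ("Pre-ReLU") and grouping the convolution to cut the parameter count, is given below. The layer ordering beyond that, the group count and the BatchNorm are assumptions, not the paper's design.

```python
import torch
import torch.nn as nn

class PreReLUGroupConv(nn.Module):
    """Activation applied *before* the convolution, with a grouped 3x3
    convolution to reduce parameters."""

    def __init__(self, in_ch, out_ch, groups=4):
        super().__init__()
        self.act = nn.ReLU(inplace=True)
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, groups=groups)
        self.bn = nn.BatchNorm2d(out_ch)

    def forward(self, x):
        return self.bn(self.conv(self.act(x)))

x = torch.randn(2, 64, 56, 56)
print(PreReLUGroupConv(64, 64)(x).shape)  # torch.Size([2, 64, 56, 56])
```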
4. Improved YOLO algorithm based on multi-scale object detection in haze weather scenarios
Authors: Junqing Shi, Sui Ruan, Yanhong Tao, Yingxu Rui, Jun Deng, Peng Liao, Peng Mei. Chain, 2025, Issue 2, pp. 183-197 (15 pages)
Computer vision-based traffic object detection plays a critical role in road traffic safety. Under hazy weather conditions, images captured by road monitoring systems exhibit three main challenges: significant scale variations, abundant background noise, and diverse perspectives. These factors lead to insufficient detection accuracy and limited real-time performance in object detection algorithms. We propose AMC-YOLO, an improved YOLOv11-based traffic detection algorithm, to address these challenges. In this work, we replace the C3k block's bottleneck module with our novel attention-gate convolution (AGConv), which improves contextual information capture, enhances feature extraction, and reduces computational redundancy. Additionally, we introduce the multi-dilation sharing convolution (MDSC) module to prevent feature information loss during pooling operations, enhancing the model's sensitivity to multi-scale features. We design a lightweight and efficient cross-channel feature fusion module (CCFM) for the path aggregation neck to adaptively adjust feature weights and optimize the model's overall performance. Experimental results demonstrate that AMC-YOLO achieves a 1.1% improvement in mAP@0.5 and a 2.7% increase in mAP@0.5:0.95 compared to YOLOv11n. On graphics processing unit (GPU) hardware, it achieves real-time performance at 376 frames per second (FPS) with only 2.6 million parameters, ensuring high-precision traffic detection while meeting deployment requirements on resource-constrained devices.
Keywords: convolutional network; object detection; self-attention mechanism; YOLO algorithm
5. FMCSNet: Mobile Devices-Oriented Lightweight Multi-Scale Object Detection via Fast Multi-Scale Channel Shuffling Network Model
Authors: Lijuan Huang, Xianyi Liu, Jinping Liu, Pengfei Xu. Computers, Materials & Continua, 2026, Issue 1, pp. 1292-1311 (20 pages)
The ubiquity of mobile devices has driven advancements in mobile object detection. However, challenges in multi-scale object detection in open, complex environments persist due to limited computational resources. Traditional approaches like network compression, quantization, and lightweight design often sacrifice accuracy or feature representation robustness. This article introduces the Fast Multi-scale Channel Shuffling Network (FMCSNet), a novel lightweight detection model optimized for mobile devices. FMCSNet integrates a fully convolutional Multilayer Perceptron (MLP) module, offering global perception without significantly increasing parameters, effectively bridging the gap between CNNs and Vision Transformers. FMCSNet achieves a delicate balance between computation and accuracy mainly by two key modules: the ShiftMLP module, including a shift operation and an MLP module, and a Partial group Convolutional (PGConv) module, reducing computation while enhancing information exchange between channels. With a computational complexity of 1.4G FLOPs and 1.3M parameters, FMCSNet outperforms CNN-based and DWConv-based ShuffleNetv2 by 1% and 4.5% mAP on the Pascal VOC 2007 dataset, respectively. Additionally, FMCSNet achieves a mAP of 30.0 (0.5:0.95 IoU threshold) with only 2.5G FLOPs and 2.0M parameters. It achieves 32 FPS on low-performance i5-series CPUs, meeting real-time detection requirements. The versatility of the PGConv module's adaptability across scenarios further highlights FMCSNet as a promising solution for real-time mobile object detection.
Keywords: object detection; lightweight network; partial group convolution; multilayer perceptron
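The sketch below illustrates the general idea behind a partial group convolution as the abstract describes it: only a slice of the channels is convolved (with groups), the rest pass through unchanged, and the two parts are concatenated. The split ratio, group count and class name are assumptions rather than the paper's actual PGConv design.

```python
import torch
import torch.nn as nn

class PartialGroupConv(nn.Module):
    """Convolve only a fraction of the channels with a grouped 3x3 kernel;
    the remaining channels are passed through untouched and concatenated
    back, trading a little mixing for much less computation."""

    def __init__(self, channels, ratio=0.25, groups=2):
        super().__init__()
        self.conv_ch = int(channels * ratio)
        self.conv = nn.Conv2d(self.conv_ch, self.conv_ch, 3, padding=1, groups=groups)

    def forward(self, x):
        a, b = x[:, :self.conv_ch], x[:, self.conv_ch:]
        return torch.cat([self.conv(a), b], dim=1)

x = torch.randn(1, 64, 32, 32)
print(PartialGroupConv(64)(x).shape)  # torch.Size([1, 64, 32, 32])
```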
6. MGD-YOLO: An Enhanced Road Defect Detection Algorithm Based on Multi-Scale Attention Feature Fusion
Authors: Zhengji Li, Fazhan Xiong, Boyun Huang, Meihui Li, Xi Xiao, Yingrui Ji, Jiacheng Xie, Aokun Liang, Hao Xu. Computers, Materials & Continua, 2025, Issue 9, pp. 5613-5635 (23 pages)
Accurate and real-time road defect detection is essential for ensuring traffic safety and infrastructure maintenance. However, existing vision-based methods often struggle with small, sparse, and low-resolution defects under complex road conditions. To address these limitations, we propose Multi-Scale Guided Detection YOLO (MGD-YOLO), a novel lightweight and high-performance object detector built upon You Only Look Once Version 5 (YOLOv5). The proposed model integrates three key components: (1) a Multi-Scale Dilated Attention (MSDA) module to enhance semantic feature extraction across varying receptive fields; (2) Depthwise Separable Convolution (DSC) to reduce computational cost and improve model generalization; and (3) a Visual Global Attention Upsampling (VGAU) module that leverages high-level contextual information to refine low-level features for precise localization. Extensive experiments on three public road defect benchmarks demonstrate that MGD-YOLO outperforms state-of-the-art models in both detection accuracy and efficiency. Notably, our model achieves 87.9% accuracy in crack detection and 88.3% overall precision on the TD-RD dataset, while maintaining fast inference speed and a compact architecture. These results highlight the potential of MGD-YOLO for deployment in real-time, resource-constrained scenarios, paving the way for practical and scalable intelligent road maintenance systems.
Keywords: YOLO; road damage detection; object detection; computer vision; deep learning
7. Multi-Scale Feature Fusion Network for Accurate Detection of Cervical Abnormal Cells
Authors: Chuanyun Xu, Die Hu, Yang Zhang, Shuaiye Huang, Yisha Sun, Gang Li. Computers, Materials & Continua, 2025, Issue 4, pp. 559-574 (16 pages)
Detecting abnormal cervical cells is crucial for early identification and timely treatment of cervical cancer. However, this task is challenging due to the morphological similarities between abnormal and normal cells and the significant variations in cell size. Pathologists often refer to surrounding cells to identify abnormalities. To emulate this slide examination behavior, this study proposes a Multi-Scale Feature Fusion Network (MSFF-Net) for detecting cervical abnormal cells. MSFF-Net employs a Cross-Scale Pooling Model (CSPM) to effectively capture diverse features and contextual information, ranging from local details to the overall structure. Additionally, a Multi-Scale Fusion Attention (MSFA) module is introduced to mitigate the impact of cell size variations by adaptively fusing local and global information at different scales. To handle the complex environment of cervical cell images, such as cell adhesion and overlapping, the Inner-CIoU loss function is utilized to more precisely measure the overlap between bounding boxes, thereby improving detection accuracy in such scenarios. Experimental results on the Comparison detector dataset demonstrate that MSFF-Net achieves a mean average precision (mAP) of 63.2%, outperforming state-of-the-art methods while maintaining a relatively small number of parameters (26.8 M). This study highlights the effectiveness of multi-scale feature fusion in enhancing the detection of cervical abnormal cells, contributing to more accurate and efficient cervical cancer screening.
Keywords: cervical abnormal cells; image detection; multi-scale feature fusion; contextual information
8. Fake News Detection Based on Cross-Modal Ambiguity Computation and Multi-Scale Feature Fusion
Authors: Jianxiang Cao, Jinyang Wu, Wenqian Shang, Chunhua Wang, Kang Song, Tong Yi, Jiajun Cai, Haibin Zhu. Computers, Materials & Continua, 2025, Issue 5, pp. 2659-2675 (17 pages)
With the rapid growth of social media, the spread of fake news has become a growing problem, misleading the public and causing significant harm. As social media content is often composed of both images and text, the use of multimodal approaches for fake news detection has gained significant attention. To solve the problems existing in previous multi-modal fake news detection algorithms, such as insufficient feature extraction and insufficient use of semantic relations between modes, this paper proposes the MFFFND-Co (Multimodal Feature Fusion Fake News Detection with Co-Attention Block) model. First, the model deeply explores the textual content, image content, and frequency domain features. Then, it employs a Co-Attention mechanism for cross-modal fusion. Additionally, a semantic consistency detection module is designed to quantify semantic deviations, thereby enhancing the performance of fake news detection. Experimentally verified on two commonly used datasets, Twitter and Weibo, the model achieved F1 scores of 90.0% and 94.0%, respectively, significantly outperforming the pre-modified MFFFND (Multimodal Feature Fusion Fake News Detection with Attention Block) model and surpassing other baseline models. This improves the accuracy of detecting fake information in artificial intelligence detection and engineering software detection.
Keywords: fake news detection; multimodal; cross-modal ambiguity computation; multi-scale feature fusion
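As a generic illustration of cross-modal co-attention of the kind the abstract relies on, the PyTorch sketch below lets text features attend to image features and vice versa before concatenating the two streams. The feature dimensions, mean pooling and single-block layout are assumptions, not the MFFFND-Co architecture.

```python
import torch
import torch.nn as nn

class CoAttentionFusion(nn.Module):
    """Two cross-attention passes: text queries attend over image keys/values,
    image queries attend over text keys/values, then both are pooled and
    concatenated into one fused vector."""

    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.txt_to_img = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.img_to_txt = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, text, image):
        # text: (B, Lt, dim) token features; image: (B, Li, dim) patch features
        t, _ = self.txt_to_img(query=text, key=image, value=image)
        i, _ = self.img_to_txt(query=image, key=text, value=text)
        return torch.cat([t.mean(dim=1), i.mean(dim=1)], dim=-1)  # (B, 2*dim)

text, image = torch.randn(2, 32, 256), torch.randn(2, 49, 256)
print(CoAttentionFusion()(text, image).shape)  # torch.Size([2, 512])
```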
9. Transmission Facility Detection with Feature-Attention Multi-Scale Robustness Network and Generative Adversarial Network
Authors: Yunho Na, Munsu Jeon, Seungmin Joo, Junsoo Kim, Ki-Yong Oh, Min Ku Kim, Joon-Young Park. Computer Modeling in Engineering & Sciences, 2025, Issue 7, pp. 1013-1044 (32 pages)
This paper proposes an automated detection framework for transmission facilities using a feature-attention multi-scale robustness network (FAMSR-Net) with high-fidelity virtual images. The proposed framework exhibits three key characteristics. First, virtual images of the transmission facilities generated using StyleGAN2-ADA are co-trained with real images. This enables the neural network to learn various features of transmission facilities to improve the detection performance. Second, the convolutional block attention module is deployed in FAMSR-Net to effectively extract features from images and construct multi-dimensional feature maps, enabling the neural network to perform precise object detection in various environments. Third, an effective bounding box optimization method called Scylla-IoU is deployed on FAMSR-Net, considering the intersection over union, center point distance, angle, and shape of the bounding box. This enables the detection of power facilities of various sizes accurately. Extensive experiments demonstrated that FAMSR-Net outperforms other neural networks in detecting power facilities. FAMSR-Net also achieved the highest detection accuracy when virtual images of the transmission facilities were co-trained in the training phase. The proposed framework is effective for the scheduled operation and maintenance of transmission facilities because an optical camera is currently the most promising tool for unmanned aerial vehicles. This ultimately contributes to improved inspection efficiency, reduced maintenance risks, and more reliable power delivery across extensive transmission facilities.
Keywords: object detection; virtual image; transmission facility; convolutional block attention module; Scylla-IoU
10. Optimized Convolutional Neural Networks with Multi-Scale Pyramid Feature Integration for Efficient Traffic Light Detection in Intelligent Transportation Systems
Authors: Yahia Said, Yahya Alassaf, Refka Ghodhbani, Taoufik Saidani, Olfa Ben Rhaiem. Computers, Materials & Continua, 2025, Issue 2, pp. 3005-3018 (14 pages)
Transportation systems are experiencing a significant transformation due to the integration of advanced technologies, including artificial intelligence and machine learning. In the context of intelligent transportation systems (ITS) and Advanced Driver Assistance Systems (ADAS), the development of efficient and reliable traffic light detection mechanisms is crucial for enhancing road safety and traffic management. This paper presents an optimized convolutional neural network (CNN) framework designed to detect traffic lights in real-time within complex urban environments. Leveraging multi-scale pyramid feature maps, the proposed model addresses key challenges such as the detection of small, occluded, and low-resolution traffic lights amidst complex backgrounds. The integration of dilated convolutions, Region of Interest (ROI) alignment, and Soft Non-Maximum Suppression (Soft-NMS) further improves detection accuracy and reduces false positives. By optimizing computational efficiency and parameter complexity, the framework is designed to operate seamlessly on embedded systems, ensuring robust performance in real-world applications. Extensive experiments using real-world datasets demonstrate that our model significantly outperforms existing methods, providing a scalable solution for ITS and ADAS applications. This research contributes to the advancement of Artificial Intelligence-driven (AI-driven) pattern recognition in transportation systems and offers a mathematical approach to improving efficiency and safety in logistics and transportation networks.
Keywords: intelligent transportation systems (ITS); traffic light detection; multi-scale pyramid feature maps; advanced driver assistance systems (ADAS); real-time detection; AI in transportation
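Soft-NMS, one of the components listed in the abstract, keeps overlapping detections but decays their scores instead of discarding them. Below is a minimal NumPy sketch of the standard Gaussian variant (Bodla et al., 2017) for reference; it is not the paper's implementation, and the sigma and score threshold are illustrative defaults.

```python
import numpy as np

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: overlapping boxes have their scores decayed by
    exp(-IoU^2 / sigma) rather than being suppressed outright.
    Boxes are (x1, y1, x2, y2)."""
    scores = scores.astype(float).copy()
    keep, idxs = [], list(range(len(scores)))
    while idxs:
        # Pick the remaining box with the highest (possibly decayed) score
        best = max(idxs, key=lambda i: scores[i])
        if scores[best] < score_thresh:
            break
        keep.append(best)
        idxs.remove(best)
        for i in idxs:
            # IoU between the selected box and each remaining box
            x1 = max(boxes[best, 0], boxes[i, 0]); y1 = max(boxes[best, 1], boxes[i, 1])
            x2 = min(boxes[best, 2], boxes[i, 2]); y2 = min(boxes[best, 3], boxes[i, 3])
            inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
            area_b = (boxes[best, 2] - boxes[best, 0]) * (boxes[best, 3] - boxes[best, 1])
            area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
            iou = inter / (area_b + area_i - inter + 1e-9)
            scores[i] *= np.exp(-(iou ** 2) / sigma)  # Gaussian score decay
    return keep

boxes = np.array([[10, 10, 60, 60], [12, 12, 62, 62], [100, 100, 150, 150]])
scores = np.array([0.9, 0.8, 0.7])
print(soft_nms(boxes, scores))  # [0, 2, 1]: the overlapping box is kept but down-weighted
```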
11. Hybrid receptive field network for small object detection on drone view
Authors: Zhaodong CHEN, Hongbing JI, Yongquan ZHANG, Wenke LIU, Zhigang ZHU. Chinese Journal of Aeronautics, 2025, Issue 2, pp. 322-338 (17 pages)
Drone-based small object detection is of great significance in practical applications such as military actions, disaster rescue, transportation, etc. However, the severe scale differences in objects captured by drones and lack of detail information for small-scale objects make drone-based small object detection a formidable challenge. To address these issues, we first develop a mathematical model to explore how changing receptive fields impacts the polynomial fitting results. Subsequently, based on the obtained conclusions, we propose a simple but effective Hybrid Receptive Field Network (HRFNet), whose modules include Hybrid Feature Augmentation (HFA), Hybrid Feature Pyramid (HFP) and Dual Scale Head (DSH). Specifically, HFA employs parallel dilated convolution kernels of different sizes to extend shallow features with different receptive fields, committed to improving the multi-scale adaptability of the network; HFP enhances the perception of small objects by capturing contextual information across layers, while DSH reconstructs the original prediction head utilizing a set of high-resolution features and ultrahigh-resolution features. In addition, in order to train HRFNet, the corresponding dual-scale loss function is designed. Finally, comprehensive evaluation results on public benchmarks such as VisDrone-DET and TinyPerson demonstrate the robustness of the proposed method. Most impressively, the proposed HRFNet achieves a mAP of 51.0 on VisDrone-DET with 29.3 M parameters, which outperforms the extant state-of-the-art detectors. HRFNet also performs excellently in complex scenarios captured by drones, achieving the best performance on the CS-Drone dataset we built.
Keywords: drone remote sensing; object detection on drone view; small object detector; hybrid receptive field; feature pyramid network; feature augmentation; multi-scale object detection
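The hybrid receptive-field idea in HFA (parallel dilated convolutions whose outputs are fused) can be sketched generically as below; the branch count, dilation rates and the 1x1 fusion convolution are assumptions, not the paper's exact module.

```python
import torch
import torch.nn as nn

class ParallelDilatedBranches(nn.Module):
    """Run the same feature map through 3x3 convolutions with different
    dilation rates (different receptive fields), then fuse the concatenated
    branches with a 1x1 convolution."""

    def __init__(self, ch, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(ch, ch, 3, padding=d, dilation=d) for d in dilations
        )
        self.fuse = nn.Conv2d(ch * len(dilations), ch, 1)

    def forward(self, x):
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))

x = torch.randn(1, 128, 40, 40)
print(ParallelDilatedBranches(128)(x).shape)  # torch.Size([1, 128, 40, 40])
```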
12. Coupling the Power of YOLOv9 with Transformer for Small Object Detection in Remote-Sensing Images
Authors: Mohammad Barr. Computer Modeling in Engineering & Sciences, 2025, Issue 4, pp. 593-616 (24 pages)
Recent years have seen a surge in interest in object detection on remote sensing images for applications such as surveillance and management. However, challenges like small object detection, scale variation, and the presence of closely packed objects in these images hinder accurate detection. Additionally, the motion blur effect further complicates the identification of such objects. To address these issues, we propose enhanced YOLOv9 with a transformer head (YOLOv9-TH). The model introduces an additional prediction head for detecting objects of varying sizes and swaps the original prediction heads for transformer heads to leverage self-attention mechanisms. We further improve YOLOv9-TH using several strategies, including data augmentation, multi-scale testing, multi-model integration, and the introduction of an additional classifier. The cross-stage partial (CSP) method and the ghost convolution hierarchical graph (GCHG) are combined to improve detection accuracy by better utilizing feature maps, widening the receptive field, and precisely extracting multi-scale objects. Additionally, we incorporate the E-SimAM attention mechanism to address low-resolution feature loss. Extensive experiments on the VisDrone2021 and DIOR datasets demonstrate the effectiveness of YOLOv9-TH, showing good improvement in mAP compared to the best existing methods. YOLOv9-TH-e achieved 54.2% mAP50 on the VisDrone2021 dataset and 92.3% mAP on the DIOR dataset. The results confirm the model's robustness and suitability for real-world applications, particularly for small object detection in remote sensing images.
Keywords: remote sensing images; YOLOv9-TH; multi-scale object detection; transformer heads; VisDrone2021 dataset
13. YOLO-MFD: Remote Sensing Image Object Detection with Multi-Scale Fusion Dynamic Head
Authors: Zhongyuan Zhang, Wenqiu Zhu. Computers, Materials & Continua (SCIE, EI), 2024, Issue 5, pp. 2547-2563 (17 pages)
Remote sensing imagery, due to its high altitude, presents inherent challenges characterized by multiple scales, limited target areas, and intricate backgrounds. These inherent traits often lead to increased miss and false detection rates when applying object recognition algorithms tailored for remote sensing imagery. Additionally, these complexities contribute to inaccuracies in target localization and hinder precise target categorization. This paper addresses these challenges by proposing a solution: the YOLO-MFD model (Remote Sensing Image Object Detection with Multi-Scale Fusion Dynamic Head). Before presenting our method, we delve into the prevalent issues faced in remote sensing imagery analysis. Specifically, we emphasize the struggles of existing object recognition algorithms in comprehensively capturing critical image features amidst varying scales and complex backgrounds. To resolve these issues, we introduce a novel approach. First, we propose the implementation of a lightweight multi-scale module called CEF. This module significantly improves the model's ability to comprehensively capture important image features by merging multi-scale feature information. It effectively addresses the issues of missed detection and mistaken alarms that are common in remote sensing imagery. Second, an additional layer of small target detection heads is added, and a residual link is established with the higher-level feature extraction module in the backbone section. This allows the model to incorporate shallower information, significantly improving the accuracy of target localization in remotely sensed images. Finally, a dynamic head attention mechanism is introduced. This allows the model to exhibit greater flexibility and accuracy in recognizing shapes and targets of different sizes. Consequently, the precision of object detection is significantly improved. The trial results show that the YOLO-MFD model shows improvements of 6.3%, 3.5%, and 2.5% over the original YOLOv8 model in precision, mAP@0.5 and mAP@0.5:0.95, respectively. These results illustrate the clear advantages of the method.
Keywords: object detection; YOLOv8; multi-scale; attention mechanism; dynamic detection head
14. MSC-YOLO: Improved YOLOv7 Based on Multi-Scale Spatial Context for Small Object Detection in UAV-View
Authors: Xiangyan Tang, Chengchun Ruan, Xiulai Li, Binbin Li, Cebin Fu. Computers, Materials & Continua (SCIE, EI), 2024, Issue 4, pp. 983-1003 (21 pages)
Accurately identifying small objects in high-resolution aerial images presents a complex and crucial task in the field of small object detection on unmanned aerial vehicles (UAVs). This task is challenging due to variations in UAV flight altitude, differences in object scales, as well as factors like flight speed and motion blur. To enhance the detection efficacy of small targets in drone aerial imagery, we propose an enhanced You Only Look Once version 7 (YOLOv7) algorithm based on multi-scale spatial context. We build the MSC-YOLO model, which incorporates an additional prediction head, denoted as P2, to improve adaptability for small objects. We replace conventional downsampling with a Spatial-to-Depth Convolutional Combination (CSPDC) module to mitigate the loss of intricate feature details related to small objects. Furthermore, we propose a Spatial Context Pyramid with Multi-Scale Attention (SCPMA) module, which captures spatial and channel-dependent features of small targets across multiple scales. This module enhances the perception of spatial contextual features and the utilization of multi-scale feature information. On the Visdrone2023 and UAVDT datasets, MSC-YOLO achieves remarkable results, outperforming the baseline method YOLOv7 by 3.0% in terms of mean average precision (mAP). The MSC-YOLO algorithm proposed in this paper has demonstrated satisfactory performance in detecting small targets in UAV aerial photography, providing strong support for practical applications.
Keywords: small object detection; YOLOv7; multi-scale attention; spatial context
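The abstract's CSPDC module replaces strided downsampling with a spatial-to-depth rearrangement. A generic PyTorch sketch of that idea is shown below: PixelUnshuffle folds each 2x2 spatial block into channels, then a stride-1 convolution mixes them. The class name and channel choices are assumptions, not the paper's module.

```python
import torch
import torch.nn as nn

class SpaceToDepthConv(nn.Module):
    """Downsample without striding or pooling: rearrange 2x2 spatial blocks
    into the channel dimension, then apply a stride-1 convolution so fine
    detail is not discarded by the downsampling step."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.s2d = nn.PixelUnshuffle(downscale_factor=2)        # C -> 4C, H and W halved
        self.conv = nn.Conv2d(4 * in_ch, out_ch, 3, padding=1)  # stride-1 channel mixing

    def forward(self, x):
        return self.conv(self.s2d(x))

x = torch.randn(1, 64, 80, 80)
print(SpaceToDepthConv(64, 128)(x).shape)  # torch.Size([1, 128, 40, 40])
```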
15. GFRF R-CNN: Object Detection Algorithm for Transmission Lines
Authors: Xunguang Yan, Wenrui Wang, Fanglin Lu, Hongyong Fan, Bo Wu, Jianfeng Yu. Computers, Materials & Continua (SCIE, EI), 2025, Issue 1, pp. 1439-1458 (20 pages)
To maintain the reliability of power systems, routine inspections using drones equipped with advanced object detection algorithms are essential for preempting power-related issues. The increasing resolution of drone-captured images has posed a challenge for traditional target detection methods, especially in identifying small objects in high-resolution images. This study presents an enhanced object detection algorithm based on the Faster Region-based Convolutional Neural Network (Faster R-CNN) framework, specifically tailored for detecting small-scale electrical components like insulators, shock hammers, and screws in transmission lines. The algorithm features an improved backbone network for Faster R-CNN, which significantly boosts the feature extraction network's ability to detect fine details. The Region Proposal Network is optimized using a method of guided feature refinement (GFR), which achieves a balance between accuracy and speed. The incorporation of Generalized Intersection over Union (GIoU) and Region of Interest (ROI) Align further refines the model's accuracy. Experimental results demonstrate a notable improvement in mean Average Precision, reaching 89.3%, an 11.1% increase compared to the standard Faster R-CNN. This highlights the effectiveness of the proposed algorithm in identifying electrical components in high-resolution aerial images.
Keywords: Faster R-CNN; transmission line; object detection; GIoU; GFR
16. DDFNet: real-time salient object detection with dual-branch decoding fusion for steel plate surface defects
Authors: Tao Wang, Wang-zhe Du, Xu-wei Li, Hua-xin Liu, Yuan-ming Liu, Xiao-miao Niu, Ya-xing Liu, Tao Wang. Journal of Iron and Steel Research International, 2025, Issue 8, pp. 2421-2433 (13 pages)
A novel dual-branch decoding fusion convolutional neural network model (DDFNet) specifically designed for real-time salient object detection (SOD) on steel surfaces is proposed. DDFNet is based on a standard encoder–decoder architecture. DDFNet integrates three key innovations: first, we introduce a novel, lightweight multi-scale progressive aggregation residual network that effectively suppresses background interference and refines defect details, enabling efficient salient feature extraction. Then, we propose an innovative dual-branch decoding fusion structure, comprising the refined defect representation branch and the enhanced defect representation branch, which enhance accuracy in defect region identification and feature representation. Additionally, to further improve the detection of small and complex defects, we incorporate a multi-scale attention fusion module. Experimental results on the public ESDIs-SOD dataset show that DDFNet, with only 3.69 million parameters, achieves detection performance comparable to current state-of-the-art models, demonstrating its potential for real-time industrial applications. Furthermore, our DDFNet-L variant consistently outperforms leading methods in detection performance. The code is available at https://github.com/13140W/DDFNet.
Keywords: steel plate surface defect; real-time detection; salient object detection; dual-branch decoder; multi-scale attention fusion; multi-scale residual fusion
17. DI-YOLOv5: An Improved Dual-Wavelet-Based YOLOv5 for Dense Small Object Detection
Authors: Zi-Xin Li, Yu-Long Wang, Fei Wang. IEEE/CAA Journal of Automatica Sinica, 2025, Issue 2, pp. 457-459 (3 pages)
Dear Editor, This letter focuses on the fact that small objects with few pixels disappear in feature maps with large receptive fields, as the network deepens, in object detection tasks. Therefore, the detection of dense small objects is challenging.
Keywords: small objects; receptive fields; feature maps; dense small objects; object detection
18. PF-YOLO: An Improved YOLOv8 for Small Object Detection in Fisheye Images
Authors: Cheng Zhang, Cheng Xu, Hongzhe Liu. Journal of Beijing Institute of Technology, 2025, Issue 1, pp. 57-70 (14 pages)
Top-view fisheye cameras are widely used in personnel surveillance for their broad field of view, but their unique imaging characteristics pose challenges like distortion, complex scenes, scale variations, and small objects near image edges. To tackle these, we proposed peripheral focus you only look once (PF-YOLO), an enhanced YOLOv8n-based method. Firstly, we introduced a cutting-patch data augmentation strategy to mitigate the problem of insufficient small-object samples in various scenes. Secondly, to enhance the model's focus on small objects near the edges, we designed the peripheral focus loss, which uses dynamic focus coefficients to provide greater gradient gains for these objects, improving their regression accuracy. Finally, we designed the three-dimensional (3D) spatial-channel coordinate attention C2f module, enhancing spatial and channel perception, suppressing noise, and improving personnel detection. Experimental results demonstrate that PF-YOLO achieves strong performance on the challenging events for person detection from overhead fisheye images (CEPDTOF) and in-the-wild events for people detection and tracking from overhead fisheye cameras (WEPDTOF) datasets. Compared to the original YOLOv8n model, PF-YOLO achieves improvements on CEPDTOF with increases of 2.1%, 1.7% and 2.9% in mean average precision 50 (mAP 50), mAP 50-95, and ..., respectively. On WEPDTOF, PF-YOLO achieves substantial improvements, with increases of 31.4%, 14.9%, 61.1% and 21.0% in ... 91.2% and 57.2%, respectively.
Keywords: fisheye; object detection and recognition; small object detection; deep learning
19. Point-voxel dual transformer for LiDAR 3D object detection
Authors: TONG Jigang, YANG Fanhang, YANG Sen, DU Shengzhi. Optoelectronics Letters, 2025, Issue 9, pp. 547-554 (8 pages)
In this paper, a two-stage light detection and ranging (LiDAR) three-dimensional (3D) object detection framework is presented, namely the point-voxel dual transformer (PV-DT3D), which is a transformer-based method. In the proposed PV-DT3D, point-voxel fusion features are used for proposal refinement. Specifically, keypoints are sampled from the entire point cloud scene and used to encode representative scene features via a proposal-aware voxel set abstraction module. Subsequently, following the generation of proposals by the region proposal networks (RPN), the internal encoded keypoints are fed into the dual transformer encoder-decoder architecture. In 3D object detection, the proposed PV-DT3D takes advantage of both point-wise transformer and channel-wise architecture to capture contextual information from the spatial and channel dimensions. Experiments conducted on the highly competitive KITTI 3D car detection leaderboard show that the PV-DT3D achieves superior detection accuracy among state-of-the-art point-voxel-based methods.
Keywords: LiDAR; 3D object detection; point-voxel dual transformer; proposal refinement
20. Deep Learning-Based Algorithm for Robust Object Detection in Flooded and Rainy Environments
Authors: Pengfei Wang, Jiwu Sun, Lu Lu, Hongchen Li, Hongzhe Liu, Cheng Xu, Yongqiang Liu. Computers, Materials & Continua, 2025, Issue 8, pp. 2883-2903 (21 pages)
Flooding and heavy rainfall under extreme weather conditions pose significant challenges to target detection algorithms. Traditional methods often struggle to address issues such as image blurring, dynamic noise interference, and variations in target scale. Convolutional neural network (CNN)-based target detection approaches face notable limitations in such adverse weather scenarios, primarily due to the fixed geometric sampling structures that hinder adaptability to complex backgrounds and dynamically changing object appearances. To address these challenges, this paper proposes an optimized YOLOv9 model incorporating an improved deformable convolutional network (DCN) enhanced with a multi-scale dilated attention (MSDA) mechanism. Specifically, the DCN module enhances the model's adaptability to target deformation and noise interference by adaptively adjusting the sampling grid positions, while also integrating feature amplitude modulation to further improve robustness. Additionally, the MSDA module is introduced to capture contextual features across multiple scales, effectively addressing issues related to target occlusion and scale variation commonly encountered in flood-affected environments. Experimental evaluations are conducted on the ISE-UFDS and UA-DETRAC datasets. The results demonstrate that the proposed model significantly outperforms state-of-the-art methods in key evaluation metrics, including precision, recall, F1-score, and mAP (mean Average Precision). Notably, the model exhibits superior robustness and generalization performance under simulated severe weather conditions, offering reliable technical support for disaster emergency response systems. This study contributes to enhancing the accuracy and real-time capabilities of flood early warning systems, thereby supporting more effective disaster mitigation strategies.
Keywords: YOLO; vehicle detection; flood; deformable convolutional networks; multi-scale dilated attention