723 articles found
Feature pyramid attention network for audio-visual scene classification (Cited by: 1)
1
Authors: Liguang Zhou, Yuhongze Zhou, Xiaonan Qi, Junjie Hu, Tin Lun Lam, Yangsheng Xu. Source: CAAI Transactions on Intelligence Technology, 2025, No. 2, pp. 359-374, 16 pages
Audio-visual scene classification (AVSC) poses a formidable challenge owing to the intricate spatial-temporal relationships exhibited by audio-visual signals, coupled with the complex spatial patterns of objects and textures found in visual images. Recent studies have focused predominantly on extracting features from diverse neural network structures, inadvertently neglecting the acquisition of semantically meaningful regions and crucial components within audio-visual data. The authors present a feature pyramid attention network (FPANet) for audio-visual scene understanding, which extracts semantically significant characteristics from audio-visual data. The authors' approach builds multi-scale hierarchical features of sound spectrograms and visual images using a feature pyramid representation and localises the semantically relevant regions with a feature pyramid attention module (FPAM). A dimension alignment (DA) strategy is employed to align feature maps from multiple layers, a pyramid spatial attention (PSA) to spatially locate essential regions, and a pyramid channel attention (PCA) to pinpoint significant temporal frames. Experiments on visual scene classification (VSC), audio scene classification (ASC), and AVSC tasks demonstrate that FPANet achieves performance on par with state-of-the-art (SOTA) approaches, with a 95.9 F1-score on the ADVANCE dataset and a relative improvement of 28.8%. Visualisation results show that FPANet can prioritise semantically meaningful areas in audio-visual signals.
Keywords: dimension alignment; feature pyramid attention network; pyramid channel attention; pyramid spatial attention; semantic relevant regions
Hyperspectral Satellite Image Classification Based on Feature Pyramid Networks With 3D Convolution
2
Authors: CHEN Cheng, PENG Pan, TAO Wei, ZHAO Hui. Source: Journal of Shanghai Jiaotong University (Science), 2025, No. 6, pp. 1073-1084, 12 pages
Recent advances in convolutional neural networks (CNNs) have fostered progress in object recognition and semantic segmentation, which in turn has improved the performance of hyperspectral image (HSI) classification. Nevertheless, the difficulty of high-dimensional feature extraction and the shortage of training samples seriously hinder the future development of HSI classification. In this paper, we propose a novel algorithm for HSI classification based on a three-dimensional (3D) CNN and a feature pyramid network (FPN), called 3D-FPN. The framework contains a principal component analysis, a feature extraction structure and a logistic regression. Specifically, the FPN built with 3D convolutions not only retains the advantages of 3D convolution to fully extract the spectral-spatial feature maps, but also concentrates on more detailed information and performs multi-scale feature fusion. This method avoids excessive model complexity and is suitable for small-sample hyperspectral classification with varying categories and spatial resolutions. To test the performance of the proposed 3D-FPN method, rigorous experimental analysis was performed on three public hyperspectral data sets and on hyperspectral data from the GF-5 satellite. Quantitative and qualitative results indicated that the proposed method attained the best performance among current state-of-the-art end-to-end deep learning-based methods.
Keywords: hyperspectral image (HSI); deep learning; feature pyramid network (FPN); spectral-spatial feature extraction
Double Self-Attention Based Fully Connected Feature Pyramid Network for Field Crop Pest Detection
3
Authors: Zijun Gao, Zheyi Li, Chunqi Zhang, Ying Wang, Jingwen Su. Source: Computers, Materials & Continua, 2025, No. 6, pp. 4353-4371, 19 pages
Pest detection techniques are helpful in reducing the frequency and scale of pest outbreaks; however, their application in the actual agricultural production process is still challenging owing to the problems of interspecies similarity, multi-scale, and background complexity of pests. To address these problems, this study proposes an FD-YOLO pest target detection model. The FD-YOLO model uses a Fully Connected Feature Pyramid Network (FC-FPN) instead of a PANet in the neck, which can adaptively fuse multi-scale information so that the model can retain small-scale target features in the deep layer, enhance large-scale target features in the shallow layer, and enhance the multiplexing of effective features. A dual self-attention module (DSA) is then embedded in the C3 module of the neck, which captures the dependencies between the information in both spatial and channel dimensions, effectively enhancing global features. We selected 16 types of pests that widely damage field crops in the IP102 pest dataset, which were used as our dataset after data supplementation and enhancement. The experimental results showed that FD-YOLO's mAP@0.5 improved by 6.8% compared to YOLOv5, reaching 82.6%, and was 19.1%–5% better than other state-of-the-art models. This method provides an effective new approach for detecting similar or multiscale pests in field crops.
Keywords: Pest detection; YOLOv5; feature pyramid network; transformer; attention module
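Object detection algorithm for field-work behavior based on improved Faster R-CNN-FPN
4
Authors: 周艳青, 邹铭鑫, 姜新华, 白洁, 马学磊. Source: Journal of Inner Mongolia Agricultural University (Natural Science Edition) (内蒙古农业大学学报(自然科学版)) (PKU Core), 2026, No. 1, pp. 77-86, 10 pages
Detection of field-work behavior suffers from low detection accuracy and missed detections, so an improved work-behavior detection model is proposed based on Faster R-CNN and FPN. First, a feature pyramid network (FPN) is introduced into the Faster R-CNN framework to improve the detection of smaller targets. Then, to improve the model's generalization to targets of different scales, multi-scale (MS) training is added, and the content-aware reassembly of features (CARAFE) upsampling operator replaces the bilinear-interpolation upsampling in the FPN, so that pixels are associated over a large range. Finally, the improved Faster R-CNN-FPN detection model is trained and tested on the self-built FWBD dataset. The results show that: (1) compared with the YOLOv3 model, the improved work-behavior recognition algorithm has an mAP of 69.40%; (2) compared with the original models Faster, Faster-CARAFER, and Faster-MS, the improved model has the highest mAP, reaching 71.05%. The improved model can therefore detect field-work behavior effectively and has practical value for agricultural production.
Keywords: field work; behavior detection; Faster R-CNN; feature pyramid network; content-aware reassembly of features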
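Visual navigation for a peony-field robot based on MobileNetV4-DSFPN
5
Authors: 徐善永, 邢雪景, 程军辉, 张俊卿. Source: Journal of Agricultural Mechanization Research (农机化研究) (PKU Core), 2026, No. 8, pp. 169-178, 10 pages
Accurately segmenting the drivable area in the field and extracting the navigation line in real time are key steps for autonomous field operation of agricultural robots. To address the complex background of peony fields and the high computational complexity and poor real-time performance of existing semantic models, a lightweight MobileNetV4-DSFPN semantic segmentation model is proposed. An improved MobileNetV4 serves as an efficient encoder, which significantly compresses the parameter count and reduces computational complexity. The decoder builds a lightweight feature pyramid network based on depthwise separable convolution (DSFPN), which achieves efficient multi-scale feature fusion through lateral connections and transposed-convolution upsampling. From the segmentation result, the navigation line is generated through morphological optimization, robust boundary-point detection, and adaptive polynomial fitting. Experiments show that data augmentation significantly improves the model's prediction accuracy and generalization. Ablation experiments show that the improved MobileNetV4 encoder balances accuracy and efficiency better than other lightweight networks, and that the DSFPN decoder matches the accuracy of a standard feature pyramid network while reducing the parameter count and computation by 38.5% and 29.3%, respectively. In a comparison with several segmentation models on the peony-field path dataset, the model achieves the best balance of accuracy and efficiency, with a mean pixel accuracy (mPA) of 97.16% and a mean intersection over union (mIoU) of 94.11%. The navigation line has a mean lateral deviation of 0.231 pixels and a mean angular deviation of 0.842°. Deployed on an on-board computer, the algorithm reaches an average inference speed of 23.1 FPS, which meets the real-time and accuracy requirements of navigation.
Keywords: peony-field robot; visual navigation; MobileNetV4; feature pyramid network; lightweight; semantic segmentation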
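The abstract above reports mPA and mIoU. As a reading aid, here is a minimal sketch of how these two metrics are commonly computed from a per-class confusion matrix. The function name `segmentation_metrics` and the toy two-class matrix are illustrative assumptions, not taken from the paper, and the paper's exact evaluation protocol may differ.

```python
def segmentation_metrics(confusion):
    """Return (mPA, mIoU) for a square confusion matrix.

    confusion[i][j] counts pixels whose true class is i and predicted class is j.
    Classes that never occur are skipped so they do not distort the means.
    """
    pixel_acc, iou = [], []
    for c in range(len(confusion)):
        true_pos = confusion[c][c]
        true_total = sum(confusion[c])                  # pixels labelled c
        pred_total = sum(row[c] for row in confusion)   # pixels predicted as c
        union = true_total + pred_total - true_pos
        if true_total:
            pixel_acc.append(true_pos / true_total)
        if union:
            iou.append(true_pos / union)
    return sum(pixel_acc) / len(pixel_acc), sum(iou) / len(iou)


# Hypothetical two-class example: class 0 = background, class 1 = drivable area.
confusion = [[90, 10],
             [5, 95]]
mpa, miou = segmentation_metrics(confusion)
```

mPA averages per-class recall, while mIoU also penalises false positives, which is why mIoU is usually the lower of the two figures.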
Multi-scale object detection by top-down and bottom-up feature pyramid network (Cited by: 14)
6
Authors: ZHAO Baojun, ZHAO Boya, TANG Linbo, WANG Wenzheng, WU Chen. Source: Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2019, No. 1, pp. 1-12, 12 pages
With advances in object detection technology, especially deep neural networks, many related tasks, such as medical applications and industrial automation, have achieved great success. However, the detection of objects with multiple aspect ratios and scales is still a key problem. This paper proposes a top-down and bottom-up feature pyramid network (TDBU-FPN), which combines multi-scale feature representation and anchor generation at multiple aspect ratios. First, in order to build the multi-scale feature map, this paper puts a number of fully convolutional layers after the backbone. Second, to link neighboring feature maps, top-down and bottom-up flows are adopted to introduce context information via the top-down flow and supplement suboriginal information via the bottom-up flow. The top-down flow refers to the deconvolution procedure, and the bottom-up flow refers to the pooling procedure. Third, the problem of adapting to different object aspect ratios is tackled via many anchor shapes with different aspect ratios on each multi-scale feature map. The proposed method is evaluated on the pattern analysis, statistical modeling and computational learning visual object classes (PASCAL VOC) dataset and reaches an accuracy of 79%, a 1.8% improvement, with a detection speed of 23 fps.
Keywords: convolutional neural network (CNN); feature pyramid network (FPN); object detection; deconvolution
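The abstract above mentions anchor shapes with different aspect ratios on each feature map. The following is a minimal sketch of the usual way such shapes are derived, keeping the area fixed while varying width/height. The function name `anchor_shapes` and the scale and ratio values are assumptions for illustration; the paper's actual anchor settings are not given in the abstract.

```python
import math


def anchor_shapes(scale, aspect_ratios):
    """Return (width, height) pairs with area scale**2 and width / height == ratio."""
    shapes = []
    for ratio in aspect_ratios:
        width = scale * math.sqrt(ratio)
        height = scale / math.sqrt(ratio)
        shapes.append((width, height))
    return shapes


# Hypothetical settings: one 64-pixel scale and three aspect ratios.
shapes = anchor_shapes(64, [0.5, 1.0, 2.0])
```

In a detector, one such set of shapes would typically be tiled at every cell of each pyramid level, with a larger scale on coarser levels.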
Bidirectional parallel multi-branch convolution feature pyramid network for target detection in aerial images of swarm UAVs (Cited by: 4)
7
Authors: Lei Fu, Wen-bin Gu, Wei Li, Liang Chen, Yong-bao Ai, Hua-lei Wang. Source: Defence Technology (防务技术) (SCIE, EI, CAS, CSCD), 2021, No. 4, pp. 1531-1541, 11 pages
In this paper, based on a bidirectional parallel multi-branch feature pyramid network (BPMFPN), a novel one-stage object detector called BPMFPN Det is proposed for real-time detection of ground multi-scale targets by swarm unmanned aerial vehicles (UAVs). First, the bidirectional parallel multi-branch convolution modules are used to construct the feature pyramid to enhance the feature expression abilities of feature layers at different scales. Next, the feature pyramid is integrated into the single-stage object detection framework to ensure real-time performance. To validate the effectiveness of the proposed algorithm, experiments are conducted on four datasets. For the PASCAL VOC dataset, the proposed algorithm achieves a mean average precision (mAP) of 85.4 on the VOC 2007 test set. On the detection in optical remote sensing (DIOR) dataset, the proposed algorithm achieves 73.9 mAP. On the vehicle detection in aerial imagery (VEDAI) dataset, the detection accuracy of small land vehicle (slv) targets reaches 97.4 mAP. On the unmanned aerial vehicle detection and tracking (UAVDT) dataset, the proposed BPMFPN Det achieves an mAP of 48.75. Compared with previous state-of-the-art methods, the results obtained by the proposed algorithm are more competitive. The experimental results demonstrate that the proposed algorithm can effectively solve the problem of real-time detection of ground multi-scale targets in aerial images of swarm UAVs.
Keywords: Aerial images; Object detection; Feature pyramid networks; Multi-scale feature fusion; Swarm UAVs
Dual Attention Based Feature Pyramid Network (Cited by: 5)
8
Authors: Huijun Xing, Shuai Wang, Dezhi Zheng, Xiaotong Zhao. Source: China Communications (SCIE, CSCD), 2020, No. 8, pp. 242-252, 11 pages
Object detection is an essential part of research on scenarios such as automatic driving and pedestrian detection. Among multiple types of target objects, the identification of small-scale objects faces significant challenges. We introduce a new feature pyramid framework called Dual Attention based Feature Pyramid Network (DAFPN), which is designed to address the difficulty of multi-scale object recognition. In DAFPN, the attention mechanism is introduced in the top-down pathway and the lateral pathway, where spatial attention and channel attention participate, respectively, such that the pyramidal feature maps can be generated with enhanced spatial and channel interdependencies, which bring more semantic information to the feature pyramid. The experiments are implemented on the COCO data set, which contains a considerable quantity of small-scale objects. The analysis results verify the improved performance of DAFPN compared with the original Feature Pyramid Network (FPN), specifically for identification at a small scale. The proposed DAFPN is promising for object detection in an era full of intelligent machines that need to detect multi-scale objects.
Keywords: object detection; convolutional neural networks; feature pyramid
Neighborhood fusion-based hierarchical parallel feature pyramid network for object detection (Cited by: 3)
9
Authors: Mo Lingfei, Hu Shuming. Source: Journal of Southeast University (English Edition) (EI, CAS), 2020, No. 3, pp. 252-263, 12 pages
In order to improve the detection accuracy of small objects, a neighborhood fusion-based hierarchical parallel feature pyramid network (NFPN) is proposed. Unlike the layer-by-layer structure adopted in the feature pyramid network (FPN) and the deconvolutional single shot detector (DSSD), where the bottom layer of the feature pyramid network relies on the top layer, NFPN builds the feature pyramid network with no connections between the upper and lower layers. That is, it only fuses shallow features on similar scales. NFPN is highly portable and can be embedded in many models to further boost performance. Extensive experiments on the PASCAL VOC 2007, 2012, and COCO datasets demonstrate that the NFPN-based SSD without intricate tricks can exceed the DSSD model in terms of detection accuracy and inference speed, especially for small objects, e.g., 4% to 5% higher mAP (mean average precision) than SSD, and 2% to 3% higher mAP than DSSD. On the VOC 2007 test set, the NFPN-based SSD with 300×300 input reaches 79.4% mAP at 34.6 frames/s, and the mAP rises to 82.9% with the multi-scale testing strategy.
Keywords: computer vision; deep convolutional neural network; object detection; hierarchical parallel feature pyramid network; multi-scale feature fusion
An Improved Data-Driven Topology Optimization Method Using Feature Pyramid Networks with Physical Constraints (Cited by: 1)
10
Authors: Jiaxiang Luo, Yu Li, Weien Zhou, Zhiqiang Gong, Zeyu Zhang, Wen Yao. Source: Computer Modeling in Engineering & Sciences (SCIE, EI), 2021, No. 9, pp. 823-848, 26 pages
Deep learning for topology optimization has been extensively studied in recent years to reduce the cost of calculation. However, the loss function of such methods is mainly based on pixel-wise errors from the image perspective, which cannot embed the physical knowledge of topology optimization. Therefore, this paper presents an improved deep learning model to alleviate this difficulty effectively. The feature pyramid network (FPN), a kind of deep learning model, is trained to learn the inherent physical law of topology optimization itself, with a loss function composed of pixel-wise errors and physical constraints. Since the calculation of physical constraints requires finite element analysis (FEA) with high computational cost, a strategy of adjusting the time at which physical constraints are added is proposed to balance the training cost and the training effect. Then, two classical topology optimization problems are investigated to verify the effectiveness of the proposed method. The results show that the developed model, using a small number of samples, can quickly obtain the optimized structure without any iteration, with not only high pixel-wise accuracy but also good physical performance.
Keywords: Topology optimization; deep learning; feature pyramid networks; finite element analysis; physical constraints
Two-Layer Attention Feature Pyramid Network for Small Object Detection (Cited by: 1)
11
Authors: Sheng Xiang, Junhao Ma, Qunli Shang, Xianbao Wang, Defu Chen. Source: Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, No. 10, pp. 713-731, 19 pages
Effective small object detection is crucial in various applications including urban intelligent transportation and pedestrian detection. However, small objects are difficult to detect accurately because they contain less information. Many current methods, particularly those based on the Feature Pyramid Network (FPN), address this challenge by leveraging multi-scale feature fusion. However, existing FPN-based methods often suffer from inadequate feature fusion due to varying resolutions across different layers, leading to suboptimal small object detection. To address this problem, we propose the Two-layer Attention Feature Pyramid Network (TA-FPN), featuring two key modules: the Two-layer Attention Module (TAM) and the Small Object Detail Enhancement Module (SODEM). TAM uses the attention module to make the network focus more on the semantic information of the object and fuses it into the lower layer, so that each layer contains similar semantic information, which alleviates the problem of small object information being submerged due to semantic gaps between different layers. At the same time, SODEM is introduced to strengthen the local features of the object, suppress background noise, enhance the information details of the small object, and fuse the enhanced features into other feature layers so that each layer is rich in small object information, improving small object detection accuracy. Our extensive experiments on challenging datasets such as Microsoft Common Objects in Context (MSCOCO) and Pattern Analysis, Statistical Modelling and Computational Learning Visual Object Classes (PASCAL VOC) demonstrate the validity of the proposed method. Experimental results show a significant improvement in small object detection accuracy compared to state-of-the-art detectors.
Keywords: Small object detection; two-layer attention module; small object detail enhancement module; feature pyramid network
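Fault diagnosis method for the main drive of a boom-type roadheader based on improved ADBO-ASRS-AFPN
12
Authors: 吕圣林. Source: Journal of Mechanical & Electrical Engineering (机电工程) (PKU Core), 2026, No. 1, pp. 92-101, 10 pages
Existing fault diagnosis methods for the main drive system of boom-type roadheaders suffer from low diagnostic accuracy, and deep-learning fault diagnosis models are sensitive to hyperparameters. To address this, a fault diagnosis method is proposed based on an adaptive spectrum and residual progressive asymptotic feature pyramid network with adaptive dung beetle optimization (ADBO-ASRS-AFPN). First, an adaptive spectrum module adaptively filters high-frequency noise from the vibration signal. Then, a residual progressive feature extraction module extracts multi-scale time-domain features of the signal. Next, an asymptotic feature pyramid performs semantic collaborative enhancement of the fault features across the scales; because model performance is sensitive to hyperparameters, the adaptive dung beetle optimization algorithm is introduced to search adaptively for the algorithm's key hyperparameters. Finally, simulated-fault experimental data are used to verify the effectiveness and superiority of the method. The results show that the model reaches a fault diagnosis accuracy of 98.23% under typical operating conditions. Comparative experiments verify that the model has certain advantages over other traditional models, and ablation experiments verify the contribution of each component module to the model's performance. The results provide a new method for fault diagnosis of the main drive system of boom-type roadheaders.
Keywords: mining machinery; mechanical transmission system; adaptive spectrum module; deep learning; optimization algorithm; adaptive spectrum and residual progressive asymptotic feature pyramid network with adaptive dung beetle optimization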
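ResFPN: a multi-scale object detection method that enlarges the effective receptive field and improves FPN (Cited by: 4)
13
Authors: 杨扬, 唐晓芬. Source: Computer Engineering and Applications (计算机工程与应用) (PKU Core), 2025, No. 10, pp. 247-257, 11 pages
In multi-scale object detection, the effective receptive field of the backbone network is far smaller than its theoretical receptive field, the receptive field is sparsely distributed, and the feature pyramid network (FPN) loses channel information when it unifies the channel count in its lateral connections, all of which degrade model performance. To address these problems, ResFPN is proposed, a multi-scale object detection algorithm that enlarges the effective receptive field and improves FPN through multi-feature fusion. For the effective receptive field being far smaller than the theoretical one, a multi-branch dilated convolutional (MBD) module and a multi-branch pooling (MBP) module are designed, which enlarge the receptive field by learning to fuse spatial features at different scales. For the sparse receptive-field distribution, a lightweight channel interactive fusion (CIF) module is proposed, which uses a two-branch structure with a different number of depthwise separable convolutions stacked in each branch to learn inter-pixel dependencies and strengthen the feature representation. For the channel information lost when FPN unifies the channel count with 1×1 convolutions, SubPixel convolution is used to extract the C5-layer output features, preserving the original rich semantic information while adding an extra bidirectional path that supplements FPN's channel information; this may, however, produce redundant information. A global context (GC) module is therefore introduced after the extra bidirectional path, and its GC bottleneck transform module further fuses feature information and reduces redundancy. Experiments show that the proposed ResFPN effectively resolves the sparse receptive-field distribution and doubles the backbone network's receptive field, and that the proposed remedy for FPN's channel-information loss also performs well in multi-scale object detection. Compared with the typical Faster R-CNN network, average precision for large, medium, and small objects on the challenging MS COCO dataset improves by 2.2, 1.6, and 2.0 percentage points, respectively, and detection also improves relative to other detectors.
Keywords: object detection; convolutional neural network; multi-scale object detection; receptive field; feature pyramid network (FPN)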
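The abstract above contrasts the theoretical receptive field with the smaller effective one and uses dilated convolutions to enlarge it. As background only, this sketch computes the theoretical receptive field of a stack of convolution layers with the standard recurrence. The function name and the layer lists are illustrative assumptions; they do not reproduce the paper's MBD module, and the effective receptive field cannot be obtained this way.

```python
def theoretical_receptive_field(layers):
    """Theoretical receptive field of stacked conv layers.

    Each layer is a (kernel_size, stride, dilation) tuple, listed input to output.
    """
    receptive_field = 1
    jump = 1  # distance in input pixels between adjacent outputs of the current layer
    for kernel, stride, dilation in layers:
        receptive_field += (kernel - 1) * dilation * jump
        jump *= stride
    return receptive_field


# Three plain 3x3 convolutions versus three 3x3 convolutions with dilation 1, 2, 4.
plain = theoretical_receptive_field([(3, 1, 1), (3, 1, 1), (3, 1, 1)])
dilated = theoretical_receptive_field([(3, 1, 1), (3, 1, 2), (3, 1, 4)])
```

With the same number of parameters, the dilated stack covers a window of 15 pixels against 7 for the plain stack, which is the usual motivation for dilated branches.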
Hybrid receptive field network for small object detection on drone view (Cited by: 1)
14
Authors: Zhaodong CHEN, Hongbing JI, Yongquan ZHANG, Wenke LIU, Zhigang ZHU. Source: Chinese Journal of Aeronautics, 2025, No. 2, pp. 322-338, 17 pages
Drone-based small object detection is of great significance in practical applications such as military actions, disaster rescue, and transportation. However, the severe scale differences in objects captured by drones and the lack of detail information for small-scale objects make drone-based small object detection a formidable challenge. To address these issues, we first develop a mathematical model to explore how changing receptive fields impacts the polynomial fitting results. Subsequently, based on the obtained conclusions, we propose a simple but effective Hybrid Receptive Field Network (HRFNet), whose modules include Hybrid Feature Augmentation (HFA), Hybrid Feature Pyramid (HFP) and Dual Scale Head (DSH). Specifically, HFA employs parallel dilated convolution kernels of different sizes to extend shallow features with different receptive fields, improving the multi-scale adaptability of the network; HFP enhances the perception of small objects by capturing contextual information across layers; and DSH reconstructs the original prediction head using a set of high-resolution features and ultrahigh-resolution features. In addition, a corresponding dual-scale loss function is designed to train HRFNet. Finally, comprehensive evaluation results on public benchmarks such as VisDrone-DET and TinyPerson demonstrate the robustness of the proposed method. Most impressively, the proposed HRFNet achieves a mAP of 51.0 on VisDrone-DET with 29.3 M parameters, which outperforms the extant state-of-the-art detectors. HRFNet also performs excellently in complex scenarios captured by drones, achieving the best performance on the CS-Drone dataset we built.
Keywords: Drone remote sensing; Object detection on drone view; Small object detector; Hybrid receptive field; Feature pyramid network; Feature augmentation; Multi-scale object detection
Weld Defect Monitoring Based on Two-Stage Convolutional Neural Network
15
Authors: XIAO Wenbo, XIONG Jiakai, YU Lesheng, HE Yinshui, MA Guohong. Source: Journal of Shanghai Jiaotong University (Science), 2025, No. 2, pp. 291-299, 9 pages
Zn vapour is easily generated on the surface when fusion welding galvanized steel sheet, resulting in the formation of defects. Rapidly developing computer vision sensing technology collects weld images during the welding process, then obtains laser fringe information through digital image processing, identifies welding defects, and finally realizes online control of weld defects. The performance of a convolutional neural network is related to its structure and the quality of the input image. The acquired original images are labeled with LabelMe, and repeated attempts are made to determine appropriate filtering and edge detection image preprocessing methods. Two-stage convolutional neural networks with different structures are built on the TensorFlow deep learning framework, different thresholds of intersection over union are set, and deep learning methods are used to evaluate the collected original images and the preprocessed images separately. The test results show that the overall performance of the improved feature pyramid network algorithm based on the VGG16 base network is lower than that based on the Resnet101 base network. Edge detection of the image significantly improves the accuracy of the model. Adding blur reduces the accuracy of the model slightly; however, the overall performance of the improved algorithm is still relatively good, which demonstrates the stability of the algorithm. The self-developed software inspection system can be used for image preprocessing and defect recognition, and can record the number and location of typical defects in continuous welds.
Keywords: defects monitoring; image preprocessing; Resnet101; feature pyramid network
Infrared road object detection algorithm based on spatial depth channel attention network and improved YOLOv8
16
Authors: LI Song, SHI Tao, JING Fangke, CUI Jie. Source: Optoelectronics Letters, 2025, No. 8, pp. 491-498, 8 pages
To address the low detection accuracy and large model size of existing object detection algorithms applied to complex road scenes, F-YOLOv8, an improved you only look once version 8 (YOLOv8) object detection algorithm for infrared images, is proposed. First, a spatial-to-depth network replaces the strided convolution or pooling layer of the traditional backbone network and is combined with the channel attention mechanism, so that the neural network focuses on the channels with large weight values and better extracts feature information from low-resolution images. Then, an improved feature pyramid network, the lightweight bidirectional feature pyramid network (L-BiFPN), is proposed, which can efficiently fuse features of different scales. In addition, a loss function based on intersection over union with minimum point distance (MPDIoU) is introduced for bounding box regression, which gives faster convergence and more accurate regression results. Experimental results on the FLIR dataset show that the improved algorithm can accurately detect infrared road targets in real time, with gains of 3% and 2.2% in mean average precision at 50% IoU (mAP50) and mean average precision at 50%-95% IoU (mAP50-95), respectively, and reductions of 38.1%, 37.3% and 16.9% in the number of model parameters, the model weight, and floating-point operations (FLOPs), respectively. To further demonstrate the detection capability of the improved algorithm, it is tested on the public dataset PASCAL VOC, and the results show that F-YOLOv8 has excellent generalized detection performance.
Keywords: feature pyramid network; infrared road object detection; infrared images; F-YOLOv8; backbone networks; channel attention mechanism; spatial depth channel attention network; object detection; improved YOLOv8
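The abstract above uses an IoU variant based on minimum point distance (MPDIoU) for box regression. The sketch below follows the commonly cited MPDIoU definition, in which the squared distances between the two boxes' top-left corners and bottom-right corners, normalised by the squared image diagonal, are subtracted from the plain IoU. That formula is an assumption here, since the abstract does not state it, and the function names are hypothetical.

```python
def mpdiou(box_a, box_b, img_w, img_h):
    """MPDIoU of two boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Plain IoU.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union

    # Squared corner distances, normalised by the squared image diagonal.
    norm = img_w ** 2 + img_h ** 2
    top_left = (ax1 - bx1) ** 2 + (ay1 - by1) ** 2
    bottom_right = (ax2 - bx2) ** 2 + (ay2 - by2) ** 2
    return iou - top_left / norm - bottom_right / norm


def mpdiou_loss(pred_box, true_box, img_w, img_h):
    """Regression loss: zero only when the two boxes coincide."""
    return 1.0 - mpdiou(pred_box, true_box, img_w, img_h)


# Hypothetical boxes on a 100x100 image.
same = mpdiou((0, 0, 10, 10), (0, 0, 10, 10), 100, 100)
shifted = mpdiou((0, 0, 10, 10), (5, 5, 15, 15), 100, 100)
```

Unlike plain IoU, the corner-distance terms keep changing when the boxes do not overlap, so the loss still carries a signal in that case.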
Enhancing Classroom Behavior Recognition with Lightweight Multi-Scale Feature Fusion
17
Authors: Chuanchuan Wang, Ahmad Sufril Azlan Mohamed, Xiao Yang, Hao Zhang, Xiang Li, Mohd Halim Bin Mohd Noor. Source: Computers, Materials & Continua, 2025, No. 10, pp. 855-874, 20 pages
Classroom behavior recognition is a hot research topic that plays a vital role in assessing and improving the quality of classroom teaching. However, existing classroom behavior recognition methods struggle to reach high recognition accuracy on datasets with problems such as blurred pictures and inconsistent objects. To address this challenge, we propose an effective, lightweight object detector called the RFNet model (YOLO-FR). Specifically, for efficient multi-scale feature extraction, an effective feature pyramid shared convolutional (FPSC) module was designed to improve feature extraction from the input image by using convolutional layers with varying dilation rates in the backbone. Secondly, to address multi-scale variability in the scene, we design the Rep Ghost fusion Cross Stage Partial and Efficient Layer Aggregation Network (RGCSPELAN) to further improve network performance and reduce the amount of computation and the number of parameters. The model is evaluated experimentally on the SCB dataset3 and STBD-08 datasets. Experimental results indicate that, compared to the baseline model, the RFNet model increases mean average precision (mAP@50) from 69.6% to 71.0% on the SCB dataset3 and from 91.8% to 93.1% on the STBD-08 dataset. RFNet reaches a precision of 68.6%, surpassing the baseline method (YOLOv11) by 3.3%, and achieves the smallest model size (4.9 M) on the SCB dataset3. Finally, a comparison with other algorithms shows that it accurately detects student behavior in complex classroom environments, confirming that RFNet is well suited to real-time, efficient recognition of classroom behaviors.
Keywords: Classroom action recognition; YOLO-FR; feature pyramid shared convolutional; rep ghost cross stage partial efficient layer aggregation network (RGCSPELAN)
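Swin Transformer image denoising network fusing FPN and SFB
18
Authors: 袁姮, 华乾勇. Source: Computer Systems & Applications (计算机系统应用), 2025, No. 10, pp. 32-43, 12 pages
To improve an image denoising network's ability to capture local and global information, this paper proposes SwinFPSFNet, a Swin Transformer image denoising network based on a feature pyramid network (FPN) and a spatial frequency block (SFB). The network has three stages. In the shallow feature extraction stage, a feature pyramid network is designed to strengthen local feature extraction. In the deep feature extraction stage, a spatial frequency block is designed around fast Fourier convolution (FFC) to capture global and local information simultaneously. Finally, shallow and deep features are aggregated to further strengthen the network's denoising ability. In addition, a Gaussian noise degradation model is constructed and combined with several data augmentation strategies to improve the network's generalization. Experiments on the CBSD68, Kodak24, and Urban100 datasets show that, compared with mainstream denoising methods such as BM3D, DnCNN, FFDNet, and SwinIR, SwinFPSFNet accounts for both local and global information and shows clear advantages in suppressing noise and preserving image detail.
Keywords: image denoising; Swin Transformer; feature pyramid network; spatial frequency block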
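Detection of abnormal body-surface features of largemouth bass in recirculating aquaculture based on YOLOv8-DBCS
19
Authors: 朱明, 汪荣, 万鹏, 雷翔, 范豪. Source: Journal of Huazhong Agricultural University (华中农业大学学报) (PKU Core), 2026, No. 2, pp. 269-279, 11 pages
Largemouth bass (Micropterus salmoides) are prone to bacterial and viral infection in recirculating aquaculture, and in the early stage of disease the body surface shows abnormal features such as congestion and white spots. To prevent large-scale mortality in largemouth bass farming, YOLOv8-DBCS, a YOLOv8-based model for detecting abnormal body-surface features of largemouth bass, is proposed. First, DIStarNet, a backbone network built on StarNet with dynamic depthwise convolution (DIConv), is proposed; DIConv adapts the convolution operation through a dynamic kernel-weight mechanism and thereby captures multi-scale feature information effectively. Second, a weighted bi-directional feature pyramid network (BiFPN) is introduced in the neck to strengthen fusion of the multi-scale features from the backbone. In addition, a convolutional block attention module (CBAM) is added before the detection head to improve learning and prediction on images of abnormal body-surface features. Finally, the object recognition loss function is replaced with SIoU (SCYLLA-intersection over union) to improve the overlap between predicted and ground-truth boxes and further raise the model's recognition accuracy. The results show that YOLOv8-DBCS performs well in detection: its precision, recall, mAP50, and mAP50-95 are 95.8%, 92.4%, 97.5%, and 66.2%, which are 3.6, 4.9, 7.0, and 3.4 percentage points higher than the baseline model, respectively. In terms of model size, YOLOv8-DBCS has 1.85×10^6 parameters, 38.5% fewer than the baseline model.
Keywords: abnormal body-surface feature detection; feature extraction network; feature pyramid network; attention mechanism; loss function; largemouth bass; recirculating aquaculture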
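MRI image recognition model for spinal lesions based on a multi-attention mechanism
20
Authors: 周慧, 宋新景. Source: Journal of Frontiers of Computer Science and Technology (计算机科学与探索) (PKU Core), 2026, No. 1, pp. 291-300, 10 pages
Manual detection of spinal lesions is time-consuming and depends heavily on experts in the field, so automatic recognition of spinal lesions is necessary. However, spinal lesions vary widely in size, location, and structure, and spinal tumors closely resemble the rare disease brucellosis on imaging, which makes accurate localization and classification of spinal lesions challenging. To meet these challenges, an improved MRI image recognition model for spinal lesions is proposed. A bidirectional feature pyramid backbone based on ResNet-101 is introduced, and deformable convolution replaces the conventional convolution at different layers to obtain more feature information from the feature layers. Multiple attention mechanisms, including self-attention and soft attention, are added to different modules to fuse effectively the parts of the features that contribute most. To overcome the data imbalance among spinal tumors, infectious lesions, and the rare disease brucellosis, an improved balanced cross-entropy loss function is introduced. Validated on a clinical dataset provided by a hospital in Dalian, the model reaches a recognition precision of 94.2% and a recognition recall of 90.8%. Comparative experiments with other recognition models show that the method achieves better recognition performance.
Keywords: spinal lesion recognition; bidirectional feature pyramid; multi-attention mechanism; deformable convolution; multi-feature fusion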
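The abstract above introduces a balanced cross-entropy loss to counter class imbalance. The paper's improved variant is not described in the abstract, so the sketch below shows only the generic idea, assuming inverse-frequency class weights. The function names and the class counts are hypothetical.

```python
import math


def inverse_frequency_weights(class_counts):
    """Weight each class by total / (num_classes * count), so rare classes weigh more."""
    total = sum(class_counts)
    num_classes = len(class_counts)
    return [total / (num_classes * count) for count in class_counts]


def balanced_cross_entropy(probabilities, label, weights):
    """Class-weighted cross-entropy for one sample with predicted class probabilities."""
    return -weights[label] * math.log(probabilities[label])


# Hypothetical counts for three lesion classes, from common to rare.
weights = inverse_frequency_weights([800, 150, 50])

# The same predicted probability costs more when the true class is the rare one.
loss_common = balanced_cross_entropy([0.6, 0.2, 0.2], 0, weights)
loss_rare = balanced_cross_entropy([0.2, 0.2, 0.6], 2, weights)
```

With these weights a mistake on the rare class contributes far more to the loss than the same mistake on the common class, which pushes the model away from always predicting the majority class.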