688 articles found
1. Feature pyramid attention network for audio-visual scene classification (Cited by: 1)
Authors: Liguang Zhou, Yuhongze Zhou, Xiaonan Qi, Junjie Hu, Tin Lun Lam, Yangsheng Xu. CAAI Transactions on Intelligence Technology, 2025, No. 2, pp. 359-374 (16 pages)
Audio-visual scene classification (AVSC) poses a formidable challenge owing to the intricate spatial-temporal relationships exhibited by audio-visual signals, coupled with the complex spatial patterns of objects and textures found in visual images. Recent studies have focused predominantly on extracting features from diverse neural network structures, inadvertently neglecting the acquisition of semantically meaningful regions and crucial components within audio-visual data. The authors present a feature pyramid attention network (FPANet) for audio-visual scene understanding, which extracts semantically significant characteristics from audio-visual data. The authors' approach builds multi-scale hierarchical features of sound spectrograms and visual images using a feature pyramid representation and localises the semantically relevant regions with a feature pyramid attention module (FPAM). A dimension alignment (DA) strategy is employed to align feature maps from multiple layers, a pyramid spatial attention (PSA) to spatially locate essential regions, and a pyramid channel attention (PCA) to pinpoint significant temporal frames. Experiments on visual scene classification (VSC), audio scene classification (ASC), and AVSC tasks demonstrate that FPANet achieves performance on par with state-of-the-art (SOTA) approaches, with a 95.9 F1-score on the ADVANCE dataset and a relative improvement of 28.8%. Visualisation results show that FPANet can prioritise semantically meaningful areas in audio-visual signals.
Keywords: dimension alignment; feature pyramid attention network; pyramid channel attention; pyramid spatial attention; semantic relevant regions
2. Double Self-Attention Based Fully Connected Feature Pyramid Network for Field Crop Pest Detection
Authors: Zijun Gao, Zheyi Li, Chunqi Zhang, Ying Wang, Jingwen Su. Computers, Materials & Continua, 2025, No. 6, pp. 4353-4371 (19 pages)
Pest detection techniques are helpful in reducing the frequency and scale of pest outbreaks; however, their application in the actual agricultural production process is still challenging owing to the problems of interspecies similarity, multi-scale, and background complexity of pests. To address these problems, this study proposes an FD-YOLO pest target detection model. The FD-YOLO model uses a Fully Connected Feature Pyramid Network (FC-FPN) instead of a PANet in the neck, which can adaptively fuse multi-scale information so that the model can retain small-scale target features in the deep layer, enhance large-scale target features in the shallow layer, and enhance the multiplexing of effective features. A dual self-attention module (DSA) is then embedded in the C3 module of the neck, which captures the dependencies between the information in both spatial and channel dimensions, effectively enhancing global features. We selected 16 types of pests that widely damage field crops in the IP102 pest dataset, which were used as our dataset after data supplementation and enhancement. The experimental results showed that FD-YOLO's mAP@0.5 improved by 6.8% compared to YOLOv5, reaching 82.6% and 19.1%–5% better than other state-of-the-art models. This method provides an effective new approach for detecting similar or multiscale pests in field crops.
Keywords: Pest detection; YOLOv5; feature pyramid network; transformer; attention module
3. Multi-scale object detection by top-down and bottom-up feature pyramid network (Cited by: 14)
Authors: ZHAO Baojun, ZHAO Boya, TANG Linbo, WANG Wenzheng, WU Chen. Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2019, No. 1, pp. 1-12 (12 pages)
With the advance of object detection technology, especially deep neural networks, many related tasks, such as medical applications and industrial automation, have achieved great success. However, the detection of objects with multiple aspect ratios and scales is still a key problem. This paper proposes a top-down and bottom-up feature pyramid network (TDBU-FPN), which combines multi-scale feature representation and anchor generation at multiple aspect ratios. First, in order to build the multi-scale feature map, this paper puts a number of fully convolutional layers after the backbone. Second, to link neighboring feature maps, top-down and bottom-up flows are adopted to introduce context information via top-down flow and supplement suboriginal information via bottom-up flow. The top-down flow refers to the deconvolution procedure, and the bottom-up flow refers to the pooling procedure. Third, the problem of adapting to different object aspect ratios is tackled via many anchor shapes with different aspect ratios on each multi-scale feature map. The proposed method is evaluated on the pattern analysis, statistical modeling and computational learning visual object classes (PASCAL VOC) dataset and reaches an accuracy of 79%, which exhibits a 1.8% improvement with a detection speed of 23 fps.
Keywords: convolutional neural network (CNN); feature pyramid network (FPN); object detection; deconvolution
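The idea of placing anchor shapes with several aspect ratios on each pyramid level can be illustrated with a minimal sketch. It assumes the common convention that every anchor at a level keeps the same area while its width-to-height ratio varies; the function name, level names, base sizes, and ratios below are illustrative assumptions and are not taken from the TDBU-FPN paper.

```python
import math

def anchor_shapes(base_size, aspect_ratios):
    """Return (width, height) pairs whose area is base_size**2
    and whose width/height equals each requested ratio."""
    shapes = []
    for ratio in aspect_ratios:
        width = base_size * math.sqrt(ratio)
        height = base_size / math.sqrt(ratio)
        shapes.append((width, height))
    return shapes

# Hypothetical pyramid: finer feature maps get smaller anchors.
pyramid_base_sizes = {"stride_8": 32, "stride_16": 64, "stride_32": 128}
ratios = [0.5, 1.0, 2.0]  # tall, square, wide (assumed values)

anchors = {
    level: anchor_shapes(size, ratios)
    for level, size in pyramid_base_sizes.items()
}
```

Each level therefore carries the same set of ratios at a different scale, which is one simple way to cover objects that differ in both size and shape.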
4. Bidirectional parallel multi-branch convolution feature pyramid network for target detection in aerial images of swarm UAVs (Cited by: 4)
Authors: Lei Fu, Wen-bin Gu, Wei Li, Liang Chen, Yong-bao Ai, Hua-lei Wang. Defence Technology (防务技术) (SCIE, EI, CAS, CSCD), 2021, No. 4, pp. 1531-1541 (11 pages)
In this paper, based on a bidirectional parallel multi-branch feature pyramid network (BPMFPN), a novel one-stage object detector called BPMFPN Det is proposed for real-time detection of ground multi-scale targets by swarm unmanned aerial vehicles (UAVs). First, the bidirectional parallel multi-branch convolution modules are used to construct the feature pyramid to enhance the feature expression abilities of different scale feature layers. Next, the feature pyramid is integrated into the single-stage object detection framework to ensure real-time performance. In order to validate the effectiveness of the proposed algorithm, experiments are conducted on four datasets. For the PASCAL VOC dataset, the proposed algorithm achieves a mean average precision (mAP) of 85.4 on the VOC 2007 test set. On the detection in optical remote sensing (DIOR) dataset, the proposed algorithm achieves 73.9 mAP. On the vehicle detection in aerial imagery (VEDAI) dataset, the detection accuracy of small land vehicle (slv) targets reaches 97.4 mAP. On the unmanned aerial vehicle detection and tracking (UAVDT) dataset, the proposed BPMFPN Det achieves an mAP of 48.75. Compared with the previous state-of-the-art methods, the results obtained by the proposed algorithm are more competitive. The experimental results demonstrate that the proposed algorithm can effectively solve the problem of real-time detection of ground multi-scale targets in aerial images of swarm UAVs.
Keywords: Aerial images; Object detection; feature pyramid networks; Multi-scale feature fusion; Swarm UAVs
5. Dual Attention Based Feature Pyramid Network (Cited by: 5)
Authors: Huijun Xing, Shuai Wang, Dezhi Zheng, Xiaotong Zhao. China Communications (SCIE, CSCD), 2020, No. 8, pp. 242-252 (11 pages)
Object detection is an essential part of research on scenarios such as automatic driving and pedestrian detection. Among multiple types of target objects, the identification of small-scale objects faces significant challenges. We introduce a new feature pyramid framework called Dual Attention based Feature Pyramid Network (DAFPN), which is designed to address the difficulty of multi-scale object recognition. In DAFPN, the attention mechanism is introduced in the top-down pathway and the lateral pathway, where spatial attention and channel attention participate, respectively, so that the pyramidal feature maps are generated with enhanced spatial and channel interdependencies, which bring more semantic information to the feature pyramid. The experiments are carried out on the COCO data set, which contains a considerable quantity of small-scale objects. The results verify the improved performance of DAFPN compared with the original Feature Pyramid Network (FPN), specifically for identification at small scale. The proposed DAFPN is promising for object detection in an era full of intelligent machines that need to detect multi-scale objects.
Keywords: object detection; convolutional neural networks; feature pyramid
6. Neighborhood fusion-based hierarchical parallel feature pyramid network for object detection (Cited by: 3)
Authors: Mo Lingfei, Hu Shuming. Journal of Southeast University (English Edition) (EI, CAS), 2020, No. 3, pp. 252-263 (12 pages)
In order to improve the detection accuracy of small objects, a neighborhood fusion-based hierarchical parallel feature pyramid network (NFPN) is proposed. Unlike the layer-by-layer structure adopted in the feature pyramid network (FPN) and deconvolutional single shot detector (DSSD), where the bottom layer of the feature pyramid network relies on the top layer, NFPN builds the feature pyramid network with no connections between the upper and lower layers. That is, it only fuses shallow features on similar scales. NFPN is highly portable and can be embedded in many models to further boost performance. Extensive experiments on PASCAL VOC 2007, 2012, and COCO datasets demonstrate that the NFPN-based SSD without intricate tricks can exceed the DSSD model in terms of detection accuracy and inference speed, especially for small objects, e.g., 4% to 5% higher mAP (mean average precision) than SSD, and 2% to 3% higher mAP than DSSD. On the VOC 2007 test set, the NFPN-based SSD with 300×300 input reaches 79.4% mAP at 34.6 frame/s, and the mAP can rise to 82.9% after using the multi-scale testing strategy.
Keywords: computer vision; deep convolutional neural network; object detection; hierarchical parallel feature pyramid network; multi-scale feature fusion
7. An Improved Data-Driven Topology Optimization Method Using Feature Pyramid Networks with Physical Constraints (Cited by: 1)
Authors: Jiaxiang Luo, Yu Li, Weien Zhou, Zhiqiang Gong, Zeyu Zhang, Wen Yao. Computer Modeling in Engineering & Sciences (SCIE, EI), 2021, No. 9, pp. 823-848 (26 pages)
Deep learning for topology optimization has been extensively studied to reduce the cost of calculation in recent years. However, the loss function of the above method is mainly based on pixel-wise errors from the image perspective, which cannot embed the physical knowledge of topology optimization. Therefore, this paper presents an improved deep learning model to alleviate the above difficulty effectively. The feature pyramid network (FPN), a kind of deep learning model, is trained to learn the inherent physical law of topology optimization itself, of which the loss function is composed of pixel-wise errors and physical constraints. Since the calculation of physical constraints requires finite element analysis (FEA) with high calculating costs, the strategy of adjusting the time when physical constraints are added is proposed to achieve the balance between the training cost and the training effect. Then, two classical topology optimization problems are investigated to verify the effectiveness of the proposed method. The results show that the developed model using a small number of samples can quickly obtain the optimization structure without any iteration, which has not only high pixel-wise accuracy but also good physical performance.
Keywords: Topology optimization; deep learning; feature pyramid networks; finite element analysis; physical constraints
8. Two-Layer Attention Feature Pyramid Network for Small Object Detection
Authors: Sheng Xiang, Junhao Ma, Qunli Shang, Xianbao Wang, Defu Chen. Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, No. 10, pp. 713-731 (19 pages)
Effective small object detection is crucial in various applications including urban intelligent transportation and pedestrian detection. However, small objects are difficult to detect accurately because they contain less information. Many current methods, particularly those based on Feature Pyramid Network (FPN), address this challenge by leveraging multi-scale feature fusion. However, existing FPN-based methods often suffer from inadequate feature fusion due to varying resolutions across different layers, leading to suboptimal small object detection. To address this problem, we propose the Two-layer Attention Feature Pyramid Network (TA-FPN), featuring two key modules: the Two-layer Attention Module (TAM) and the Small Object Detail Enhancement Module (SODEM). TAM uses the attention module to make the network more focused on the semantic information of the object and fuse it to the lower layer, so that each layer contains similar semantic information, to alleviate the problem of small object information being submerged due to semantic gaps between different layers. At the same time, SODEM is introduced to strengthen the local features of the object, suppress background noise, enhance the information details of the small object, and fuse the enhanced features to other feature layers to ensure that each layer is rich in small object information, to improve small object detection accuracy. Our extensive experiments on challenging datasets such as Microsoft Common Objects in Context (MS COCO) and Pattern Analysis, Statistical Modelling and Computational Learning Visual Object Classes (PASCAL VOC) demonstrate the validity of the proposed method. Experimental results show a significant improvement in small object detection accuracy compared to state-of-the-art detectors.
Keywords: Small object detection; two-layer attention module; small object detail enhancement module; feature pyramid network
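9. ResFPN: a multi-scale object detection method that enlarges the effective receptive field and improves FPN (Cited by: 3)
Authors: Yang Yang (杨扬), Tang Xiaofen (唐晓芬). Computer Engineering and Applications (《计算机工程与应用》) (PKU Core), 2025, No. 10, pp. 247-257 (11 pages)
Several problems degrade model performance in multi-scale object detection: the effective receptive field of the backbone is far smaller than its theoretical receptive field, the receptive field is sparsely distributed, and the feature pyramid network (FPN) loses channel information when it unifies the number of channels in its lateral connections. To address them, this paper proposes ResFPN, a multi-scale object detection algorithm that enlarges the effective receptive field and improves FPN through multi-feature fusion. For the gap between the effective and theoretical receptive fields of the backbone, a multi-branch dilated convolution (MBD) module and a multi-branch pooling (MBP) module are designed, which enlarge the receptive field by learning to fuse spatial features at different scales. For the sparse receptive-field distribution, a lightweight channel interactive fusion (CIF) module is proposed; it uses a two-branch structure and stacks a different number of depthwise separable convolutions in each branch to learn the dependencies between pixels and strengthen the feature representation. For the loss of channel information caused by FPN unifying the channel count with 1×1 convolutions, SubPixel convolution is used to extract the output features of layer C5, which preserves the original rich semantic information, and additional bidirectional paths are introduced to supplement FPN's channel information, although this may produce redundant information. A global context (GC) module is therefore introduced after the additional bidirectional paths, and its bottleneck transform module further fuses feature information and reduces redundancy. Experiments show that ResFPN effectively resolves the sparse receptive-field distribution and doubles the receptive field of the backbone, and that the proposed remedy for FPN's channel-information loss also performs well in multi-scale object detection. Compared with the typical Faster R-CNN network, the average precision for large, medium, and small objects on the challenging MS COCO dataset improves by 2.2, 1.6, and 2.0 percentage points respectively, and detection also improves relative to other detectors.
Keywords: object detection; convolutional neural network; multi-scale object detection; receptive field; feature pyramid network (FPN)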
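The link between dilation and receptive field mentioned in the abstract can be made concrete with the standard arithmetic for the theoretical receptive field. This is a minimal sketch of that textbook calculation only; it does not model the effective receptive field the paper discusses, and the function names and layer lists are illustrative assumptions rather than the ResFPN configuration.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)

def theoretical_receptive_field(layers):
    """Theoretical receptive field of a stack of conv layers.

    layers: list of (kernel_size, dilation, stride) tuples, input to output.
    """
    rf = 1    # receptive field of a single input pixel
    jump = 1  # distance in input pixels between adjacent outputs
    for kernel_size, dilation, stride in layers:
        rf += (effective_kernel_size(kernel_size, dilation) - 1) * jump
        jump *= stride
    return rf

# Hypothetical comparison: three plain 3x3 convs versus three 3x3 convs
# with dilation rates 1, 2, 4 (all stride 1).
plain = theoretical_receptive_field([(3, 1, 1)] * 3)
dilated = theoretical_receptive_field([(3, 1, 1), (3, 2, 1), (3, 4, 1)])
```

With the same number of parameters per layer, the dilated stack covers a much wider input window, which is the motivation for multi-branch dilated designs.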
10. Hybrid receptive field network for small object detection on drone view (Cited by: 1)
Authors: Zhaodong CHEN, Hongbing JI, Yongquan ZHANG, Wenke LIU, Zhigang ZHU. Chinese Journal of Aeronautics, 2025, No. 2, pp. 322-338 (17 pages)
Drone-based small object detection is of great significance in practical applications such as military actions, disaster rescue, and transportation. However, the severe scale differences in objects captured by drones and the lack of detail information for small-scale objects make drone-based small object detection a formidable challenge. To address these issues, we first develop a mathematical model to explore how changing receptive fields impacts the polynomial fitting results. Subsequently, based on the obtained conclusions, we propose a simple but effective Hybrid Receptive Field Network (HRFNet), whose modules include Hybrid Feature Augmentation (HFA), Hybrid Feature Pyramid (HFP) and Dual Scale Head (DSH). Specifically, HFA employs parallel dilated convolution kernels of different sizes to extend shallow features with different receptive fields, with the aim of improving the multi-scale adaptability of the network; HFP enhances the perception of small objects by capturing contextual information across layers, while DSH reconstructs the original prediction head utilizing a set of high-resolution features and ultrahigh-resolution features. In addition, in order to train HRFNet, the corresponding dual-scale loss function is designed. Finally, comprehensive evaluation results on public benchmarks such as VisDrone-DET and TinyPerson demonstrate the robustness of the proposed method. Most impressively, the proposed HRFNet achieves an mAP of 51.0 on VisDrone-DET with 29.3 M parameters, which outperforms the extant state-of-the-art detectors. HRFNet also performs excellently in complex scenarios captured by drones, achieving the best performance on the CS-Drone dataset we built.
Keywords: Drone remote sensing; Object detection on drone view; Small object detector; Hybrid receptive field; feature pyramid network; feature augmentation; Multi-scale object detection
11. Weld Defect Monitoring Based on Two-Stage Convolutional Neural Network
Authors: XIAO Wenbo, XIONG Jiakai, YU Lesheng, HE Yinshui, MA Guohong. Journal of Shanghai Jiaotong University (Science), 2025, No. 2, pp. 291-299 (9 pages)
Zn vapour is easily generated on the surface when fusion welding galvanized steel sheet, resulting in the formation of defects. Rapidly developing computer vision sensing technology collects weld images in the welding process, then obtains laser fringe information through digital image processing, identifies welding defects, and finally realizes online control of weld defects. The performance of a convolutional neural network is related to its structure and the quality of the input image. The acquired original images are labeled with LabelMe, and repeated attempts are made to determine the appropriate filtering and edge detection image preprocessing methods. Two-stage convolutional neural networks with different structures are built on the TensorFlow deep learning framework, different thresholds of intersection over union are set, and deep learning methods are used to evaluate the collected original images and the preprocessed images separately. Comparing the test results, the comprehensive performance of the improved feature pyramid networks algorithm based on the basic network VGG16 is lower than that of the basic network Resnet101. Edge detection of the image significantly improves the accuracy of the model. Adding blur reduces the accuracy of the model slightly; however, the overall performance of the improved algorithm is still relatively good, which proves the stability of the algorithm. The self-developed software inspection system can be used for image preprocessing and defect recognition, and to record the number and location of typical defects in continuous welds.
Keywords: defects monitoring; image preprocessing; Resnet101; feature pyramid network
12. Infrared road object detection algorithm based on spatial depth channel attention network and improved YOLOv8
Authors: LI Song, SHI Tao, JING Fangke, CUI Jie. Optoelectronics Letters, 2025, No. 8, pp. 491-498 (8 pages)
Aiming at the problems of low detection accuracy and large model size of existing object detection algorithms applied to complex road scenes, an improved you only look once version 8 (YOLOv8) object detection algorithm for infrared images, F-YOLOv8, is proposed. First, a spatial-to-depth network replaces the strided convolution or pooling layer of the traditional backbone network. At the same time, it is combined with the channel attention mechanism so that the neural network focuses on the channels with large weight values to better extract low-resolution image feature information. Then an improved feature pyramid network, the lightweight bidirectional feature pyramid network (L-BiFPN), is proposed, which can efficiently fuse features of different scales. In addition, an intersection over union loss based on the minimum point distance (MPDIoU) is introduced for bounding box regression, which obtains faster convergence speed and more accurate regression results. Experimental results on the FLIR dataset show that the improved algorithm can accurately detect infrared road targets in real time with 3% and 2.2% enhancement in mean average precision at 50% IoU (mAP50) and mean average precision at 50%-95% IoU (mAP50-95), respectively, and 38.1%, 37.3% and 16.9% reduction in the number of model parameters, the model weight, and floating-point operations (FLOPs), respectively. To further demonstrate the detection capability of the improved algorithm, it is tested on the public dataset PASCAL VOC, and the results show that F-YOLOv8 has excellent generalized detection performance.
Keywords: feature pyramid network; infrared road object detection; infrared images; F-YOLOv8; backbone networks; channel attention mechanism; spatial depth channel attention network; object detection; improved YOLOv8
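The abstract names MPDIoU without giving its formula. The sketch below uses the commonly cited definition, which is an assumption here rather than something confirmed by this listing: IoU minus the squared distances between the two boxes' top-left corners and between their bottom-right corners, each normalised by the squared image diagonal. Boxes are `(x1, y1, x2, y2)` tuples and the function names are illustrative.

```python
def mpdiou(box_a, box_b, img_w, img_h):
    """MPDIoU under the assumed definition:
    IoU - d_topleft^2 / (w^2 + h^2) - d_bottomright^2 / (w^2 + h^2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Plain IoU.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0

    # Squared corner distances, normalised by the squared image diagonal.
    d_top_left = (ax1 - bx1) ** 2 + (ay1 - by1) ** 2
    d_bottom_right = (ax2 - bx2) ** 2 + (ay2 - by2) ** 2
    norm = img_w ** 2 + img_h ** 2
    return iou - d_top_left / norm - d_bottom_right / norm

def mpdiou_loss(pred_box, target_box, img_w, img_h):
    """Regression loss: zero when the boxes coincide."""
    return 1.0 - mpdiou(pred_box, target_box, img_w, img_h)
```

Unlike plain IoU, the corner-distance terms still change when the boxes do not overlap, so the loss keeps distinguishing near misses from distant ones.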
13. Enhancing Classroom Behavior Recognition with Lightweight Multi-Scale Feature Fusion
Authors: Chuanchuan Wang, Ahmad Sufril Azlan Mohamed, Xiao Yang, Hao Zhang, Xiang Li, Mohd Halim Bin Mohd Noor. Computers, Materials & Continua, 2025, No. 10, pp. 855-874 (20 pages)
Classroom behavior recognition is an active research topic that plays a vital role in assessing and improving the quality of classroom teaching. However, existing classroom behavior recognition methods struggle to reach high recognition accuracy on datasets with problems such as blurred images and inconsistent objects. To address this challenge, we propose an effective, lightweight object detection method called the RFNet model (YOLO-FR). Specifically, for efficient multi-scale feature extraction, a feature pyramid shared convolutional (FPSC) module is designed in the backbone to improve feature extraction by applying convolutional layers with varying dilation rates. Secondly, to address the problem of multi-scale variability in the scene, we design the Rep Ghost fusion Cross Stage Partial and Efficient Layer Aggregation Network (RGCSPELAN) to further improve network performance and reduce the amount of computation and the number of parameters. In addition, experimental evaluation is conducted on the SCB dataset3 and STBD-08 datasets. Experimental results indicate that, compared to the baseline model, the RFNet model increases mean average precision (mAP@50) from 69.6% to 71.0% on the SCB dataset3 and from 91.8% to 93.1% on the STBD-08 dataset. The RFNet approach achieves a precision of 68.6%, surpassing the baseline method (YOLOv11) by 3.3%, and achieves the smallest model size (4.9 M) on the SCB dataset3. Finally, in comparison with other algorithms, it accurately detects student behavior in complex classroom environments, and the results confirm that RFNet is well suited to recognizing classroom behaviors efficiently and in real time.
Keywords: Classroom action recognition; YOLO-FR; feature pyramid shared convolutional; rep ghost cross stage partial efficient layer aggregation network (RGCSPELAN)
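14. A Swin Transformer image denoising network fusing FPN and SFB
Authors: Yuan Heng (袁姮), Hua Qianyong (华乾勇). Computer Systems & Applications (《计算机系统应用》), 2025, No. 10, pp. 32-43 (12 pages)
To improve the ability of image denoising networks to capture local and global information, this paper proposes a Swin Transformer image denoising network (SwinFPSFNet) based on a feature pyramid network (FPN) and a spatial frequency block (SFB). The network consists of three stages. In the shallow feature extraction stage, a feature pyramid network is designed to strengthen local feature extraction. In the deep feature extraction stage, a spatial frequency block is designed with fast Fourier convolution (FFC) to capture global and local information simultaneously. Finally, shallow and deep features are aggregated to further strengthen the network's denoising ability. In addition, a Gaussian noise degradation model is constructed and combined with several data augmentation strategies to improve the network's generalization. Experiments on the CBSD68, Kodak24, and Urban100 datasets show that, compared with current mainstream denoising methods such as BM3D, DnCNN, FFDNet, and SwinIR, SwinFPSFNet accounts for both local and global information and shows clear advantages in suppressing noise and preserving image detail.
Keywords: image denoising; Swin Transformer; feature pyramid network; spatial frequency block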
15. Hyperspectral Satellite Image Classification Based on Feature Pyramid Networks With 3D Convolution
Authors: CHEN Cheng, PENG Pan, TAO Wei, ZHAO Hui. Journal of Shanghai Jiaotong University (Science), 2025, No. 6, pp. 1073-1084 (12 pages)
Recent advances in convolutional neural networks (CNNs) have fostered progress in object recognition and semantic segmentation, which in turn has improved the performance of hyperspectral image (HSI) classification. Nevertheless, the difficulty of high-dimensional feature extraction and the shortage of training samples seriously hinder the future development of HSI classification. In this paper, we propose a novel algorithm for HSI classification based on a three-dimensional (3D) CNN and a feature pyramid network (FPN), called 3D-FPN. The framework contains a principal component analysis, a feature extraction structure and a logistic regression. Specifically, the FPN built with 3D convolutions not only retains the advantages of 3D convolution to fully extract the spectral-spatial feature maps, but also concentrates on more detailed information and performs multi-scale feature fusion. This method avoids excessive model complexity and is suitable for small-sample hyperspectral classification with varying categories and spatial resolutions. In order to test the performance of our proposed 3D-FPN method, rigorous experimental analysis was performed on three public hyperspectral data sets and hyperspectral data of the GF-5 satellite. Quantitative and qualitative results indicated that our proposed method attained the best performance among current state-of-the-art end-to-end deep learning-based methods.
Keywords: hyperspectral image (HSI); deep learning; feature pyramid network (FPN); spectral-spatial feature extraction
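16. An MRI recognition model for spinal lesions based on multiple attention mechanisms
Authors: Zhou Hui (周慧), Song Xinjing (宋新景). Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》) (PKU Core), 2026, No. 1, pp. 291-300 (10 pages)
Manual detection of spinal lesions is time-consuming and depends heavily on experts in the field, so automatic recognition of spinal lesions is highly desirable. However, spinal lesions vary widely in size, location, and structure, and spinal tumors closely resemble the rare disease brucellosis on imaging, so accurately localizing and classifying spinal lesions is challenging. To meet these challenges, an improved MRI recognition model for spinal lesions is proposed. A bidirectional feature pyramid backbone based on ResNet-101 is introduced, and deformable convolution replaces conventional convolution at different layers to obtain more information from the feature layers. Multiple attention mechanisms, including self-attention and soft attention, are added to different modules to effectively fuse the parts of the features that contribute most. To overcome the data imbalance among spinal tumors, infectious lesions, and the rare brucellosis cases, an improved balanced cross-entropy loss function is introduced. Validated on a clinical dataset provided by a hospital in Dalian, the model reaches a recognition precision of 94.2% and a recognition recall of 90.8%. Comparative experiments with other recognition models show that the method achieves better recognition performance.
Keywords: spinal lesion recognition; bidirectional feature pyramid; multiple attention mechanisms; deformable convolution; multi-feature fusion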
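17. Text detection with kernel-sharing dilated convolutions and attention-guided FPN (Cited by: 4)
Authors: Meng Yuebo (孟月波), Jin Dan (金丹), Liu Guanghui (刘光辉), Xu Shengjun (徐胜军), Han Jiuqiang (韩九强), Shi Dewang (石德旺). Optics and Precision Engineering (《光学精密工程》) (EI, CAS, CSCD, PKU Core), 2021, No. 8, pp. 1955-1967 (13 pages)
High-resolution images exhibit large differences in feature scale, which makes fine-grained features hard to capture and multi-scale features hard to fuse. To address this, a text detection method for complex scenes based on Kernel-Sharing Dilated Convolutions and Attention-guided FPN (KDA-FPN) is proposed, together with an Intersection Over Minimum (IOM) post-processing strategy that reduces the mask overlap caused by the widely varying aspect ratios of text and improves detection. First, the model uses ResNet50 as the backbone and an FPN structure to capture multi-scale features. Then, dilated convolution enlarges the receptive field, improves multi-scale capture of feature information, and mines fine-grained text features in depth, while kernel sharing reduces the number of model parameters and the computational cost. At the same time, a Context Attention Module (CxAM) captures the semantic relationships among multiple receptive fields, and a Content Attention Module (CnAM) precisely localizes targets, which strengthens multi-scale fusion and improves feature-map quality. Finally, the candidate boxes predicted for the same text region are sorted by size, and the ratio of the intersection area between the largest box and a neighboring text box to the area of the smaller box is used as the criterion for filtering candidate boxes, which suppresses mask overlap in the detection results and yields precise text detection. In comparative experiments on the ICDAR2013, ICDAR2015, and TotalText datasets, the model achieves precision and recall of 95.3 and 90.4 for horizontal scene text, 87.1 and 84.2 for oriented text, and 69.6 and 57.3 for arbitrarily shaped text. The proposed algorithm effectively overcomes the influence of image resolution, text shape, and text length, improves detection accuracy, and produces more precise text boundaries.
Keywords: text detection; attention structure; kernel-sharing dilated convolution; feature pyramid network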
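The intersection-over-minimum criterion described in the abstract can be sketched in a few lines. The `iom` function follows the abstract directly (intersection area divided by the area of the smaller box); the greedy filtering loop around it, the axis-aligned `(x1, y1, x2, y2)` box format, and the 0.5 threshold are assumptions made for illustration and are not taken from the paper, which works with text masks.

```python
def box_area(box):
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)

def iom(box_a, box_b):
    """Intersection over minimum: overlap area / area of the smaller box."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    smaller = min(box_area(box_a), box_area(box_b))
    return (inter_w * inter_h) / smaller if smaller > 0 else 0.0

def filter_by_iom(boxes, threshold=0.5):
    """Keep boxes from largest to smallest, dropping any box whose IoM
    with an already kept box reaches the (assumed) threshold."""
    kept = []
    for box in sorted(boxes, key=box_area, reverse=True):
        if all(iom(box, other) < threshold for other in kept):
            kept.append(box)
    return kept
```

A small box nested inside a long text box has a low IoU but an IoM of 1, which is why this measure suits text regions whose aspect ratios vary widely.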
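18. Enhanced FPN with an improved loss function for underwater small object detection (Cited by: 12)
Authors: Qiao Meiying (乔美英), Shi Jianke (史建柯), Li Bingfeng (李冰锋), Zhao Yan (赵岩), Shi Youqiang (史有强). Journal of Computer-Aided Design & Computer Graphics (《计算机辅助设计与图形学学报》) (EI, CSCD, PKU Core), 2023, No. 4, pp. 525-537 (13 pages)
To address the low detection accuracy of small underwater targets, which carry little feature information and are hard to localize precisely, an enhanced feature pyramid network (FPN) is proposed. First, a collaborative non-local attention module is added to the FPN upsampling process, using convolution and horizontal and vertical pooling to mine the static and dynamic context of the feature maps. Second, a trident feature enhancement module is added to the FPN channel-adjustment process, using parallel dilated convolutions and efficient channel attention (ECANet) to capture multi-scale spatial and channel feature information. Finally, a linear gain coefficient for the regression loss is introduced into the regression loss function of the Faster R-CNN algorithm to increase the penalty on the regression offsets of multi-scale targets and improve localization accuracy. In experiments on the dataset provided by the 2020 National Underwater Object Detection Competition, the PASCAL VOC dataset, and the MS COCO dataset, the algorithm improves accuracy over the Faster R-CNN baseline by 2.8%, 2.2%, and 2.5% respectively, which demonstrates its effectiveness.
Keywords: underwater object detection; small object detection; feature pyramid network; loss function; Faster R-CNN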
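19. Real-time garbage classification and detection with an SSD improved by MobileNetV2 and IFPN (Cited by: 15)
Authors: Zhao Shan (赵珊), Liu Zilu (刘子路), Zheng Ailing (郑爱玲), Gao Yu (高雨). Journal of Computer Applications (《计算机应用》) (CSCD, PKU Core), 2022, No. S01, pp. 106-111 (6 pages)
To address the varying target sizes and the low accuracy on small targets in garbage classification and detection, a classification and detection method is built on an improved SSD model that uses an implicit feature pyramid network (IFPN) and MobileNetV2 to classify and detect garbage in real time. First, an improved MobileNetV2 is introduced into SSD and a spatial pyramid pooling module with dilated convolution (ASPP) is added, which lowers the computational complexity of the model while maintaining real-time speed and accuracy. Second, IFPN fuses SSD features level by level from the deep layers to the shallow layers so that small targets are detected more accurately. Finally, the Focal Loss function is used to adjust the weighting between positive and negative samples. Experimental results show that, at a threshold of 0.4, the proposed method improves mean average precision (mAP) by 4.84 percentage points over the conventional SSD and reduces detection time by 72.7%, meeting the requirements that edge computing devices place on the model.
Keywords: garbage classification; object detection; MobileNetV2; SSD; spatial pyramid pooling; implicit feature pyramid network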
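How Focal Loss reweights positive and negative samples can be seen in a minimal binary sketch. It uses the standard form FL = -alpha_t * (1 - p_t)^gamma * log(p_t); the defaults alpha = 0.25 and gamma = 2.0 are commonly used values assumed here for illustration, not settings reported by the paper, and the function name is hypothetical.

```python
import math

def focal_loss(prob, label, alpha=0.25, gamma=2.0):
    """Binary focal loss for one prediction.

    prob:  predicted probability of the positive class, in [0, 1]
    label: 1 for a positive sample, 0 for a negative sample
    """
    p_t = prob if label == 1 else 1.0 - prob
    alpha_t = alpha if label == 1 else 1.0 - alpha
    p_t = min(max(p_t, 1e-12), 1.0)  # guard against log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

# Hypothetical values: an easy negative versus a hard negative.
easy_negative = focal_loss(0.05, 0)
hard_negative = focal_loss(0.80, 0)
```

The `(1 - p_t)^gamma` factor shrinks the loss of well-classified samples, so the many easy background boxes produced by a dense detector contribute far less than the few hard ones.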
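20. Underwater object detection fusing a high-frequency enhancement network with FPN (Cited by: 3)
Authors: Qiao Meiying (乔美英), Zhao Yan (赵岩), Shi Jianke (史建柯), Shi Youqiang (史有强). Electronic Measurement Technology (《电子测量技术》) (PKU Core), 2023, No. 13, pp. 146-154 (9 pages)
To address the low target contrast and the multi-scale nature of underwater images in underwater object detection, an algorithm that fuses a high-frequency enhancement network with a feature pyramid network (FPN) is proposed to improve the extraction of edge, contour, and low-level information of underwater targets. First, octave convolution is introduced to decompose the output features of the convolutional layers by frequency, separating the feature maps extracted by the backbone into high- and low-frequency information. Because both the contour information and the noise of underwater targets lie in the high-frequency features, a channel attention mechanism that adaptively enhances channel information is introduced into the high-frequency channel, forming a high-frequency enhanced convolution that strengthens useful contour features and suppresses noise. Second, the enhanced high-frequency components are merged into the shallow layers of FPN, which improves the original FPN's representation of multi-scale underwater targets and reduces missed detections of multi-scale targets. Finally, the method is combined with the Faster R-CNN baseline and evaluated on the dataset provided by the National Underwater Robot Competition. The improved algorithm reaches a recognition accuracy of 78.83%, 2.61% higher than the baseline, and retains advantages in accuracy and real-time detection over other types of object detectors, which demonstrates the effectiveness of raising foreground-background contrast from the frequency-domain perspective of the feature maps.
Keywords: deep learning; underwater object detection; small object detection; feature pyramid; octave convolution; channel attention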