1,471 articles found
1. Feature pyramid attention network for audio-visual scene classification (cited by: 1)
Authors: Liguang Zhou, Yuhongze Zhou, Xiaonan Qi, Junjie Hu, Tin Lun Lam, Yangsheng Xu. CAAI Transactions on Intelligence Technology, 2025, No. 2, pp. 359-374 (16 pages).
Audio-visual scene classification (AVSC) poses a formidable challenge owing to the intricate spatial-temporal relationships exhibited by audio-visual signals, coupled with the complex spatial patterns of objects and textures found in visual images. Recent studies have focused predominantly on extracting features from diverse neural network structures, inadvertently neglecting the acquisition of semantically meaningful regions and crucial components within audio-visual data. The authors present a feature pyramid attention network (FPANet) for audio-visual scene understanding, which extracts semantically significant characteristics from audio-visual data. The authors' approach builds multi-scale hierarchical features of sound spectrograms and visual images using a feature pyramid representation and localises the semantically relevant regions with a feature pyramid attention module (FPAM). A dimension alignment (DA) strategy is employed to align feature maps from multiple layers, a pyramid spatial attention (PSA) to spatially locate essential regions, and a pyramid channel attention (PCA) to pinpoint significant temporal frames. Experiments on visual scene classification (VSC), audio scene classification (ASC), and AVSC tasks demonstrate that FPANet achieves performance on par with state-of-the-art (SOTA) approaches, with a 95.9 F1-score on the ADVANCE dataset and a relative improvement of 28.8%. Visualisation results show that FPANet can prioritise semantically meaningful areas in audio-visual signals.
Keywords: dimension alignment; feature pyramid attention network; pyramid channel attention; pyramid spatial attention; semantic relevant regions
2. Double Self-Attention Based Fully Connected Feature Pyramid Network for Field Crop Pest Detection
Authors: Zijun Gao, Zheyi Li, Chunqi Zhang, Ying Wang, Jingwen Su. Computers, Materials & Continua, 2025, No. 6, pp. 4353-4371 (19 pages).
Pest detection techniques are helpful in reducing the frequency and scale of pest outbreaks; however, their application in the actual agricultural production process is still challenging owing to the problems of interspecies similarity, multi-scale, and background complexity of pests. To address these problems, this study proposes an FD-YOLO pest target detection model. The FD-YOLO model uses a Fully Connected Feature Pyramid Network (FC-FPN) instead of a PANet in the neck, which can adaptively fuse multi-scale information so that the model can retain small-scale target features in the deep layer, enhance large-scale target features in the shallow layer, and enhance the multiplexing of effective features. A dual self-attention module (DSA) is then embedded in the C3 module of the neck, which captures the dependencies between the information in both spatial and channel dimensions, effectively enhancing global features. We selected 16 types of pests that widely damage field crops in the IP102 pest dataset, which were used as our dataset after data supplementation and enhancement. The experimental results showed that FD-YOLO's mAP@0.5 improved by 6.8% compared to YOLOv5, reaching 82.6%, and was 19.1%–5% better than other state-of-the-art models. This method provides an effective new approach for detecting similar or multiscale pests in field crops.
Keywords: Pest detection; YOLOv5; feature pyramid network; transformer; attention module
3. Hyperspectral Satellite Image Classification Based on Feature Pyramid Networks With 3D Convolution
Authors: CHEN Cheng, PENG Pan, TAO Wei, ZHAO Hui. Journal of Shanghai Jiaotong University (Science), 2025, No. 6, pp. 1073-1084 (12 pages).
Recent advances in convolutional neural networks (CNNs) have fostered progress in object recognition and semantic segmentation, which in turn has improved the performance of hyperspectral image (HSI) classification. Nevertheless, the difficulty of high-dimensional feature extraction and the shortage of training samples seriously hinder the future development of HSI classification. In this paper, we propose a novel algorithm for HSI classification based on a three-dimensional (3D) CNN and a feature pyramid network (FPN), called 3D-FPN. The framework contains a principal component analysis, a feature extraction structure and a logistic regression. Specifically, the FPN built with 3D convolutions not only retains the advantages of 3D convolution to fully extract the spectral-spatial feature maps, but also concentrates on more detailed information and performs multi-scale feature fusion. This method avoids excessive model complexity and is suitable for small-sample hyperspectral classification with varying categories and spatial resolutions. To test the performance of the proposed 3D-FPN method, rigorous experimental analysis was performed on three public hyperspectral data sets and hyperspectral data from the GF-5 satellite. Quantitative and qualitative results indicated that the proposed method attained the best performance among current state-of-the-art end-to-end deep learning-based methods.
Keywords: hyperspectral image (HSI); deep learning; feature pyramid network (FPN); spectral-spatial feature extraction
4. Optimized Convolutional Neural Networks with Multi-Scale Pyramid Feature Integration for Efficient Traffic Light Detection in Intelligent Transportation Systems
Authors: Yahia Said, Yahya Alassaf, Refka Ghodhbani, Taoufik Saidani, Olfa Ben Rhaiem. Computers, Materials & Continua, 2025, No. 2, pp. 3005-3018 (14 pages).
Transportation systems are experiencing a significant transformation due to the integration of advanced technologies, including artificial intelligence and machine learning. In the context of intelligent transportation systems (ITS) and Advanced Driver Assistance Systems (ADAS), the development of efficient and reliable traffic light detection mechanisms is crucial for enhancing road safety and traffic management. This paper presents an optimized convolutional neural network (CNN) framework designed to detect traffic lights in real-time within complex urban environments. Leveraging multi-scale pyramid feature maps, the proposed model addresses key challenges such as the detection of small, occluded, and low-resolution traffic lights amidst complex backgrounds. The integration of dilated convolutions, Region of Interest (ROI) alignment, and Soft Non-Maximum Suppression (Soft-NMS) further improves detection accuracy and reduces false positives. By optimizing computational efficiency and parameter complexity, the framework is designed to operate seamlessly on embedded systems, ensuring robust performance in real-world applications. Extensive experiments using real-world datasets demonstrate that our model significantly outperforms existing methods, providing a scalable solution for ITS and ADAS applications. This research contributes to the advancement of Artificial Intelligence-driven (AI-driven) pattern recognition in transportation systems and offers a mathematical approach to improving efficiency and safety in logistics and transportation networks.
Keywords: Intelligent transportation systems (ITS); traffic light detection; multi-scale pyramid feature maps; advanced driver assistance systems (ADAS); real-time detection; AI in transportation
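The abstract of record 4 names Soft-NMS without describing it. As a rough illustration only, the sketch below shows the commonly used Gaussian variant, in which overlapping boxes have their scores decayed rather than being removed outright. The function names, the `sigma` value and the score threshold are illustrative assumptions and are not taken from the paper.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS (hypothetical helper, not the paper's code).

    Returns a list of (box_index, final_score) in selection order.
    """
    remaining = list(enumerate(scores))
    kept = []
    while remaining:
        # Select the highest-scoring remaining box.
        best = max(remaining, key=lambda item: item[1])
        remaining.remove(best)
        kept.append(best)
        # Decay every other score by exp(-IoU^2 / sigma); drop tiny scores.
        decayed = []
        for idx, score in remaining:
            overlap = iou(boxes[best[0]], boxes[idx])
            new_score = score * math.exp(-(overlap ** 2) / sigma)
            if new_score >= score_threshold:
                decayed.append((idx, new_score))
        remaining = decayed
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
result = soft_nms(boxes, scores)
```

In this toy input the second box overlaps the first heavily, so its score is decayed and it is selected last, while the non-overlapping third box keeps its score unchanged.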
5. Multi-scale object detection by top-down and bottom-up feature pyramid network (cited by: 14)
Authors: ZHAO Baojun, ZHAO Boya, TANG Linbo, WANG Wenzheng, WU Chen. Journal of Systems Engineering and Electronics (indexed in SCIE, EI, CSCD), 2019, No. 1, pp. 1-12 (12 pages).
While moving ahead with the object detection technology, especially deep neural networks, many related tasks, such as medical application and industrial automation, have achieved great success. However, the detection of objects with multiple aspect ratios and scales is still a key problem. This paper proposes a top-down and bottom-up feature pyramid network (TDBU-FPN), which combines multi-scale feature representation and anchor generation at multiple aspect ratios. First, in order to build the multi-scale feature map, this paper puts a number of fully convolutional layers after the backbone. Second, to link neighboring feature maps, top-down and bottom-up flows are adopted to introduce context information via top-down flow and supplement suboriginal information via bottom-up flow. The top-down flow refers to the deconvolution procedure, and the bottom-up flow refers to the pooling procedure. Third, the problem of adapting different object aspect ratios is tackled via many anchor shapes with different aspect ratios on each multi-scale feature map. The proposed method is evaluated on the pattern analysis, statistical modeling and computational learning visual object classes (PASCAL VOC) dataset and reaches an accuracy of 79%, which exhibits a 1.8% improvement with a detection speed of 23 fps.
Keywords: convolutional neural network (CNN); feature pyramid network (FPN); object detection; deconvolution
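The top-down and bottom-up flows described in record 5 can be pictured with a toy example. This is a minimal sketch under stated assumptions: nearest-neighbour upsampling stands in for the learned deconvolution, 2x2 max pooling stands in for the pooling step, and fusion is plain element-wise addition on single-channel maps. None of these choices, nor the helper names, are confirmed by the abstract.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2D list (stand-in for deconvolution)."""
    out = []
    for row in fmap:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def maxpool2x(fmap):
    """2x2 max pooling with stride 2; assumes even height and width."""
    return [
        [
            max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
            for c in range(0, len(fmap[0]), 2)
        ]
        for r in range(0, len(fmap), 2)
    ]


def add_maps(a, b):
    """Element-wise sum of two equally sized 2D lists."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


# A fine (4x4) and a coarse (2x2) single-channel feature map.
fine = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
coarse = [[10, 20], [30, 40]]

# Top-down flow: bring coarse context to the fine resolution.
top_down = add_maps(fine, upsample2x(coarse))
# Bottom-up flow: bring fine detail to the coarse resolution.
bottom_up = add_maps(coarse, maxpool2x(fine))
```

Real detectors operate on multi-channel tensors and use learned layers for both flows; the point here is only the direction in which information moves between neighboring pyramid levels.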
6. Bidirectional parallel multi-branch convolution feature pyramid network for target detection in aerial images of swarm UAVs (cited by: 4)
Authors: Lei Fu, Wen-bin Gu, Wei Li, Liang Chen, Yong-bao Ai, Hua-lei Wang. Defence Technology (防务技术) (indexed in SCIE, EI, CAS, CSCD), 2021, No. 4, pp. 1531-1541 (11 pages).
In this paper, based on a bidirectional parallel multi-branch feature pyramid network (BPMFPN), a novel one-stage object detector called BPMFPN Det is proposed for real-time detection of ground multi-scale targets by swarm unmanned aerial vehicles (UAVs). First, the bidirectional parallel multi-branch convolution modules are used to construct the feature pyramid to enhance the feature expression abilities of different scale feature layers. Next, the feature pyramid is integrated into the single-stage object detection framework to ensure real-time performance. In order to validate the effectiveness of the proposed algorithm, experiments are conducted on four datasets. For the PASCAL VOC dataset, the proposed algorithm achieves the mean average precision (mAP) of 85.4 on the VOC 2007 test set. With regard to the detection in optical remote sensing (DIOR) dataset, the proposed algorithm achieves 73.9 mAP. For the vehicle detection in aerial imagery (VEDAI) dataset, the detection accuracy of small land vehicle (slv) targets reaches 97.4 mAP. For the unmanned aerial vehicle detection and tracking (UAVDT) dataset, the proposed BPMFPN Det achieves the mAP of 48.75. Compared with the previous state-of-the-art methods, the results obtained by the proposed algorithm are more competitive. The experimental results demonstrate that the proposed algorithm can effectively solve the problem of real-time detection of ground multi-scale targets in aerial images of swarm UAVs.
Keywords: Aerial images; Object detection; feature pyramid networks; Multi-scale feature fusion; Swarm UAVs
7. Dual Attention Based Feature Pyramid Network (cited by: 5)
Authors: Huijun Xing, Shuai Wang, Dezhi Zheng, Xiaotong Zhao. China Communications (indexed in SCIE, CSCD), 2020, No. 8, pp. 242-252 (11 pages).
Object detection is an essential part of research on scenarios such as automatic driving and pedestrian detection. Among the many types of target objects, small-scale objects are especially challenging to identify. We introduce a new feature pyramid framework called the Dual Attention based Feature Pyramid Network (DAFPN), which is designed to address the difficulty of multi-scale object recognition. In DAFPN, the attention mechanism is introduced when computing the top-down pathway and the lateral pathway, where spatial attention and channel attention participate, respectively, so that the pyramidal feature maps are generated with enhanced spatial and channel interdependencies, which bring more semantic information to the feature pyramid. The experiments are conducted on the COCO data set, which contains a considerable quantity of small-scale objects. The results verify the improved performance of DAFPN compared with the original Feature Pyramid Network (FPN), specifically for small-scale identification. The proposed DAFPN is promising for object detection in an era full of intelligent machines that need to detect multi-scale objects.
Keywords: object detection; convolutional neural networks; feature pyramid
8. Neighborhood fusion-based hierarchical parallel feature pyramid network for object detection (cited by: 3)
Authors: Mo Lingfei, Hu Shuming. Journal of Southeast University (English Edition) (indexed in EI, CAS), 2020, No. 3, pp. 252-263 (12 pages).
In order to improve the detection accuracy of small objects, a neighborhood fusion-based hierarchical parallel feature pyramid network (NFPN) is proposed. Unlike the layer-by-layer structure adopted in the feature pyramid network (FPN) and deconvolutional single shot detector (DSSD), where the bottom layer of the feature pyramid network relies on the top layer, NFPN builds the feature pyramid network with no connections between the upper and lower layers. That is, it only fuses shallow features on similar scales. NFPN is highly portable and can be embedded in many models to further boost performance. Extensive experiments on PASCAL VOC 2007, 2012, and COCO datasets demonstrate that the NFPN-based SSD without intricate tricks can exceed the DSSD model in terms of detection accuracy and inference speed, especially for small objects, e.g., 4% to 5% higher mAP (mean average precision) than SSD, and 2% to 3% higher mAP than DSSD. On the VOC 2007 test set, the NFPN-based SSD with 300×300 input reaches 79.4% mAP at 34.6 frame/s, and the mAP rises to 82.9% with the multi-scale testing strategy.
Keywords: computer vision; deep convolutional neural network; object detection; hierarchical parallel; feature pyramid network; multi-scale feature fusion
9. An Improved Data-Driven Topology Optimization Method Using Feature Pyramid Networks with Physical Constraints (cited by: 1)
Authors: Jiaxiang Luo, Yu Li, Weien Zhou, Zhiqiang Gong, Zeyu Zhang, Wen Yao. Computer Modeling in Engineering & Sciences (indexed in SCIE, EI), 2021, No. 9, pp. 823-848 (26 pages).
Deep learning for topology optimization has been extensively studied to reduce the cost of calculation in recent years. However, the loss function of the above method is mainly based on pixel-wise errors from the image perspective, which cannot embed the physical knowledge of topology optimization. Therefore, this paper presents an improved deep learning model to alleviate the above difficulty effectively. The feature pyramid network (FPN), a kind of deep learning model, is trained to learn the inherent physical law of topology optimization itself, of which the loss function is composed of pixel-wise errors and physical constraints. Since the calculation of physical constraints requires finite element analysis (FEA) with high calculating costs, the strategy of adjusting the time when physical constraints are added is proposed to achieve the balance between the training cost and the training effect. Then, two classical topology optimization problems are investigated to verify the effectiveness of the proposed method. The results show that the developed model using a small number of samples can quickly obtain the optimization structure without any iteration, which has not only high pixel-wise accuracy but also good physical performance.
Keywords: Topology optimization; deep learning; feature pyramid networks; finite element analysis; physical constraints
10. Enhancing Classroom Behavior Recognition with Lightweight Multi-Scale Feature Fusion
Authors: Chuanchuan Wang, Ahmad Sufril Azlan Mohamed, Xiao Yang, Hao Zhang, Xiang Li, Mohd Halim Bin Mohd Noor. Computers, Materials & Continua, 2025, No. 10, pp. 855-874 (20 pages).
Classroom behavior recognition is an active research topic that plays a vital role in assessing and improving the quality of classroom teaching. However, existing classroom behavior recognition methods struggle to achieve high accuracy on datasets with problems such as blurred images and inconsistent objects. To address this challenge, we propose an effective, lightweight object detector called the RFNet model (YOLO-FR). Specifically, for efficient multi-scale feature extraction, an effective feature pyramid shared convolutional (FPSC) module was designed to improve feature extraction from the input image by leveraging convolutional layers with varying dilation rates in the backbone. Secondly, to address multi-scale variability in the scene, we design the Rep Ghost fusion Cross Stage Partial and Efficient Layer Aggregation Network (RGCSPELAN) to further improve network performance and reduce the amount of computation and the number of parameters. Experimental evaluation was conducted on the SCB dataset3 and STBD-08 datasets. The results indicate that, compared to the baseline model, the RFNet model increases mean average precision (mAP@50) from 69.6% to 71.0% on SCB dataset3 and from 91.8% to 93.1% on STBD-08. The RFNet approach reaches a precision of 68.6%, surpassing the baseline method (YOLOv11) by 3.3%, and achieves the smallest model size (4.9 M) on SCB dataset3. Finally, comparison with other algorithms shows that it accurately detects student behavior in complex classroom environments, confirming that RFNet is well suited to real-time, efficient recognition of classroom behaviors.
Keywords: Classroom action recognition; YOLO-FR; feature pyramid shared convolutional; rep ghost cross stage partial efficient layer aggregation network (RGCSPELAN)
11. Gender-Specific Multi-Task Micro-Expression Recognition Using Pyramid CGBP-TOP Feature
Authors: Chunlong Hu, Jianjun Chen, Xin Zuo, Haitao Zou, Xing Deng, Yucheng Shu. Computer Modeling in Engineering & Sciences (indexed in SCIE, EI), 2019, No. 3, pp. 547-559 (13 pages).
Micro-expression recognition has attracted growing research interest in the field of computer vision. However, a micro-expression usually lasts a few seconds, and is thus difficult to detect. This paper presents a new framework to recognize micro-expressions using a pyramid histogram of Centralized Gabor Binary Pattern from Three Orthogonal Panels (CGBP-TOP), which is an extension of the Local Gabor Binary Pattern from Three Orthogonal Panels feature. CGBP-TOP performs spatial and temporal analysis to capture the local facial characteristics of micro-expression image sequences. In order to keep more local information of the face, CGBP-TOP is extracted based on pyramid subregions of the micro-expression video frame. The combination of CGBP-TOP and the spatial pyramid can faithfully represent the facial movements of the micro-expression image sequences. However, the dimension of our pyramid CGBP-TOP tends to be very high, which may lead to a high data redundancy problem. In addition, people of different genders usually express micro-expressions in different ways. Therefore, in order to select the relevant features of micro-expression, the gender-specific sparse multi-task learning method with an adaptive regularization term is adopted to learn a compact subset of the pyramid CGBP-TOP feature for micro-expression classification of different sexes. Finally, extensive experiments on the widely used CASME II and SMIC databases demonstrate that our method can efficiently extract micro-expression motion features in the micro-expression video clip. Moreover, our proposed approach achieves results comparable with the state-of-the-art methods.
Keywords: Micro-expression recognition; feature extraction; spatial pyramid; multi-task learning; regularization
12. IMTNet: Improved Multi-Task Copy-Move Forgery Detection Network with Feature Decoupling and Multi-Feature Pyramid
Authors: Huan Wang, Hong Wang, Zhongyuan Jiang, Qing Qian, Yong Long. Computers, Materials & Continua (indexed in SCIE, EI), 2024, No. 9, pp. 4603-4620 (18 pages).
Copy-Move Forgery Detection (CMFD) is a technique that is designed to identify image tampering and locate suspicious areas. However, the practicality of CMFD is impeded by the scarcity of datasets, inadequate quality and quantity, and a narrow range of applicable tasks. These limitations significantly restrict the capacity and applicability of CMFD. To overcome the limitations of existing methods, a novel solution called IMTNet is proposed for CMFD by employing a feature decoupling approach. Firstly, this study formulates the objective task and network relationship as an optimization problem using transfer learning. Furthermore, it thoroughly discusses and analyzes the relationship between CMFD and deep network architecture by employing ResNet-50 during the optimization solving phase. Secondly, a quantitative comparison between fine-tuning and feature decoupling is conducted to evaluate the degree of similarity between the image classification and CMFD domains by the enhanced ResNet-50. Finally, suspicious regions are localized using a feature pyramid network with bottom-up path augmentation. Experimental results demonstrate that IMTNet achieves faster convergence, shorter training times, and favorable generalization performance compared to existing methods. Moreover, it is shown that IMTNet significantly outperforms fine-tuning based approaches in terms of accuracy and F1.
Keywords: Image copy-move detection; feature decoupling; multi-scale feature pyramids; passive forensics
13. Two-Layer Attention Feature Pyramid Network for Small Object Detection
Authors: Sheng Xiang, Junhao Ma, Qunli Shang, Xianbao Wang, Defu Chen. Computer Modeling in Engineering & Sciences (indexed in SCIE, EI), 2024, No. 10, pp. 713-731 (19 pages).
Effective small object detection is crucial in various applications including urban intelligent transportation and pedestrian detection. However, small objects are difficult to detect accurately because they contain less information. Many current methods, particularly those based on Feature Pyramid Network (FPN), address this challenge by leveraging multi-scale feature fusion. However, existing FPN-based methods often suffer from inadequate feature fusion due to varying resolutions across different layers, leading to suboptimal small object detection. To address this problem, we propose the Two-layer Attention Feature Pyramid Network (TA-FPN), featuring two key modules: the Two-layer Attention Module (TAM) and the Small Object Detail Enhancement Module (SODEM). TAM uses the attention module to make the network more focused on the semantic information of the object and fuse it to the lower layer, so that each layer contains similar semantic information, to alleviate the problem of small object information being submerged due to semantic gaps between different layers. At the same time, SODEM is introduced to strengthen the local features of the object, suppress background noise, enhance the information details of the small object, and fuse the enhanced features to other feature layers to ensure that each layer is rich in small object information, to improve small object detection accuracy. Our extensive experiments on challenging datasets such as Microsoft Common Objects in Context (MSCOCO) and Pattern Analysis Statistical Modelling and Computational Learning, Visual Object Classes (PASCAL VOC) demonstrate the validity of the proposed method. Experimental results show a significant improvement in small object detection accuracy compared to state-of-the-art detectors.
Keywords: Small object detection; two-layer attention module; small object detail enhancement module; feature pyramid network
14. Product Image Classification Based on Fusion Features
Authors: 杨晓慧, 刘静静, 杨利军. Chinese Quarterly Journal of Mathematics, 2015, No. 3, pp. 429-441 (13 pages).
Two key challenges raised by a product image classification system are classification precision and classification time. In some categories, the classification precision of the latest techniques in product image classification systems is still low. In this paper, we propose a local texture descriptor termed fan refined local binary pattern, which captures more detailed information by integrating the spatial distribution into the local binary pattern feature. We compare our approach with different methods on a subset of product images on Amazon/eBay and parts of PI100, and experimental results demonstrate that our proposed approach is superior to the existing methods. The highest classification precision is increased by 21% and the average classification time is reduced by 2/3.
Keywords: product image classification; fan refined local binary pattern (FRLBP); pyramid histogram of orientated gradients (PHOG); fusion features
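15. MRI image recognition model for spinal lesions based on a multi-attention mechanism
Authors: 周慧, 宋新景. 《计算机科学与探索》 (PKU Core), 2026, No. 1, pp. 291-300 (10 pages).
Manual detection of spinal lesions is time-consuming and depends heavily on domain experts, so automatic recognition of spinal lesions is highly necessary. However, spinal lesions vary widely in size, location, and structure, and spinal tumors closely resemble the rare disease brucellosis on imaging, so accurately localizing and classifying spinal lesions is challenging. To address these challenges, an improved MRI image recognition model for spinal lesions is proposed. A bidirectional feature pyramid backbone network based on ResNet-101 is introduced, and deformable convolution replaces conventional convolution in different layers to obtain more feature information from the feature layers. Multiple attention mechanisms, including self-attention and soft attention, are added to different modules to effectively fuse the parts of the features that contribute most. To overcome the data imbalance among spinal tumors, infectious lesions, and the rare disease brucellosis, an improved balanced cross-entropy loss function is introduced. Validated on a clinical dataset provided by a hospital in Dalian, the model achieves a recognition precision of 94.2% and a recognition recall of 90.8%. Comparative experiments with other recognition models show that the method achieves better recognition performance.
Keywords: spinal lesion recognition; bidirectional feature pyramid; multi-attention mechanism; deformable convolution; multi-feature fusion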
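16. Recognition of missed transplanting of rice seedlings based on an improved Faster R-CNN
Authors: 邹立雯, 梁春英, 胡军, 陈玉恒, 李圳鹏. 《中国农机化学报》 (PKU Core), 2026, No. 2, pp. 101-107 (7 pages).
Rice is a staple grain crop in China, and achieving high yield and quality is an inevitable trend. To address the low efficiency and high subjectivity of traditional manual seedling replenishment, a method for recognizing missed transplanting of rice seedlings based on an improved Faster R-CNN is proposed. Building on the Faster R-CNN model, the backbone network is replaced with the residual network ResNet50, combined with a feature pyramid network (FPN) to extract feature information; the bilinear interpolation of RoI Align is introduced to replace the coarse quantization of the RoI Pooling layer. The results show that the improved Faster R-CNN model achieves a recognition precision of 93.62% and a mean average precision (mAP@0.5) of 95.06%; compared with the unimproved model, recognition precision increases by 7.33% and mAP@0.5 increases by 4.6%. The model can improve the accuracy of rice seedling classification and of detecting positions missed by the transplanter, laying a solid foundation for planning seedling replenishment and providing data support for evaluating the quality of rice transplanters.
Keywords: rice seedlings; missed-transplanting recognition; feature pyramid; deep learning; residual network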
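Record 16 contrasts the bilinear interpolation used by RoI Align with the coarse quantization of RoI Pooling. The following is a minimal sketch of that one idea on a single-channel map. The helper names are hypothetical, and a full RoI Align layer would additionally divide each region into bins and average several such samples per bin.

```python
import math


def bilinear_sample(fmap, y, x):
    """Sample a 2D feature map at fractional (y, x) by bilinear interpolation."""
    height, width = len(fmap), len(fmap[0])
    # Clamp so that the four neighbours stay inside the map.
    y = min(max(y, 0.0), height - 1.0)
    x = min(max(x, 0.0), width - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, height - 1), min(x0 + 1, width - 1)
    dy, dx = y - y0, x - x0
    top = fmap[y0][x0] * (1 - dx) + fmap[y0][x1] * dx
    bottom = fmap[y1][x0] * (1 - dx) + fmap[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def quantized_sample(fmap, y, x):
    """RoI Pooling style lookup: truncate the coordinate to an integer cell."""
    return fmap[int(y)][int(x)]


fmap = [[0.0, 10.0], [20.0, 30.0]]
aligned = bilinear_sample(fmap, 0.5, 0.5)   # blends all four neighbours
coarse = quantized_sample(fmap, 0.5, 0.5)   # snaps to the top-left cell
```

At the fractional location (0.5, 0.5) the bilinear sample blends the four surrounding cells, whereas the quantized lookup discards the fractional offset, which is the misalignment RoI Align is meant to avoid.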
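17. Soybean main stem node recognition with an improved YOLOv8_obb
Authors: 杨彦旭, 李金阳, 石文强, 亓立强, 张伟. 《中国农机化学报》 (PKU Core), 2026, No. 1, pp. 79-86 (8 pages).
Soybean plant type has an important effect on soybean yield, and the number of main stem nodes is an important trait of plant type. To recognize and count main stem nodes under field conditions, taking soybeans from the Jiusan area of Heilongjiang Province as the research object, a soybean main stem node recognition method named YOLOv8_obb-AES, improved from the YOLOv8_obb model, is proposed to detect main stem nodes and obtain the number of main stem nodes. The improved model introduces an efficient attention mechanism module to reduce computation, replaces the path aggregation feature pyramid network in YOLOv8_obb with an asymptotic feature pyramid network to strengthen multi-scale fusion, and replaces the IoU loss function to speed up bounding-box regression and improve convergence speed. The results show that the YOLOv8_obb-AES algorithm reaches a mean average precision of 89.45% and a detection speed of 78.8 frames/ms for soybean main stem nodes in the field, improvements of 8.45% and 7.6 frames/ms over the original algorithm; the recognition accuracy for six different main stem node counts of Jiuyan 17 soybean plants is 85.4%, 84.5%, 87.6%, 85.2%, 81.6%, and 82.2%, respectively. The study provides technical support for exploring the relationship between soybean yield and the number of main stem nodes.
Keywords: soybean; main stem node; object recognition; asymptotic feature pyramid network; efficient attention mechanism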
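18. Endoscopic image enhancement algorithm based on cross-scale feature fusion
Authors: 刘旭阳, 蔡芸, 蒋林. 《现代电子技术》 (PKU Core), 2026, No. 1, pp. 34-40 (7 pages).
In clinical endoscopic imaging, uneven supplementary lighting and reflections from tissue mucus produce many low-quality images, such as overexposed ones. Existing deep-learning-based image enhancement algorithms use only fixed-size feature fusion, which leads to weak feature extraction and poor enhancement. To improve on this, an endoscopic image enhancement algorithm based on cross-scale feature fusion is constructed. A CM convolution module is built for high-performance feature extraction, an SPPF pyramid pooling module pools the feature maps at different scales, and cross-scale feature fusion (CFF) modules are introduced between network layers of different scales to achieve multi-scale feature fusion and context propagation, substantially improving detail capture and image quality. Experimental results show that the algorithm outperforms existing algorithms on both PSNR and SSIM, with PSNR improved by 9.9% and SSIM improved by 15.4%, enabling high-quality endoscopic image enhancement.
Keywords: endoscopic image; deep feature fusion; CFF; abnormal exposure; image enhancement algorithm; pyramid pooling module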
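19. FireLight-YOLO: a lightweight model for real-time forest fire monitoring
Authors: 李敏学, 张晓宇, 程英杰, 霍光煜, 许福. 《北京林业大学学报》 (PKU Core), 2026, No. 1, pp. 12-25 (14 pages).
[Objective] Frequent forest fires pose a severe challenge to ecological security, so building a lightweight, real-time intelligent monitoring system to improve ecological risk prevention and control is of practical importance. Existing fire detection methods are easily disturbed by the environment and struggle to balance model complexity with real-time performance; this study aims to develop an efficient lightweight detection model that can be trained from scratch without external pretrained weights. [Methods] A forest fire monitoring dataset of more than 10,000 high-quality images was first built and released as open source. On this basis, the FireLight-YOLO lightweight architecture was proposed based on YOLOv8: Ghost convolution is introduced to compress redundant computation; a FasterC2fBlock combining partial convolution and pointwise convolution is designed to form a T-shaped receptive field and strengthen perception of key regions; and the SPPF module is optimized into a feature pyramid shared convolution mechanism for efficient cross-scale feature fusion. The model was evaluated through cross-validation, independent testing, ablation experiments, and robustness tests under multiple noise scenarios. [Results] Without pretrained weights, FireLight-YOLO reaches an mAP@0.5 of 0.491 with only about 2.26×10^6 parameters and 5.9 GFLOPs of computation, striking an effective balance among accuracy, model size, and real-time performance. Compared with the original YOLOv8, computation is reduced by 2.2 GFLOPs, the parameter count is reduced by 25%, and inference speed is increased by 15%; the model also shows strong robustness in scenes with complex interference. [Conclusion] FireLight-YOLO achieves accurate forest fire detection under lightweight constraints. The study provides a low-cost, high-efficiency technical solution for intelligent forest fire monitoring, and its lightweight design markedly improves deployability on mobile terminals. The results can provide solid intelligent support for the protection and restoration of forest ecosystems.
Keywords: YOLOv8; Ghost convolution; forest fire detection; real-time object detection; lightweight model; feature pyramid shared convolution (FPSC); edge deployment; ecological security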
20. EHDC-YOLO: Enhancing Object Detection for UAV Imagery via Multi-Scale Edge and Detail Capture
Authors: Zhiyong Deng, Yanchen Ye, Jiangling Guo. Computers, Materials & Continua, 2026, No. 1, pp. 1665-1682 (18 pages).
With the rapid expansion of drone applications, accurate detection of objects in aerial imagery has become crucial for intelligent transportation, urban management, and emergency rescue missions. However, existing methods face numerous challenges in practical deployment, including scale variation handling, feature degradation, and complex backgrounds. To address these issues, we propose Edge-enhanced and Detail-Capturing You Only Look Once (EHDC-YOLO), a novel framework for object detection in Unmanned Aerial Vehicle (UAV) imagery. Based on the You Only Look Once version 11 nano (YOLOv11n) baseline, EHDC-YOLO systematically introduces several architectural enhancements: (1) a Multi-Scale Edge Enhancement (MSEE) module that leverages multi-scale pooling and edge information to enhance boundary feature extraction; (2) an Enhanced Feature Pyramid Network (EFPN) that integrates P2-level features with Cross Stage Partial (CSP) structures and OmniKernel convolutions for better fine-grained representation; and (3) Dynamic Head (DyHead) with multi-dimensional attention mechanisms for enhanced cross-scale modeling and perspective adaptability. Comprehensive experiments on the Vision meets Drones for Detection (VisDrone-DET) 2019 dataset demonstrate that EHDC-YOLO achieves significant improvements, increasing mean Average Precision (mAP)@0.5 from 33.2% to 46.1% (an absolute improvement of 12.9 percentage points) and mAP@0.5:0.95 from 19.5% to 28.0% (an absolute improvement of 8.5 percentage points) compared with the YOLOv11n baseline, while maintaining a reasonable parameter count (2.81 M vs the baseline's 2.58 M). Further ablation studies confirm the effectiveness of each proposed component, while visualization results highlight EHDC-YOLO's superior performance in detecting objects and handling occlusions in complex drone scenarios.
Keywords: UAV imagery; object detection; multi-scale feature fusion; edge enhancement; detail preservation; YOLO; feature pyramid network; attention mechanism