6,008 articles found
LP-YOLO: Enhanced Smoke and Fire Detection via Self-Attention and Feature Pyramid Integration
1
Authors: Qing Long, Bing Yi, Haiqiao Liu, Zhiling Peng, Xiang Liu 《Computers, Materials & Continua》 2026, No. 3, pp. 1490-1509 (20 pages)
Accurate detection of smoke and fire sources is critical for early fire warning and environmental monitoring. However, conventional detection approaches are highly susceptible to noise, illumination variations, and complex environmental conditions, which often reduce detection accuracy and real-time performance. To address these limitations, we propose Lightweight and Precise YOLO (LP-YOLO), a high-precision detection framework that integrates a self-attention mechanism with a feature pyramid, built upon YOLOv8. First, to overcome the restricted receptive field and parameter redundancy of conventional Convolutional Neural Networks (CNNs), we design an enhanced backbone based on Wavelet Convolutions (WTConv), which expands the receptive field through multifrequency convolutional processing. Second, a Bidirectional Feature Pyramid Network (BiFPN) is employed to achieve bidirectional feature fusion, enhancing the representation of smoke features across scales. Third, to mitigate the challenge of ambiguous object boundaries, we introduce the Frequency-aware Feature Fusion (FreqFusion) module, in which the Adaptive Low-Pass Filter (ALPF) reduces intra-class inconsistencies, the offset generator refines boundary localization, and the Adaptive High-Pass Filter (AHPF) recovers high-frequency details lost during down-sampling. Experimental evaluations demonstrate that LP-YOLO significantly outperforms the baseline YOLOv8, achieving an improvement of 9.3% in mAP@50 and 9.2% in F1-score. Moreover, the model is 56.6% and 32.4% smaller than YOLOv7-tiny and EfficientDet, respectively, while maintaining real-time inference speed at 238 frames per second (FPS). Validation on multiple benchmark datasets, including D-Fire, FIRESENSE, and BoWFire, further confirms its robustness and generalization ability, with detection accuracy consistently exceeding 82%. These results highlight the potential of LP-YOLO as a practical solution with high accuracy, robustness, and real-time performance for smoke and fire source detection.
Keywords: Deep learning; smoke detection; feature pyramid; boundary refinement
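The LP-YOLO abstract above names BiFPN as its fusion neck but does not say how features are combined. As a rough illustration only, the sketch below shows the "fast normalized fusion" rule from the original BiFPN design: each input gets a learnable weight, negative weights are clipped to zero, and the weighted sum is divided by the weight total. Whether LP-YOLO uses exactly this rule is an assumption; the function name, the flattened list representation, and the sample values are hypothetical.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-shaped feature vectors with non-negative, normalized weights."""
    if len(features) != len(weights):
        raise ValueError("one weight per input feature is required")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("features must already be resized to a common shape")
    # Clip weights at zero (ReLU) so every input contributes non-negatively.
    w = [max(0.0, float(x)) for x in weights]
    norm = sum(w) + eps  # eps avoids division by zero when all weights are 0
    return [
        sum(wi * f[k] for wi, f in zip(w, features)) / norm
        for k in range(length)
    ]


# Two flattened feature maps of equal size (hypothetical values).
p4_in = [1.0, 2.0, 3.0]
p5_up = [3.0, 2.0, 1.0]
fused = fast_normalized_fusion([p4_in, p5_up], weights=[1.0, 1.0])
```

With equal weights the result is close to the element-wise mean of the two inputs; in a trained network the weights would be learned per fusion node.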
Rapid improvement of seed weight and yield by combining QTL pyramiding with speed breeding in Brassica napus L.
2
Authors: Yixian Song, Xin Yuan, Pengfei Wang, Zhaoyang Wang, Lei Kang, Jing Yang, Guangsheng Yang, Dengfeng Hong 《Oil Crop Science》 2026, No. 1, pp. 83-91 (9 pages)
Thousand-seed weight (TSW) is a critical target for genetic improvement in rapeseed (Brassica napus L.). However, phenotypic selection for this trait remains challenging due to its polygenic regulation by multiple quantitative trait loci (QTL). Here, six favorable TSW QTL alleles from two donor parents were introgressed into an elite restorer line, 621R, using an integrated strategy combining marker-assisted backcrossing and speed breeding protocols. Through six rounds of backcrossing and convergent crossing followed by two generations of selfing, we developed 13 advanced lines with diverse TSW QTL combinations within 24 months. Field evaluations across three environments revealed that all lines exhibited significantly increased TSW in spring conditions (Minle, Gansu) and winter environments (Wuhan and Jiangling, Hubei), except for two lines that showed an increase only in the spring environment. Hybridization assays using these lines as male parents crossed with two male-sterile lines (RG430A and 616A) demonstrated transgressive segregation for TSW. For RG430A-derived hybrids, all crosses significantly outperformed the original control (RG430A×621R) in Wuhan, with 8/13 and 9/13 crosses showing significant TSW increases in Minle and Jiangling, respectively. For 616A-derived hybrids, 11/13 and 10/13 crosses exhibited significant TSW enhancement in Minle and Jiangling, compared to 3/13 in Wuhan. Notably, two top-performing hybrids achieved 13.0% and 6.8% higher plot yields, respectively. Our results demonstrate that strategic pyramiding of complementary TSW QTL alleles effectively enhances seed weight in rapeseed, and these improved lines represent valuable genetic resources for developing high-yield hybrids.
Keywords: Brassica napus L.; seed weight; QTL pyramiding; speed breeding
Feature pyramid attention network for audio-visual scene classification (cited by: 1)
3
Authors: Liguang Zhou, Yuhongze Zhou, Xiaonan Qi, Junjie Hu, Tin Lun Lam, Yangsheng Xu 《CAAI Transactions on Intelligence Technology》 2025, No. 2, pp. 359-374 (16 pages)
Audio-visual scene classification (AVSC) poses a formidable challenge owing to the intricate spatial-temporal relationships exhibited by audio-visual signals, coupled with the complex spatial patterns of objects and textures found in visual images. Recent studies have focused predominantly on extracting features from diverse neural network structures, inadvertently neglecting the acquisition of semantically meaningful regions and crucial components within audio-visual data. The authors present a feature pyramid attention network (FPANet) for audio-visual scene understanding, which extracts semantically significant characteristics from audio-visual data. The authors' approach builds multi-scale hierarchical features of sound spectrograms and visual images using a feature pyramid representation and localises the semantically relevant regions with a feature pyramid attention module (FPAM). A dimension alignment (DA) strategy is employed to align feature maps from multiple layers, a pyramid spatial attention (PSA) to spatially locate essential regions, and a pyramid channel attention (PCA) to pinpoint significant temporal frames. Experiments on visual scene classification (VSC), audio scene classification (ASC), and AVSC tasks demonstrate that FPANet achieves performance on par with state-of-the-art (SOTA) approaches, with a 95.9 F1-score on the ADVANCE dataset and a relative improvement of 28.8%. Visualisation results show that FPANet can prioritise semantically meaningful areas in audio-visual signals.
Keywords: dimension alignment; feature pyramid attention network; pyramid channel attention; pyramid spatial attention; semantic relevant regions
Tension-compression asymmetry of pyramidal dislocations in magnesium (cited by: 1)
4
Authors: Zikun Li, Chuanlong Xu, Xiaobao Tian, Wentao Jiang, Qingyuan Wang, Haidong Fan 《Journal of Magnesium and Alloys》 2025, No. 7, pp. 3198-3208 (11 pages)
Pyramidal dislocations are important for ductility enhancement of magnesium alloys. In this work, molecular dynamics simulations were employed to study the gliding behavior of pyramidal (c+a) dislocations under c-axis compressive loading and tensile loading. The Peierls stress of the Py-Ⅰ dislocation shows strong tension-compression asymmetry. However, no tension-compression asymmetry is seen for the Py-Ⅱ dislocation or the basal dislocation. The tension-compression asymmetry originates from the asymmetry of the partial dislocations of the Py-Ⅰ dislocation, which causes the dislocation core to contract under c-axis compressive loading and expand under tensile loading. By analyzing the forces acting on the partial dislocations, we defined a neutral direction, which deviates from the full dislocation Burgers vector by 70.3°. The neutral direction depends on the ratio of the lattice stresses of the partial dislocations. If the shear stress is applied along the neutral direction, the tension-compression asymmetry is eliminated and the dislocation core neither contracts nor expands. The neutral direction of symmetrical dislocations (Py-Ⅱ dislocation and basal dislocation) is just the full dislocation Burgers vector. The tension-compression asymmetry and dislocation core contraction/expansion have an important influence on dislocation behaviors such as cross-slip, decomposition, basal transition, and mobility, which can be used to explain the mechanical behavior of Mg single crystals compressed along the c-axis.
Keywords: Tension-compression asymmetry; pyramidal dislocation; magnesium; molecular dynamics
Optimized Convolutional Neural Networks with Multi-Scale Pyramid Feature Integration for Efficient Traffic Light Detection in Intelligent Transportation Systems (cited by: 1)
5
Authors: Yahia Said, Yahya Alassaf, Refka Ghodhbani, Taoufik Saidani, Olfa Ben Rhaiem 《Computers, Materials & Continua》 2025, No. 2, pp. 3005-3018 (14 pages)
Transportation systems are experiencing a significant transformation due to the integration of advanced technologies, including artificial intelligence and machine learning. In the context of intelligent transportation systems (ITS) and Advanced Driver Assistance Systems (ADAS), the development of efficient and reliable traffic light detection mechanisms is crucial for enhancing road safety and traffic management. This paper presents an optimized convolutional neural network (CNN) framework designed to detect traffic lights in real time within complex urban environments. Leveraging multi-scale pyramid feature maps, the proposed model addresses key challenges such as the detection of small, occluded, and low-resolution traffic lights amidst complex backgrounds. The integration of dilated convolutions, Region of Interest (ROI) alignment, and Soft Non-Maximum Suppression (Soft-NMS) further improves detection accuracy and reduces false positives. By optimizing computational efficiency and parameter complexity, the framework is designed to operate seamlessly on embedded systems, ensuring robust performance in real-world applications. Extensive experiments using real-world datasets demonstrate that our model significantly outperforms existing methods, providing a scalable solution for ITS and ADAS applications. This research contributes to the advancement of Artificial Intelligence-driven (AI-driven) pattern recognition in transportation systems and offers a mathematical approach to improving efficiency and safety in logistics and transportation networks.
Keywords: Intelligent transportation systems (ITS); traffic light detection; multi-scale pyramid feature maps; advanced driver assistance systems (ADAS); real-time detection; AI in transportation
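The traffic light abstract above mentions Soft-NMS without giving details. The following is a minimal sketch of the Gaussian variant of Soft-NMS, in which the scores of boxes overlapping the current best detection are decayed rather than the boxes being discarded outright. The values of `sigma` and `score_thresh`, the `(x1, y1, x2, y2)` box format, and the sample detections are assumptions for illustration, not settings taken from the paper.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: returns (box, score) pairs in selection order."""
    candidates = list(zip(scores, boxes))
    kept = []
    while candidates:
        # Pick the highest-scoring remaining detection.
        best = max(range(len(candidates)), key=lambda i: candidates[i][0])
        best_score, best_box = candidates.pop(best)
        kept.append((best_box, best_score))
        decayed = []
        for score, box in candidates:
            # Decay the score according to overlap with the selected box.
            score *= math.exp(-(iou(best_box, box) ** 2) / sigma)
            if score >= score_thresh:
                decayed.append((score, box))
        candidates = decayed
    return kept


# Hypothetical detections: two overlapping boxes and one isolated box.
detections = soft_nms(
    [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)],
    [0.9, 0.8, 0.7],
)
```

In this example the second box overlaps the top-scoring one, so its score is reduced and it is ranked last instead of being removed, which is the behavior that helps with partially occluded objects.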
Double Self-Attention Based Fully Connected Feature Pyramid Network for Field Crop Pest Detection
6
Authors: Zijun Gao, Zheyi Li, Chunqi Zhang, Ying Wang, Jingwen Su 《Computers, Materials & Continua》 2025, No. 6, pp. 4353-4371 (19 pages)
Pest detection techniques are helpful in reducing the frequency and scale of pest outbreaks; however, their application in the actual agricultural production process is still challenging owing to the problems of interspecies similarity, multi-scale, and background complexity of pests. To address these problems, this study proposes an FD-YOLO pest target detection model. The FD-YOLO model uses a Fully Connected Feature Pyramid Network (FC-FPN) instead of a PANet in the neck, which can adaptively fuse multi-scale information so that the model can retain small-scale target features in the deep layer, enhance large-scale target features in the shallow layer, and enhance the multiplexing of effective features. A dual self-attention module (DSA) is then embedded in the C3 module of the neck, which captures the dependencies between the information in both spatial and channel dimensions, effectively enhancing global features. We selected 16 types of pests that widely damage field crops in the IP102 pest dataset, which were used as our dataset after data supplementation and enhancement. The experimental results showed that FD-YOLO's mAP@0.5 improved by 6.8% compared to YOLOv5, reaching 82.6%, and was 19.1%–5% better than other state-of-the-art models. This method provides an effective new approach for detecting similar or multiscale pests in field crops.
Keywords: Pest detection; YOLOv5; feature pyramid network; transformer; attention module
Understanding pyramidal slip-induced deformation bands and dynamic recrystallization in AZWX3100 magnesium alloy
7
Authors: Risheng Pei, Fatim-Zahra Mouhib, Mattis Seehaus, Simon Arnoldi, Pei-Ling Sun, Talal Al-Samman 《Journal of Magnesium and Alloys》 2025, No. 3, pp. 1088-1098 (11 pages)
Dynamic recrystallization (DRX) in inhomogeneous deformation zones, such as grain boundaries, shear bands, and deformation bands, is critical for texture modification in magnesium alloys during deformation at elevated temperatures. This study investigates the DRX mechanisms in AZWX3100 magnesium alloy under plane strain compression at 200℃. Microstructural analysis revealed necklace-type DRX accompanied by evidence of local grain boundary bulging. Additionally, ribbons of recrystallized grains were observed within fine deformation bands, aligned with theoretical pyramidal I and II slip traces derived from the matrix. The distribution of local misorientation within the deformed microstructure demonstrated a clear association between deformation bands and localized strain. Dislocation analysis of lamellar specimens extracted from two pyramidal slip bands revealed <c+a> dislocations, indicating a connection between <c+a> slip activation and the formation of deformation bands. Crystal plasticity simulations suggest that the orientation of deformation bands is responsible for the unique recrystallization texture of the DRX grains within these bands. The texture characteristics imply a progressive, glide-induced DRX mechanism. A fundamental understanding of the role of deformation bands in texture modification can facilitate future alloy and process design.
Keywords: Magnesium; channel die; dynamic recrystallization; texture modification; pyramidal slip
Super-Resolution Generative Adversarial Network with Pyramid Attention Module for Face Generation
8
Authors: Parvathaneni Naga Srinivasu, G. JayaLakshmi, Sujatha Canavoy Narahari, Victor Hugo C. de Albuquerque, Muhammad Attique Khan, Hee-Chan Cho, Byoungchol Chang 《Computers, Materials & Continua》 2025, No. 10, pp. 2117-2139 (23 pages)
The generation of high-quality, realistic faces has emerged as a key field of research in computer vision. This paper proposes a robust approach that combines a Super-Resolution Generative Adversarial Network (SRGAN) with a Pyramid Attention Module (PAM) to enhance the quality of deep face generation. The SRGAN framework is designed to improve the resolution of generated images, addressing common challenges such as blurriness and a lack of intricate details. The Pyramid Attention Module further complements the process by focusing on multi-scale feature extraction, enabling the network to capture finer details and complex facial features more effectively. The proposed method was trained and evaluated over 100 epochs on the CelebA dataset, demonstrating consistent improvements in image quality and a marked decrease in generator and discriminator losses, reflecting the model's capacity to learn and synthesize high-quality images effectively, given adequate computational resources. Experimental results show that the SRGAN model with the PAM module yields an aggregate discriminator loss of 0.055 for real images and 0.043 for fake images, and a generator loss of 10.58, after training for 100 epochs. The model achieved a structural similarity index measure of 0.923, outperforming the other models considered in the current study.
Keywords: Artificial intelligence; generative adversarial network; pyramid attention module; face generation; deep learning
Pyramid–MixNet: Integrate Attention into Encoder-Decoder Transformer Framework for Automatic Railway Surface Damage Segmentation
9
Authors: Hui Luo, Wenqing Li, Wei Zeng 《Computers, Materials & Continua》 2025, No. 7, pp. 1567-1580 (14 pages)
Rail surface damage is a critical concern for high-speed railway infrastructure, directly affecting train operational stability and safety. Existing methods face limitations in accuracy and speed for small-sample, multi-category, and multi-scale target segmentation tasks. To address these challenges, this paper proposes Pyramid-MixNet, an intelligent segmentation model for high-speed rail surface damage, leveraging dataset construction and expansion alongside a feature pyramid-based encoder-decoder network with multi-attention mechanisms. The encoding network integrates Spatial Reduction Masked Multi-Head Attention (SRMMHA) to enhance global feature extraction while reducing trainable parameters. The decoding network incorporates Mix-Attention (MA), enabling multi-scale structural understanding and cross-scale token group correlation learning. Experimental results demonstrate that the proposed method achieves 62.17% average segmentation accuracy, 80.28% Damage Dice Coefficient, and 56.83 FPS, meeting real-time detection requirements. The model's high accuracy and scene adaptability significantly improve the detection of small-scale and complex multi-scale rail damage, offering practical value for real-time monitoring in high-speed railway maintenance systems.
Keywords: pyramid vision transformer; encoder–decoder architecture; railway damage segmentation; masked multi-head attention; mix-attention
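The Pyramid-MixNet abstract above reports a Dice coefficient for the damage class. For readers unfamiliar with the metric, the sketch below computes the standard Dice coefficient and the related IoU for a pair of binary masks. It assumes masks are given as equally sized nested lists of 0/1 and treats two empty masks as a perfect match; how the paper aggregates its scores across images and classes is not stated in the abstract, and the sample masks are hypothetical.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two equally sized binary masks."""
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1  # predicted damage, labelled damage
            elif p:
                fp += 1  # predicted damage, labelled background
            elif t:
                fn += 1  # missed damage pixel
    dice_denom = 2 * tp + fp + fn
    iou_denom = tp + fp + fn
    # Convention assumed here: two empty masks count as a perfect match.
    dice = 2 * tp / dice_denom if dice_denom else 1.0
    iou = tp / iou_denom if iou_denom else 1.0
    return dice, iou


# Hypothetical 2x3 prediction and ground-truth masks.
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
true_mask = [[1, 0, 0],
             [0, 1, 1]]
dice, iou_value = dice_and_iou(pred_mask, true_mask)
```

For a single pair of masks the two metrics are related by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU on the same prediction.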
Pyramid Pooling-Based Vision Transformer for Tool Condition Recognition
10
Authors: ZHENG Kun, LI Yonglin, GU Xinyan, DING Zhiying, ZHU Haihua 《Transactions of Nanjing University of Aeronautics and Astronautics》 2025, No. 3, pp. 322-336 (15 pages)
This study focuses on tool condition recognition through data-driven approaches to enhance the intelligence level of computerized numerical control (CNC) machining processes and improve tool utilization efficiency. Traditional tool monitoring methods that rely on empirical knowledge or limited mathematical models struggle to adapt to complex and dynamic machining environments. To address this, we implement real-time tool condition recognition by introducing deep learning technology. To address insufficient recognition accuracy, we propose a pyramid pooling-based vision Transformer network (P2ViT-Net) method for tool condition recognition. Using images as input effectively mitigates the issue of low-dimensional signal features. We enhance the vision Transformer (ViT) framework for image classification by developing the P2ViT model and adapting it to tool condition recognition. Experimental results demonstrate that our improved P2ViT model achieves 94.4% recognition accuracy, showing a 10% improvement over conventional ViT and outperforming all comparative convolutional neural network models.
Keywords: tool condition recognition; Transformer; pyramid pooling; deep convolutional neural network
Scalable and Passive Concentrator Photovoltaics Using a Multi-Focal Pyramidal Array: A Multi-Physics Modeling Approach
11
Authors: Mussad Mohammed Al-Zahrani, Taher Maatallah 《Frontiers in Heat and Mass Transfer》 2025, No. 6, pp. 1883-1905 (23 pages)
Conventional concentrator photovoltaics (CPV) face a persistent trade-off between high efficiency and high cost, driven by expensive multi-junction solar cells and complex active cooling systems. This study presents a computational investigation of a novel Multi-Focal Pyramidal Array (MFPA)-based CPV system designed to overcome this limitation. The MFPA architecture employs a geometrically optimized pyramidal concentrator to distribute concentrated sunlight onto strategically placed, low-cost monocrystalline silicon cells, enabling high-efficiency energy capture while passively managing thermal loads. Coupled optical-thermal-electrical simulations in COMSOL Multiphysics demonstrate a geometric concentration ratio of 120×, with system temperatures maintained below 110℃ under standard 1000 W/m² Direct Normal Irradiance (DNI). Ray tracing confirms 95% optical efficiency and a concentrated light spot radius of 2.48 mm. Compared with conventional CPV designs, the MFPA improves power-per-cost by 25% and reduces tracking requirements by 50% owing to its wide ±15° acceptance angle. These results highlight the MFPA's potential as a scalable, low-cost, and energy-efficient pathway for expanding solar power generation.
Keywords: Concentrating photovoltaic (CPV); multi-focal pyramidal array (MFPA); multi-physics simulation; optical-thermal coupling; geometric concentration; solar energy conversion
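To put the figures in the MFPA abstract above into perspective, the short calculation below combines the reported DNI, geometric concentration ratio, and optical efficiency into a mean irradiance at the receiver, and converts the reported spot radius into an area. This is a back-of-the-envelope sketch that assumes the concentrated flux is spread uniformly; the multi-focal design distributes light over several cells, so the actual local flux is likely to differ. The function name is hypothetical.

```python
import math


def mean_receiver_irradiance(dni_w_m2, concentration, optical_efficiency):
    """Mean irradiance on the receiver, assuming uniformly distributed flux."""
    return dni_w_m2 * concentration * optical_efficiency


# Values quoted in the abstract: 1000 W/m^2 DNI, 120x, 95% optical efficiency.
flux = mean_receiver_irradiance(1000.0, 120.0, 0.95)  # about 114 kW/m^2

# Area of a circular light spot with the reported 2.48 mm radius.
spot_area_mm2 = math.pi * 2.48 ** 2  # about 19.3 mm^2
```

A mean flux on the order of 100 kW/m² gives a sense of why the thermal side of the simulation matters for a passively cooled design.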
A Coarse to Fine Thin Cloud Removal Network with Pyramid Non-local Attention
12
Authors: GUAN Wang, TIAN Zhenkai, MA Tao, ZHAO Lingyuan, XIE Shizhe, YAN Jin, DU Yang, ZOU Yunkun 《Transactions of Nanjing University of Aeronautics and Astronautics》 2025, No. 5, pp. 589-600 (12 pages)
In remote sensing imagery, approximately 67% of the data are affected by cloud cover, which significantly increases the difficulty of image classification, recognition, and other downstream interpretation tasks. To address the random distribution and non-uniform thickness of clouds, we propose a coarse-to-fine thin cloud removal architecture. In the coarse-level declouding network, we introduce a multi-scale attention mechanism, pyramid non-local attention (PNA). By integrating global context with local detail information, it specifically addresses image quality degradation caused by the uncertainty in cloud distribution. During the fine-level declouding stage, we focus on the impact of cloud thickness on declouding results (primarily manifested as insufficient detail information). Through a carefully designed residual dense module, we significantly enhance the extraction and utilization of feature details. Thus, our approach restores lost local texture features on top of the coarse-level results, substantially improving declouding quality. To evaluate the effectiveness of our cloud removal technology and attention mechanism, we conducted comprehensive analyses on publicly available datasets. Results demonstrate that our method achieves state-of-the-art performance compared with a wide range of techniques.
Keywords: channel attention; thin cloud removal network; pyramid non-local attention (PNA); remote sensing image; residual dense connection
Direct Hippocampal and Thalamic Inputs to Layer 3 Pyramidal Cells in the Medial Entorhinal Cortex Revealed by Monosynaptic Rabies Tracing
13
Authors: Ze Chen, Dietmar Schmitz, John J. Tukker 《Neuroscience Bulletin》 2025, No. 4, pp. 707-712 (6 pages)
Dear Editor, The importance of the medial entorhinal cortex (MEC) for memory and spatial navigation has been shown repeatedly in many species, including mice and humans [1,2]. It is, therefore, not surprising that the connectivity of this structure has been studied extensively over the past century, mainly using a range of anterograde and retrograde anatomical tracers [3].
Keywords: medial entorhinal cortex; MEC; hippocampal; thalamic; layer; pyramidal cells; connectivity; structure; spatial navigation; anterograde; retrograde; anatomical tracers
Hyperspectral Satellite Image Classification Based on Feature Pyramid Networks With 3D Convolution
14
Authors: CHEN Cheng, PENG Pan, TAO Wei, ZHAO Hui 《Journal of Shanghai Jiaotong University (Science)》 2025, No. 6, pp. 1073-1084 (12 pages)
Recent advances in convolutional neural networks (CNNs) have fostered progress in object recognition and semantic segmentation, which in turn has improved the performance of hyperspectral image (HSI) classification. Nevertheless, the difficulty of high-dimensional feature extraction and the shortage of training samples seriously hinder the further development of HSI classification. In this paper, we propose a novel algorithm for HSI classification based on a three-dimensional (3D) CNN and a feature pyramid network (FPN), called 3D-FPN. The framework contains a principal component analysis, a feature extraction structure, and a logistic regression. Specifically, the FPN built with 3D convolutions not only retains the advantages of 3D convolution to fully extract the spectral-spatial feature maps, but also concentrates on more detailed information and performs multi-scale feature fusion. This method avoids excessive model complexity and is suitable for small-sample hyperspectral classification with varying categories and spatial resolutions. To test the performance of the proposed 3D-FPN method, rigorous experimental analysis was performed on three public hyperspectral data sets and hyperspectral data from the GF-5 satellite. Quantitative and qualitative results indicated that the proposed method attained the best performance among current state-of-the-art end-to-end deep learning-based methods.
Keywords: hyperspectral image (HSI); deep learning; feature pyramid network (FPN); spectral-spatial feature extraction
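FEC-PVT: An Oracle Bone Drill-and-Chisel Image Segmentation Network Based on the PVT Architecture
15
Authors: Liu Guoqi (刘国奇), Li Wenge (李文格), Ru Linyuan (茹琳媛), Song Liming (宋黎明), Liu Jie (刘杰), Han Yanbiao (韩燕彪) 《Journal of Henan Normal University (Natural Science Edition)》 PKU Core, 2026, No. 1, pp. 8-16, I0003 (10 pages)
Long burial underground and weathering corrosion have left oracle bone fragments damaged and the boundaries of their drill-and-chisel marks blurred and hard to distinguish, which makes segmenting these marks highly challenging. Oracle bone drill-and-chisel images were systematically collected from oracle bone databases and published catalogues and then annotated. Based on this dataset, an oracle bone drill-and-chisel segmentation network with a Transformer encoder, FEC-PVT (feature extraction and connection pyramid vision transformer), is proposed. First, FEC-PVT uses the FE_C and FE_D modules to supplement low-level and high-level features, respectively, in order to obtain detail and global features. Second, the FCOM module uses cross-attention to let features from different layers interact and capture useful details. Finally, the FFDM module decodes layer by layer and integrates multi-level features, improving decoding accuracy and avoiding feature loss. Experiments show that the proposed FEC-PVT outperforms other methods, with an IoU 5.18% higher than that of the second-best method, DuAT.
Keywords: image segmentation; oracle bone drill-and-chisel marks; pyramid vision transformer; convolutional neural network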
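An Object Detection Algorithm for Field Labor Behavior Based on Improved Faster R-CNN-FPN
16
Authors: Zhou Yanqing (周艳青), Zou Mingxin (邹铭鑫), Jiang Xinhua (姜新华), Bai Jie (白洁), Ma Xuelei (马学磊) 《Journal of Inner Mongolia Agricultural University (Natural Science Edition)》 PKU Core, 2026, No. 1, pp. 77-86 (10 pages)
Detection of labor behavior suffers from low accuracy and missed detections, so an improved labor behavior detection model is proposed using Faster R-CNN and FPN. First, a feature pyramid network (FPN) is introduced into the Faster R-CNN framework to improve the detection of smaller targets. Then, to improve the model's generalization to targets of different scales, multi-scale (MS) training is added, and the content-aware reassembly of features (CARAFE) upsampling operator replaces the bilinear interpolation upsampling in the FPN, associating pixels over a large range. Finally, the improved Faster R-CNN-FPN detection model is trained and tested on the self-built dataset FWBD. The results show that: (1) compared with the YOLOv3 model, the improved labor behavior recognition algorithm achieves an mAP of 69.40%; (2) compared with the original models Faster, Faster-CARAFER, and Faster-MS, the improved model achieves the highest mAP, 71.05%, indicating that it can effectively detect field labor behavior and has practical value for agricultural production.
Keywords: field labor; behavior detection; Faster R-CNN; feature pyramid network; content-aware reassembly of features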
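Application of an Improved Pyramid Parsing Network to Loose Rock Identification in Mining Areas
17
Authors: Guo Li (郭梨), Wu Hao (吴昊), Gu Qinghua (顾清华), Huang Zhiqi (黄智奇), Li Zhaokang (李照康) 《Journal of Safety and Environment》 PKU Core, 2026, No. 4, pp. 1307-1315 (9 pages)
To address the technical difficulty of identifying loose rock and extracting its features in the complex environment of open-pit mines, a loose rock detection and three-dimensional reconstruction method based on a pyramid parsing network model improved with dynamic convolution is proposed. Because loose rock regions are distributed in complex patterns and the features of small targets are hard to capture, the pyramid parsing network model is optimized by introducing dynamic convolution and a channel attention mechanism to strengthen feature extraction and segmentation accuracy. Depth maps acquired with a ZED stereo camera are combined with the segmentation results to generate a three-dimensional point cloud model, and a convex hull algorithm is used to measure the geometric features and volume of loose rock accurately. Experiments on a self-built loose rock dataset show that the improved model reaches an mIoU of 81.94%, 5.49 percentage points higher than the original model; mDice and mPA rise to 90.90% and 89.98%, respectively; and the average recognition accuracies for volume, mass, shape factor, and initial height reach 87.65%, 85.55%, 82.38%, and 96.15%, respectively. The study shows that the improved pyramid parsing network model has excellent loose rock detection and geometric information extraction capability in complex mining environments, providing reliable technical support for loose rock removal and safety assessment in mining areas.
Keywords: safety engineering; deep learning; pyramid parsing network; loose rock detection; three-dimensional reconstruction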
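A Lightweight Pavement Crack Detection Method with Adaptive Features
18
Authors: Liu Yuanyuan (刘媛媛), Zhu Kai (朱凯), Gu Zhihui (顾志辉), Yue Meng (岳猛), Wang Jingzhi (王靖智), Zhu Lu (朱路) 《Optics and Precision Engineering》 PKU Core, 2026, No. 2, pp. 336-351 (16 pages)
Pavement cracks have complex shapes and are easily disturbed by the environment, and their detection suffers from an imbalance between accuracy and model size. To address these problems, this paper proposes a lightweight pavement crack detection method with adaptive features. First, based on the long, narrow, wide-spanning shape of cracks, a crack efficient attention mechanism is designed that compresses feature dimensions to capture long-range spatial dependencies. Second, a dynamic sampling pyramid is constructed to sample adaptively and extract target features, strengthening the representation of heterogeneous crack features. Then, the HGNet_GS lightweight backbone is improved and a lightweight detection head is proposed, which significantly reduces computational redundancy; the Powerful IoU loss function is adopted to solve the anchor box enlargement problem and speed up the convergence of small models. In addition, to verify generalization, a civil pavement defect dataset was built, containing 2985 images of pavement defects under different lighting conditions. Experimental results show that, compared with the baseline YOLOv8n, the parameter count and computation of the proposed model are reduced by 50% and 52%, respectively. On the self-built dataset, mAP50 and mAP95 improve by 5.4% and 4.1%, respectively; on the public RDD2022 dataset, mAP50 and mAP95 improve by 2.1% and 1.5%, respectively. The model has been deployed on edge devices and has completed engineering field tests, which verifies that it meets the engineering requirements of lightweight pavement crack detection and provides a technical solution for automated road maintenance.
Keywords: pavement crack detection; attention mechanism; lightweight; dynamic sampling pyramid; YOLOv8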
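Visual Navigation for a Peony Field Robot Based on MobileNetV4-DSFPN
19
Authors: Xu Shanyong (徐善永), Xing Xuejing (邢雪景), Cheng Junhui (程军辉), Zhang Junqing (张俊卿) 《Journal of Agricultural Mechanization Research》 PKU Core, 2026, No. 8, pp. 169-178 (10 pages)
Accurately segmenting the drivable area in the field and extracting the navigation line in real time are key to autonomous field operation of agricultural robots. To address the complex background of peony fields and the high computational complexity and poor real-time performance of existing semantic segmentation models, a lightweight MobileNetV4-DSFPN semantic segmentation model is proposed. An improved MobileNetV4 serves as an efficient encoder, substantially reducing the parameter count and computational complexity. The decoder is a lightweight feature pyramid network built on depthwise separable convolutions (DSFPN), which achieves efficient multi-scale feature fusion through lateral connections and transposed-convolution upsampling. From the segmentation result, the navigation line is generated through morphological optimization, robust boundary point detection, and adaptive polynomial fitting. Experiments show that data augmentation significantly improves prediction accuracy and generalization. Ablation experiments show that the improved MobileNetV4 encoder balances accuracy and efficiency better than other lightweight networks, and that the DSFPN decoder achieves accuracy close to that of a standard feature pyramid network while reducing the parameter count and computation by 38.5% and 29.3%, respectively. Compared with several segmentation models on a peony field path dataset, the proposed model achieves the best balance between accuracy and efficiency, with a mean class pixel accuracy (mPA) of 97.16% and a mean intersection over union (mIoU) of 94.11%. The mean lateral deviation of the navigation line is 0.231 pixels and the mean angular deviation is 0.842°. Deployed on an on-board computer, the algorithm reaches an average inference speed of 23.1 FPS, meeting the real-time and accuracy requirements of navigation.
Keywords: peony field robot; visual navigation; MobileNetV4; feature pyramid network; lightweight; semantic segmentation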
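A 3D Hand Pose Estimation Network Fusing Hand and Object Features
20
Authors: Jia Di (贾迪), Wang Jianchun (王建淳), Han Xuefeng (韩雪峰), Zhang Fan (张藩), Wang Xiao (王骁) 《Journal of Image and Graphics》 PKU Core, 2026, No. 2, pp. 628-641 (14 pages)
Objective: Hand pose estimation is a core perception technology for human-computer interaction. In complex interaction scenes it faces challenges such as channel information loss during multi-scale feature fusion and occlusion during hand-object interaction. Existing methods mostly rely on a single feature extraction strategy or a static attention mechanism and cannot achieve both accuracy and robustness. To this end, a 3D hand pose estimation network that fuses hand and object features, the hand object collaborative enhancement network (HOCEN), is proposed to improve hand pose estimation under occlusion through multi-level cross-modal interaction optimization. Method: A dual-stream feature pyramid is designed that captures local details and global semantic dependencies through bidirectional cross-scale information aggregation, alleviating the channel information loss of conventional feature pyramids. A dynamic adjustment module based on external attention assigns dynamic attention weights to the extracted hand features and suppresses noise. A dual-stream collaborative attention mechanism combines hand-object geometric constraints with semantic complementarity to strengthen cross-modal feature alignment. A hierarchical feature decoder reconstructs accurate hand pose parameters. Results: Experiments on the Dex-YCB (dexterous-YCB) and HO3D (hand-object 3D) datasets show that the proposed method locates hand joints under occlusion more accurately than current mainstream models. On Dex-YCB, the hand pose estimation metrics MPJPE (mean per joint position error) and PA-MPJPE (Procrustes-aligned mean per joint position error) reach 12.4 mm and 5.4 mm, respectively, both better than advanced models such as SemGCN (semantic graph convolutional network) and HFL (harmonious feature learning). On HO3D, the Join and Mesh metrics reach 9.2 mm and 9.1 mm, respectively, a very low error. Ablation experiments on both datasets further demonstrate the independent contribution and joint gain of each module on the hand-object collaborative estimation metrics. Conclusion: A 3D hand pose estimation network architecture based on dynamic hand-object interaction feature fusion is proposed, and its cross-modal collaborative feature modeling effectively improves pose estimation accuracy. The experimental results show that the method is robust and generalizes well in complex interaction scenes, and that the proposed dynamic feature calibration and hand-object collaboration strategy offers a new solution for hand pose estimation under occlusion.
Keywords: hand pose estimation; dual-stream pyramid feature fusion; dynamic attention mechanism; dynamic feature adjustment; hand-object collaborative enhancement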
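The hand pose abstract above reports its accuracy as MPJPE in millimetres. As a minimal sketch of what that metric measures, the function below averages the Euclidean distance between predicted and ground-truth 3D joint positions. The two-joint example is hypothetical, and the Procrustes-aligned variant (PA-MPJPE), which first rigidly aligns the prediction to the ground truth, is not shown.

```python
import math


def mpjpe(pred_joints, gt_joints):
    """Mean per joint position error: average Euclidean distance over joints."""
    if not pred_joints or len(pred_joints) != len(gt_joints):
        raise ValueError("need the same non-zero number of joints in both sets")
    total = sum(math.dist(p, g) for p, g in zip(pred_joints, gt_joints))
    return total / len(pred_joints)


# Hypothetical joints in millimetres: one exact, one off by a 3-4-5 triangle.
predicted = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
ground_truth = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
error_mm = mpjpe(predicted, ground_truth)
```

Because the distances are averaged per joint, a single badly placed fingertip raises the score in proportion to its error rather than dominating it.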