5,903 articles found
Feature pyramid attention network for audio-visual scene classification (Cited by: 1)
1
Authors: Liguang Zhou, Yuhongze Zhou, Xiaonan Qi, Junjie Hu, Tin Lun Lam, Yangsheng Xu. 《CAAI Transactions on Intelligence Technology》, 2025, No. 2, pp. 359-374 (16 pages)
Audio-visual scene classification (AVSC) poses a formidable challenge owing to the intricate spatial-temporal relationships exhibited by audio-visual signals, coupled with the complex spatial patterns of objects and textures found in visual images. Recent studies have focused predominantly on extracting features from diverse neural network structures, inadvertently neglecting the acquisition of semantically meaningful regions and crucial components within audio-visual data. The authors present a feature pyramid attention network (FPANet) for audio-visual scene understanding, which extracts semantically significant characteristics from audio-visual data. The authors' approach builds multi-scale hierarchical features of sound spectrograms and visual images using a feature pyramid representation and localises the semantically relevant regions with a feature pyramid attention module (FPAM). A dimension alignment (DA) strategy is employed to align feature maps from multiple layers, a pyramid spatial attention (PSA) to spatially locate essential regions, and a pyramid channel attention (PCA) to pinpoint significant temporal frames. Experiments on visual scene classification (VSC), audio scene classification (ASC), and AVSC tasks demonstrate that FPANet achieves performance on par with state-of-the-art (SOTA) approaches, with a 95.9 F1-score on the ADVANCE dataset and a relative improvement of 28.8%. Visualisation results show that FPANet can prioritise semantically meaningful areas in audio-visual signals.
Keywords: dimension alignment; feature pyramid attention network; pyramid channel attention; pyramid spatial attention; semantic relevant regions
Double Self-Attention Based Fully Connected Feature Pyramid Network for Field Crop Pest Detection
2
Authors: Zijun Gao, Zheyi Li, Chunqi Zhang, Ying Wang, Jingwen Su. 《Computers, Materials & Continua》, 2025, No. 6, pp. 4353-4371 (19 pages)
Pest detection techniques are helpful in reducing the frequency and scale of pest outbreaks; however, their application in the actual agricultural production process is still challenging owing to the problems of interspecies similarity, multi-scale, and background complexity of pests. To address these problems, this study proposes an FD-YOLO pest target detection model. The FD-YOLO model uses a Fully Connected Feature Pyramid Network (FC-FPN) instead of a PANet in the neck, which can adaptively fuse multi-scale information so that the model can retain small-scale target features in the deep layer, enhance large-scale target features in the shallow layer, and enhance the multiplexing of effective features. A dual self-attention module (DSA) is then embedded in the C3 module of the neck, which captures the dependencies between the information in both spatial and channel dimensions, effectively enhancing global features. We selected 16 types of pests that widely damage field crops in the IP102 pest dataset, which were used as our dataset after data supplementation and enhancement. The experimental results showed that FD-YOLO's mAP@0.5 improved by 6.8% compared to YOLOv5, reaching 82.6%, and was 19.1%–5% better than other state-of-the-art models. This method provides an effective new approach for detecting similar or multiscale pests in field crops.
Keywords: pest detection; YOLOv5; feature pyramid network; transformer; attention module
Tension-compression asymmetry of pyramidal dislocations in magnesium
3
Authors: Zikun Li, Chuanlong Xu, Xiaobao Tian, Wentao Jiang, Qingyuan Wang, Haidong Fan. 《Journal of Magnesium and Alloys》, 2025, No. 7, pp. 3198-3208 (11 pages)
Pyramidal dislocations are important for ductility enhancement of magnesium alloys. In this work, molecular dynamics simulations were employed to study the gliding behavior of pyramidal (c+a) dislocations under c-axis compressive loading and tensile loading. The Peierls stress of the Py-Ⅰ dislocation shows strong tension-compression asymmetry. However, no tension-compression asymmetry is seen for the Py-Ⅱ dislocation or the basal dislocation. The tension-compression asymmetry originates from the asymmetry of the partial dislocations of the Py-Ⅰ dislocation, which causes the dislocation core to contract under c-axis compressive loading and expand under tensile loading. By analyzing the forces acting on the partial dislocations, we defined a neutral direction, which deviates from the full dislocation Burgers vector by 70.3°. The neutral direction depends on the ratio of the lattice stresses of the partial dislocations. If the shear stress is applied along the neutral direction, the tension-compression asymmetry is eliminated and the dislocation core neither contracts nor expands. The neutral direction of symmetrical dislocations (Py-Ⅱ dislocation and basal dislocation) is simply the full dislocation Burgers vector. The tension-compression asymmetry and dislocation core contraction/expansion have an important influence on dislocation behaviors such as cross-slip, decomposition, basal transition and mobility, which can be used to explain the mechanical behaviors of Mg single crystals compressed along the c-axis.
Keywords: tension-compression asymmetry; pyramidal dislocation; magnesium; molecular dynamics
Understanding pyramidal slip-induced deformation bands and dynamic recrystallization in AZWX3100 magnesium alloy
4
Authors: Risheng Pei, Fatim-Zahra Mouhib, Mattis Seehaus, Simon Arnoldi, Pei-Ling Sun, Talal Al-Samman. 《Journal of Magnesium and Alloys》, 2025, No. 3, pp. 1088-1098 (11 pages)
Dynamic recrystallization (DRX) in inhomogeneous deformation zones, such as grain boundaries, shear bands, and deformation bands, is critical for texture modification in magnesium alloys during deformation at elevated temperatures. This study investigates the DRX mechanisms in AZWX3100 magnesium alloy under plane strain compression at 200 ℃. Microstructural analysis revealed necklace-type DRX accompanied by evidence of local grain boundary bulging. Additionally, ribbons of recrystallized grains were observed within fine deformation bands, aligned with theoretical pyramidal I and II slip traces derived from the matrix. The distribution of local misorientation within the deformed microstructure demonstrated a clear association between deformation bands and localized strain. Dislocation analysis of lamellar specimens extracted from two pyramidal slip bands revealed <c+a> dislocations, indicating a connection between <c+a> slip activation and the formation of deformation bands. Crystal plasticity simulations suggest that the orientation of deformation bands is responsible for the unique recrystallization texture of the DRX grains within these bands. The texture characteristics imply a progressive, glide-induced DRX mechanism. A fundamental understanding of the role of deformation bands in texture modification can facilitate future alloy and process design.
Keywords: magnesium; channel die; dynamic recrystallization; texture modification; pyramidal slip
Super-Resolution Generative Adversarial Network with Pyramid Attention Module for Face Generation
5
Authors: Parvathaneni Naga Srinivasu, G. JayaLakshmi, Sujatha Canavoy Narahari, Victor Hugo C. de Albuquerque, Muhammad Attique Khan, Hee-Chan Cho, Byoungchol Chang. 《Computers, Materials & Continua》, 2025, No. 10, pp. 2117-2139 (23 pages)
The generation of high-quality, realistic faces has emerged as a key field of research in computer vision. This paper proposes a robust approach that combines a Super-Resolution Generative Adversarial Network (SRGAN) with a Pyramid Attention Module (PAM) to enhance the quality of deep face generation. The SRGAN framework is designed to improve the resolution of generated images, addressing common challenges such as blurriness and a lack of intricate details. The Pyramid Attention Module complements the process by focusing on multi-scale feature extraction, enabling the network to capture finer details and complex facial features more effectively. The proposed method was trained and evaluated over 100 epochs on the CelebA dataset, demonstrating consistent improvements in image quality and a marked decrease in generator and discriminator losses, reflecting the model's capacity to learn and synthesize high-quality images effectively, given adequate computational resources. Experimental results show that the SRGAN model with the PAM yields an aggregate discriminator loss of 0.055 for real images and 0.043 for fake images, and a generator loss of 10.58, after training for 100 epochs. The model achieves a structural similarity index measure of 0.923, outperforming the other models considered in the study.
Keywords: artificial intelligence; generative adversarial network; pyramid attention module; face generation; deep learning
Pyramid–MixNet: Integrate Attention into Encoder-Decoder Transformer Framework for Automatic Railway Surface Damage Segmentation
6
Authors: Hui Luo, Wenqing Li, Wei Zeng. 《Computers, Materials & Continua》, 2025, No. 7, pp. 1567-1580 (14 pages)
Rail surface damage is a critical concern for high-speed railway infrastructure, directly affecting train operational stability and safety. Existing methods face limitations in accuracy and speed for small-sample, multi-category, and multi-scale target segmentation tasks. To address these challenges, this paper proposes Pyramid-MixNet, an intelligent segmentation model for high-speed rail surface damage, leveraging dataset construction and expansion alongside a feature pyramid-based encoder-decoder network with multi-attention mechanisms. The encoding network integrates Spatial Reduction Masked Multi-Head Attention (SRMMHA) to enhance global feature extraction while reducing trainable parameters. The decoding network incorporates Mix-Attention (MA), enabling multi-scale structural understanding and cross-scale token group correlation learning. Experimental results demonstrate that the proposed method achieves 62.17% average segmentation accuracy, 80.28% Damage Dice Coefficient, and 56.83 FPS, meeting real-time detection requirements. The model's high accuracy and scene adaptability significantly improve the detection of small-scale and complex multi-scale rail damage, offering practical value for real-time monitoring in high-speed railway maintenance systems.
Keywords: pyramid vision transformer; encoder–decoder architecture; railway damage segmentation; masked multi-head attention; mix-attention
Pyramid Pooling-Based Vision Transformer for Tool Condition Recognition
7
Authors: ZHENG Kun, LI Yonglin, GU Xinyan, DING Zhiying, ZHU Haihua. 《Transactions of Nanjing University of Aeronautics and Astronautics》, 2025, No. 3, pp. 322-336 (15 pages)
This study focuses on tool condition recognition through data-driven approaches to enhance the intelligence level of computerized numerical control (CNC) machining processes and improve tool utilization efficiency. Traditional tool monitoring methods that rely on empirical knowledge or limited mathematical models struggle to adapt to complex and dynamic machining environments. To address this, we implement real-time tool condition recognition by introducing deep learning technology. To address insufficient recognition accuracy, we propose a pyramid pooling-based vision Transformer network (P2ViT-Net) method for tool condition recognition. Using images as input effectively mitigates the issue of low-dimensional signal features. We enhance the vision Transformer (ViT) framework for image classification by developing the P2ViT model and adapt it to tool condition recognition. Experimental results demonstrate that our improved P2ViT model achieves 94.4% recognition accuracy, showing a 10% improvement over conventional ViT and outperforming all comparative convolutional neural network models.
Keywords: tool condition recognition; Transformer; pyramid pooling; deep convolutional neural network
Scalable and Passive Concentrator Photovoltaics Using a Multi-Focal Pyramidal Array: A Multi-Physics Modeling Approach
8
Authors: Mussad Mohammed Al-Zahrani, Taher Maatallah. 《Frontiers in Heat and Mass Transfer》, 2025, No. 6, pp. 1883-1905 (23 pages)
Conventional concentrator photovoltaics (CPV) face a persistent trade-off between high efficiency and high cost, driven by expensive multi-junction solar cells and complex active cooling systems. This study presents a computational investigation of a novel Multi-Focal Pyramidal Array (MFPA)-based CPV system designed to overcome this limitation. The MFPA architecture employs a geometrically optimized pyramidal concentrator to distribute concentrated sunlight onto strategically placed, low-cost monocrystalline silicon cells, enabling high-efficiency energy capture while passively managing thermal loads. Coupled optical-thermal-electrical simulations in COMSOL Multiphysics demonstrate a geometric concentration ratio of 120×, with system temperatures maintained below 110 ℃ under standard 1000 W/m² Direct Normal Irradiance (DNI). Ray tracing confirms 95% optical efficiency and a concentrated light spot radius of 2.48 mm. Compared with conventional CPV designs, the MFPA improves power-per-cost by 25% and reduces tracking requirements by 50% owing to its wide ±15° acceptance angle. These results highlight the MFPA's potential as a scalable, low-cost, and energy-efficient pathway for expanding solar power generation.
Keywords: concentrating photovoltaic (CPV); multi-focal pyramidal array (MFPA); multi-physics simulation; optical-thermal coupling; geometric concentration; solar energy conversion
A Coarse to Fine Thin Cloud Removal Network with Pyramid Non-local Attention
9
Authors: GUAN Wang, TIAN Zhenkai, MA Tao, ZHAO Lingyuan, XIE Shizhe, YAN Jin, DU Yang, ZOU Yunkun. 《Transactions of Nanjing University of Aeronautics and Astronautics》, 2025, No. 5, pp. 589-600 (12 pages)
In remote sensing imagery, approximately 67% of the data are affected by cloud cover, significantly increasing the difficulty of image classification, recognition, and other downstream interpretation tasks. To address the randomness of cloud distribution and the non-uniformity of cloud thickness, we propose a coarse-to-fine thin cloud removal architecture. In the coarse-level declouding network, we introduce a multi-scale attention mechanism, pyramid non-local attention (PNA). By integrating global context with local detail information, it specifically addresses image quality degradation caused by the uncertainty in cloud distribution. During the fine-level declouding stage, we focus on the impact of cloud thickness on declouding results (primarily manifested as insufficient detail information). Through a carefully designed residual dense module, we significantly enhance the extraction and utilization of feature details. Our approach thus restores lost local texture features on top of the coarse-level results, substantially improving declouding quality. To evaluate the effectiveness of our cloud removal technology and attention mechanism, we conducted comprehensive analyses on publicly available datasets. Results demonstrate that our method achieves state-of-the-art performance across a wide range of techniques.
Keywords: channel attention; thin cloud removal network; pyramid non-local attention (PNA); remote sensing image; residual dense connection
Direct Hippocampal and Thalamic Inputs to Layer 3 Pyramidal Cells in the Medial Entorhinal Cortex Revealed by Monosynaptic Rabies Tracing
10
Authors: Ze Chen, Dietmar Schmitz, John J. Tukker. 《Neuroscience Bulletin》, 2025, No. 4, pp. 707-712 (6 pages)
Dear Editor, The importance of the medial entorhinal cortex (MEC) for memory and spatial navigation has been shown repeatedly in many species, including mice and humans [1,2]. It is, therefore, not surprising that the connectivity of this structure has been studied extensively over the past century, mainly using a range of anterograde and retrograde anatomical tracers [3].
Keywords: medial entorhinal cortex; MEC; hippocampal; thalamic; layer; pyramidal cells; connectivity; structure; spatial navigation; anterograde; retrograde; anatomical tracers
Hyperspectral Satellite Image Classification Based on Feature Pyramid Networks With 3D Convolution
11
Authors: CHEN Cheng, PENG Pan, TAO Wei, ZHAO Hui. 《Journal of Shanghai Jiaotong University (Science)》, 2025, No. 6, pp. 1073-1084 (12 pages)
Recent advances in convolutional neural networks (CNNs) have fostered progress in object recognition and semantic segmentation, which in turn has improved the performance of hyperspectral image (HSI) classification. Nevertheless, the difficulty of high-dimensional feature extraction and the shortage of training samples seriously hinder the future development of HSI classification. In this paper, we propose a novel algorithm for HSI classification based on a three-dimensional (3D) CNN and a feature pyramid network (FPN), called 3D-FPN. The framework contains a principal component analysis, a feature extraction structure and a logistic regression. Specifically, the FPN built with 3D convolutions not only retains the advantages of 3D convolution to fully extract the spectral-spatial feature maps, but also concentrates on more detailed information and performs multi-scale feature fusion. This method avoids excessive model complexity and is suitable for small-sample hyperspectral classification with varying categories and spatial resolutions. To test the performance of the proposed 3D-FPN method, rigorous experimental analysis was performed on three public hyperspectral data sets and hyperspectral data from the GF-5 satellite. Quantitative and qualitative results indicated that the proposed method attained the best performance among current state-of-the-art end-to-end deep learning-based methods.
Keywords: hyperspectral image (HSI); deep learning; feature pyramid network (FPN); spectral-spatial feature extraction
Optimized Convolutional Neural Networks with Multi-Scale Pyramid Feature Integration for Efficient Traffic Light Detection in Intelligent Transportation Systems
12
Authors: Yahia Said, Yahya Alassaf, Refka Ghodhbani, Taoufik Saidani, Olfa Ben Rhaiem. 《Computers, Materials & Continua》, 2025, No. 2, pp. 3005-3018 (14 pages)
Transportation systems are experiencing a significant transformation due to the integration of advanced technologies, including artificial intelligence and machine learning. In the context of intelligent transportation systems (ITS) and Advanced Driver Assistance Systems (ADAS), the development of efficient and reliable traffic light detection mechanisms is crucial for enhancing road safety and traffic management. This paper presents an optimized convolutional neural network (CNN) framework designed to detect traffic lights in real-time within complex urban environments. Leveraging multi-scale pyramid feature maps, the proposed model addresses key challenges such as the detection of small, occluded, and low-resolution traffic lights amidst complex backgrounds. The integration of dilated convolutions, Region of Interest (ROI) alignment, and Soft Non-Maximum Suppression (Soft-NMS) further improves detection accuracy and reduces false positives. By optimizing computational efficiency and parameter complexity, the framework is designed to operate seamlessly on embedded systems, ensuring robust performance in real-world applications. Extensive experiments using real-world datasets demonstrate that our model significantly outperforms existing methods, providing a scalable solution for ITS and ADAS applications. This research contributes to the advancement of Artificial Intelligence-driven (AI-driven) pattern recognition in transportation systems and offers a mathematical approach to improving efficiency and safety in logistics and transportation networks.
Keywords: intelligent transportation systems (ITS); traffic light detection; multi-scale pyramid feature maps; advanced driver assistance systems (ADAS); real-time detection; AI in transportation
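FEC-PVT: An Oracle Bone Drill-and-Chisel Mark Image Segmentation Network Based on the PVT Architecture
13
Authors: 刘国奇, 李文格, 茹琳媛, 宋黎明, 刘杰, 韩燕彪. 《河南师范大学学报(自然科学版)》 [PKU Core], 2026, No. 1, pp. 8-16, I0003 (10 pages)
Long burial underground and weathering corrosion have left oracle bone fragments damaged and the boundaries of drill-and-chisel marks blurred and hard to distinguish, which makes segmenting these marks highly challenging. Images of oracle bone drill-and-chisel marks were systematically collected from oracle bone databases and published catalogues and then annotated. Based on this dataset, an oracle bone drill-and-chisel mark segmentation network with a Transformer encoder, FEC-PVT (feature extraction and connection pyramid vision transformer), is proposed. First, FEC-PVT uses the FE_C and FE_D modules to supplement low-level and high-level features, respectively, to obtain detail and global features. Second, the FCOM module uses cross-attention to let features from different layers interact and capture useful details. Finally, the FFDM module decodes layer by layer and integrates multi-level features, improving decoding accuracy and avoiding feature loss. Experiments show that the proposed FEC-PVT outperforms other methods, improving IoU by 5.18% over the second-best method, DuAT.
Keywords: image segmentation; oracle bone drill-and-chisel marks; pyramid vision transformer; convolutional neural network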
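A Lightweight Pavement Crack Detection Method with Adaptive Features
14
Authors: 刘媛媛, 朱凯, 顾志辉, 岳猛, 王靖智, 朱路. 《光学精密工程》 [PKU Core], 2026, No. 2, pp. 336-351 (16 pages)
To address the complex morphology of pavement cracks, their susceptibility to environmental interference, and the imbalance between accuracy and model size in detection, this paper proposes a lightweight pavement crack detection method with adaptive features. First, given that cracks are long, narrow, and span large distances, a crack-efficient attention mechanism is designed that compresses the feature dimension to capture long-range spatial dependencies. Second, a dynamic sampling pyramid is built to sample adaptively and extract target features, strengthening the representation of heterogeneous crack features. Then, the HGNet_GS lightweight backbone is improved and a lightweight detection head is proposed, markedly reducing computational redundancy; the Powerful IoU loss function is adopted to solve the anchor box enlargement problem and speed up the convergence of small models. In addition, to verify the model's generalization, a civil pavement defect dataset was built containing 2985 images of pavement defects under different lighting conditions. Experimental results show that, compared with the baseline YOLOv8n, the parameter count and computation of the proposed model are reduced by 50% and 52%, respectively. On the self-built dataset, mAP50 and mAP95 improve by 5.4% and 4.1%, respectively; on the public RDD2022 dataset, mAP50 and mAP95 improve by 2.1% and 1.5%, respectively. The model has been deployed on edge devices and has completed engineering field tests, confirming that it meets the engineering requirements of lightweight pavement crack detection and offers a technical solution for automated road maintenance.
Keywords: pavement crack detection; attention mechanism; lightweight; dynamic sampling pyramid; YOLOv8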
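A Multimodal-Fusion Open-Set Foreign Object Detection Method for Raw Coal Sorting Scenarios
15
Authors: 曹现刚, 刘航, 刘家辉, 吴旭东, 王鹏. 《煤炭科学技术》 [PKU Core], 2026, No. 1, pp. 464-474 (11 pages)
Raw coal sorting first requires identifying and picking out foreign objects such as large gangue, iron wire, and woven bags, to avoid affecting downstream processes or causing safety accidents. Current coal foreign object detection algorithms mainly target known objects and have limited ability to detect unknown targets, especially targets with complex appearance and uncertain semantics such as various rock bolts and new support materials; a detection model that handles both known and unknown foreign objects is urgently needed. A multimodal-fusion open-set detection method for coal foreign objects is proposed. First, based on the DINO network, a dual-modality feature extraction architecture for text and images is designed to obtain more class-discriminative text and visual features; a path aggregation feature pyramid network is introduced with a multi-layer feature extraction strategy that combines deep semantic features with shallow spatial details, strengthening the perception of small-scale coal foreign objects and improving detection accuracy. Second, a multimodal feature fusion module based on self-attention and cross-attention is built to achieve deep interaction and efficient fusion of text and visual features, and a language-guided query selection mechanism is introduced so that text descriptions of arbitrary categories correspond to visual queries, improving feature semantic consistency and cross-category generalization. Finally, a vision-text multimodal decoding module is designed that inserts a text guidance mechanism at each layer's query update stage, so that learnable queries align with language features before interacting with image features, improving the accuracy and robustness of multimodal feature alignment. An open dynamic environment with multiple category combinations was constructed from a self-built coal foreign object dataset, and systematic experiments were carried out. The results show that the proposed method achieves higher mAP@0.5 than the other compared methods on known-class detection across tasks with different degrees of openness; on unknown-class detection across tasks with different degrees of openness, the unknown-class recall reaches 41.24%, 52.26%, and 57.13%, respectively, verifying its effectiveness under zero-shot conditions. The method can detect coal foreign objects of unknown categories and provides effective technical support for open-set detection of coal foreign objects.
Keywords: coal foreign objects; multimodal fusion; open-set detection; feature pyramid; feature semantic consistency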
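A Visual SLAM Algorithm Based on Cross-Domain Mask Segmentation for Dynamic Scenes
16
Authors: 亢洁, 徐婷, 王佳乐, 郭进, 赫轩, 王沫, 夏宇. 《陕西科技大学学报》 [PKU Core], 2026, No. 1, pp. 178-185, 193 (9 pages)
To address the insufficient real-time performance of deep learning segmentation networks in visual SLAM (Simultaneous Localization and Mapping) systems in dynamic scenes, and the pose estimation bias caused by unintended camera motion, a visual SLAM algorithm based on cross-domain mask segmentation is proposed. The algorithm uses the lightweight YOLO-fastest network combined with background subtraction to detect moving objects, builds a cross-domain mask segmentation mechanism from the depth map combined with depth threshold segmentation, and designs a camera-motion geometric correction strategy to compensate for coordinate errors in the detection boxes, achieving moving object segmentation while improving processing speed. To improve feature point utilization, pyramid optical flow is used to track and update dynamic feature points continuously between frames, while ensuring that only static feature points take part in pose estimation. In a systematic evaluation on the TUM dataset, the absolute pose error is reduced by 97.1% on average compared with ORB-SLAM3. Compared with DynaSLAM and DS-SLAM, dynamic SLAM algorithms that use deep learning segmentation networks, the per-frame tracking time is greatly reduced, giving a better balance between accuracy and efficiency.
Keywords: visual SLAM; dynamic scenes; YOLO-Fastest; pyramid optical flow; depth threshold segmentation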
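An MRI Image Recognition Model for Spinal Lesions Based on Multiple Attention Mechanisms
17
Authors: 周慧, 宋新景. 《计算机科学与探索》 [PKU Core], 2026, No. 1, pp. 291-300 (10 pages)
Manual detection of spinal lesions is time-consuming and depends heavily on experts in the field, so automatic recognition of spinal lesions is highly desirable. However, spinal lesions vary widely in size, location, and structure, and spinal tumors closely resemble rare Brucella infection on imaging, so accurately localizing and classifying spinal lesions is challenging. To meet these challenges, an improved MRI image recognition model for spinal lesions is proposed. A bidirectional feature pyramid backbone based on ResNet-101 is introduced, and deformable convolution replaces conventional convolution at different layers to obtain more feature information from the feature layers. Multiple attention mechanisms, including self-attention and soft attention, are added to different modules to fuse the parts of the features that contribute most. To overcome the data imbalance among spinal tumors, infectious lesions, and rare Brucella infection, an improved balanced cross-entropy loss function is introduced. Validated on a clinical dataset provided by a hospital in Dalian, the model achieves a recognition precision of 94.2% and a recognition recall of 90.8%. Comparative experiments against other recognition models show that the proposed method performs better.
Keywords: spinal lesion recognition; bidirectional feature pyramid; multiple attention mechanisms; deformable convolution; multi-feature fusion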
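A Liver Pathology Image Segmentation Network Based on Spatial-Channel Adaptive Features
18
Authors: 王建宇, 王朝立, 孙占全, 刘晓虹. 《电子科技》, 2026, No. 1, pp. 9-17 (9 pages)
To address the high similarity and low contrast between lesion regions and surrounding tissue and the blurred boundaries in liver pathology images, this paper proposes a liver pathology segmentation network based on spatial-channel adaptive features. Hybrid calibrated attention lets the network adaptively select feature information calibrated along the spatial and channel dimensions, helping the encoder capture important features related to liver lesions. An atrous spatial pyramid pooling module is introduced at the deepest encoder layer to supply the multi-scale information that high-level features lack, improving segmentation accuracy. Comparative and ablation experiments were conducted on a private liver dataset, a public liver dataset, and two other public pathology datasets. The results show that the proposed network gives better segmentation results than other methods and effectively addresses hepatocellular carcinoma segmentation.
Keywords: hepatocellular carcinoma; pathology images; encoder-decoder architecture; hybrid calibrated attention module; spatial attention; channel attention; atrous spatial pyramid pooling module; multi-scale information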
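Application of Steerable Pyramid Decomposition to Reservoir Fracture Detection (Cited by: 4)
19
Authors: 林春, 王绪本, 刘四兵, 刘力辉. 《地球物理学进展》 [CSCD] [PKU Core], 2012, No. 1, pp. 279-287 (9 pages)
In the exploration of subtle oil and gas reservoirs, correctly characterizing reservoir fractures is crucial to both exploration and development. Reservoir fractures appear as edge features in seismic data, but current theories and methods of multi-scale edge detection all have inherent limitations. This paper describes in detail the principles and methods of using Steerable Pyramid decomposition for reservoir fracture detection. After an input 3D seismic slice is decomposed with the Steerable Pyramid, the characteristics and strike of fractures can be analyzed at different scales and orientations, and fracture information controlled by an S function can be obtained through reconstruction. This information is not merely an edge detection of the fractures but the enhanced fractures themselves. Applied to fracture detection on real seismic slices, the algorithm yields clear fracture information that shows the characteristics of these geological features completely and in detail, providing useful information for structural interpretation, seismic facies characterization, and reservoir prediction.
Keywords: Steerable Pyramid decomposition; polar-coordinate filter; basic directional filter; fracture detection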
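Locally Repairable Coding Based on Pyramid Codes in Distributed Storage Systems (Cited by: 5)
20
Authors: 王静, 张崇, 梁伟, 刘向阳. 《电子测量与仪器学报》 [CSCD] [PKU Core], 2017, No. 9, pp. 1481-1487 (7 pages)
To improve the storage reliability of distributed storage systems and the repair efficiency of failed nodes, a locally repairable coding scheme based on Pyramid codes is proposed. The scheme adopts the minimum achievable coding structure of Pyramid codes and partitions nodes into local repair groups, ensuring low repair locality and fast repair of failed nodes. Performance analysis shows that the scheme can quickly repair multiple failed nodes in a storage system with low repair locality, and that it outperforms three-replica replication and simple regenerating codes in storage overhead and repair bandwidth overhead.
Keywords: distributed storage systems; pyramid; regenerating codes; locally repairable coding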