489 articles found
Enhanced Multimodal Sentiment Analysis via Integrated Spatial Position Encoding and Fusion Embedding
1
Authors: Chenquan Gan, Xu Liu, Yu Tang, Xianrong Yu, Qingyi Zhu, Deepak Kumar Jain. Computers, Materials & Continua, 2025, Issue 12, pp. 5399-5421 (23 pages)
Multimodal sentiment analysis aims to understand emotions from text, speech, and video data. However, current methods often overlook the dominant role of text and suffer from feature loss during integration. Given the varying importance of each modality across different contexts, a central and pressing challenge in multimodal sentiment analysis lies in maximizing the use of rich intra-modal features while minimizing information loss during the fusion process. In response to these limitations, we propose a novel framework that integrates spatial position encoding and fusion embedding modules. In our model, text is treated as the core modality, while speech and video features are selectively incorporated through a position-aware fusion process. The spatial position encoding strategy preserves the internal structural information of the speech and visual modalities, enabling the model to capture localized intra-modal dependencies that are often overlooked. This design enhances the richness and discriminative power of the fused representation, enabling more accurate and context-aware sentiment prediction. Finally, we conduct comprehensive evaluations on two widely recognized standard datasets in the field, CMU-MOSI and CMU-MOSEI, to validate the performance of the proposed model. The experimental results demonstrate that our model exhibits good performance and effectiveness for sentiment analysis tasks.
Keywords: multimodal sentiment analysis; spatial position encoding; fusion embedding; feature loss reduction
A position distribution measurement method and mathematical modeling of two projectiles simultaneous hitting target based on three photoelectric encoder detection screens
2
Authors: Hanshan Li, Zixuan Cao, Xiaoqian Zhang. Defence Technology (防务技术), 2025, Issue 11, pp. 151-168 (18 pages)
To solve the problem of identifying and measuring two projectiles that hit the target at the same time, this paper proposes a projectile coordinate test method that combines three photoelectric encoder detection screens, and establishes a coordinate calculation model for two projectiles reaching the same detection screen at the same time. The design method of the three photoelectric encoder detection screens and the position coordinate recognition algorithm for the blocked array photoelectric detectors when a projectile passes through a photoelectric encoder detection screen are studied. Using the screen projection method, the linear equation of the intersection between the projectile and the line laser is established with the main detection screen as the core coordinate plane, and the projectile coordinate data set formed by any two photoelectric encoder detection screens is constructed. The principle of minimum error over the coordinate data set is used to determine the coordinates of two projectiles hitting the target at the same time. The rationality and feasibility of the proposed test method are verified by experiments and comparative tests.
Keywords: photoelectric encoder detection screen; projectile; matching and recognition; linear laser; position distribution
A MIXED FINITE ELEMENT AND UPWIND MIXED FINITE ELEMENT MULTI-STEP METHOD FOR THE THREE-DIMENSIONAL POSITIVE SEMI-DEFINITE DARCY-FORCHHEIMER MISCIBLE DISPLACEMENT PROBLEM
3
Authors: Yirang YUAN, Changfeng LI, Huailing SONG, Tongjun SUN. Acta Mathematica Scientia, 2025, Issue 2, pp. 715-736 (22 pages)
In this paper, a composite numerical scheme is proposed to solve the three-dimensional Darcy-Forchheimer miscible displacement problem with positive semi-definite assumptions. A mixed finite element is used for the flow equation. The velocity and pressure are computed simultaneously, and the accuracy of the velocity is improved by one order. The concentration equation is solved by using a mixed finite element, a multi-step difference, and an upwind approximation. A multi-step method is used to approximate the time derivative to improve the accuracy. The upwind approximation and an expanded mixed finite element are adopted to solve the convection and diffusion, respectively. The composite method can compute the diffusion flux and its gradient, and may become an efficient tool for solving convection-dominated diffusion problems. Firstly, the conservation of mass holds. Secondly, the multi-step method has high accuracy. Thirdly, the upwind approximation can avoid numerical dispersion. Using numerical analysis of a priori estimates and special techniques of differential equations, we give error estimates for a positive definite problem. Numerical experiments illustrate its computational efficiency and feasibility of application.
Keywords: Darcy-Forchheimer flow; three-dimensional positive semi-definite problem; upwind mixed finite element multi-step method; conservation of mass; convergence analysis
Position Encoding Based Convolutional Neural Networks for Machine Remaining Useful Life Prediction (cited by 5)
4
Authors: Ruibing Jin, Min Wu, Keyu Wu, Kaizhou Gao, Zhenghua Chen, Xiaoli Li. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2022, Issue 8, pp. 1427-1439 (13 pages)
Accurate remaining useful life (RUL) prediction is important in industrial systems. It prevents machines from working under failure conditions, and ensures that the industrial system works reliably and efficiently. Recently, many deep learning based methods have been proposed to predict RUL. Among these methods, recurrent neural network (RNN) based approaches show a strong capability of capturing sequential information. This allows RNN based methods to perform better than convolutional neural network (CNN) based approaches on the RUL prediction task. In this paper, we question this common paradigm and argue that existing CNN based approaches are not designed according to the classic principles of CNN, which reduces their performance. Additionally, the capacity to capture sequential information is highly affected by the receptive field of the CNN, which is neglected by existing CNN based methods. To solve these problems, we propose a series of new CNNs, which show results competitive with RNN based methods. Compared with an RNN, a CNN processes the input signals in parallel, so the temporal sequence is not easily determined. To alleviate this issue, a position encoding scheme is developed to enhance the sequential information encoded by a CNN. Hence, our proposed position encoding based CNN, called PE-Net, is further improved and even performs better than RNN based methods. Extensive experiments are conducted on the C-MAPSS dataset, where our PE-Net shows state-of-the-art performance.
Keywords: convolutional neural network (CNN); deep learning; position encoding; remaining useful life prediction
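The idea of adding a position encoding to a window of sensor readings before a parallel model sees it can be sketched as follows. This is a minimal illustration that assumes the standard sinusoidal encoding used in Transformers; the abstract does not say which scheme PE-Net uses, and the function names here are hypothetical.

```python
import math


def sinusoidal_position_encoding(seq_len, dim):
    """Return a seq_len x dim table; even columns use sin, odd columns use cos."""
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(dim):
            # Each sin/cos pair shares one frequency: 1 / 10000^(2k/dim), k = i // 2.
            angle = pos / (10000 ** ((2 * (i // 2)) / dim))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


def add_position_encoding(window):
    """Add the encoding element-wise to a window shaped (time steps x features)."""
    pe = sinusoidal_position_encoding(len(window), len(window[0]))
    return [[x + p for x, p in zip(row, pe_row)] for row, pe_row in zip(window, pe)]
```

After this step every time step carries a distinct signature, so a model that processes the window in parallel can still tell early readings from late ones.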
Three-dimensional positions of scattering centers reconstruction from multiple SAR images based on radargrammetry (cited by 3)
5
Authors: 钟金荣, 文贡坚, 回丙伟, 李德仁. Journal of Central South University (SCIE, EI, CAS, CSCD), 2015, Issue 5, pp. 1776-1789 (14 pages)
A method and procedure are presented to reconstruct the three-dimensional (3D) positions of scattering centers from multiple synthetic aperture radar (SAR) images. Firstly, two-dimensional (2D) attributed scattering centers of targets are extracted from 2D SAR images. Secondly, a similarity measure is developed based on the location and type of the 2D attributed scattering centers and on the radargrammetry principle between multiple SAR images. Using this similarity, we associate 2D scattering centers and then obtain candidate 3D scattering centers. Thirdly, these candidate scattering centers are clustered in 3D space to reconstruct the final 3D positions. Compared with existing methods, the proposed method can describe distributed scattering centers, reduces false and missing 3D scattering centers, and places fewer restrictions on modeling data. Finally, experimental results demonstrate the effectiveness of the proposed method.
Keywords: multiple synthetic aperture radar (SAR) images; three-dimensional scattering center; position reconstruction; radargrammetry
New Three-Dimensional Assessment Model and Optimization of Acoustic Positioning System (cited by 1)
6
Authors: Lin Zhao, Xiaobo Chen, Jianhua Cheng, Lianhua Yu, Chengcai Lv, Jiuru Wang. Computers, Materials & Continua (SCIE, EI), 2020, Issue 8, pp. 1005-1023 (19 pages)
This paper addresses the problem of assessing and optimizing the acoustic positioning system for underwater target localization with range measurement. We present a new three-dimensional assessment model to evaluate whether the optimal geometric beacon formation meets user requirements. For mathematical tractability, it is assumed that the measurements of the range between the target and beacons are corrupted with white Gaussian noise whose variance is distance-dependent. Then, the relationship between DOP parameters and positioning accuracy can be derived by adopting dilution of precision (DOP) parameters in the assessment model. In addition, the optimal geometric beacon formation yielding the best performance can be achieved by minimizing the value of the geometric dilution of precision (GDOP) in the case where the target position is known and fixed. Next, to ensure that the estimated positioning accuracy over the region of interest satisfies the precision required by the user, geometric positioning accuracy (GPA), horizontal positioning accuracy (HPA), and vertical positioning accuracy (VPA) are used to assess the optimal geometric beacon formation. Simulation examples are designed to illustrate the correctness of the conclusion. Unlike other work that only uses GDOP to optimize the formation and cannot assess the performance of the specified size, this new three-dimensional assessment model can evaluate the optimal geometric beacon formation for each dimension of any point in three-dimensional space, which can provide guidance for optimizing the performance of each specified dimension.
Keywords: acoustic positioning system; three-dimensional assessment model; positioning accuracy; DOP; optimal configuration
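For readers unfamiliar with GDOP, the quantity minimized in the entry above can be sketched for range-only measurements. This is an illustrative simplification: it assumes equal unit-variance noise on every range (the paper assumes distance-dependent variance), and the function names are hypothetical rather than taken from the paper.

```python
import math


def unit_vectors(target, beacons):
    """Rows of the range-measurement Jacobian: unit vectors from each beacon to the target."""
    rows = []
    for beacon in beacons:
        diff = [t - b for t, b in zip(target, beacon)]
        dist = math.sqrt(sum(x * x for x in diff))
        rows.append([x / dist for x in diff])
    return rows


def inverse_3x3(m):
    """Invert a 3x3 matrix via the adjugate; raise if the beacon geometry is degenerate."""
    a, b, c = m[0]
    d, e, f = m[1]
    g, h, i = m[2]
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("degenerate beacon geometry")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def gdop(target, beacons):
    """GDOP = sqrt(trace((H^T H)^-1)) for the Jacobian H of the range measurements."""
    h = unit_vectors(target, beacons)
    hth = [[sum(row[r] * row[c] for row in h) for c in range(3)] for r in range(3)]
    q = inverse_3x3(hth)
    return math.sqrt(q[0][0] + q[1][1] + q[2][2])
```

A lower value means the beacon geometry amplifies range noise less; beacons that surround the target evenly give smaller values than beacons clustered in one direction.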
Automatic UAV Positioning with Encoded Sign as Cooperative Target
7
Authors: Xu Zhongxiong, Shao Guiwei, Wu Liang, Xie Yuxing, Ji Zheng. Transactions of Nanjing University of Aeronautics and Astronautics (EI, CSCD), 2017, Issue 6, pp. 669-679 (11 pages)
To achieve automatic positioning of an unmanned aerial vehicle (UAV) during power inspection, a visual positioning method that uses an encoded sign as a cooperative target is proposed. Firstly, we discuss how to design the encoded sign and propose a robust contour-based decoding algorithm. Secondly, the AdaBoost algorithm is used to train a classifier that detects the encoded sign in an image. Lastly, the position of the UAV is calculated using the projective relation between the object points and their corresponding image points. The experiment has two parts. First, simulated video data is used to verify the feasibility of the proposed method, and the results show that the average absolute error in each direction is below 0.02 m. Second, a video acquired from an actual UAV flight is used to calculate the position of the UAV, and the results show that the calculated trajectory is consistent with the actual flight path. The method runs at 0.153 s per frame.
Keywords: unmanned aerial vehicle (UAV); cooperative target; encoded sign; visual positioning
Integration system research and development for three-dimensional laser scanning information visualization in goaf (cited by 2)
8
Authors: 罗周全, 黄俊杰, 罗贞焱, 汪伟, 秦亚光. Transactions of Nonferrous Metals Society of China (SCIE, EI, CAS, CSCD), 2016, Issue 7, pp. 1985-1994 (10 pages)
An integrated processing system for visualizing three-dimensional laser scanning information in goaf was developed. It provides multiple functions, such as laser scanning information management for goaf, point cloud data de-noising and optimization, construction, display, and operation of three-dimensional models, model editing, profile generation, calculation of goaf volume and roof area, Boolean calculation among models, and interaction with third-party software. The system has a concise interface and plentiful data input/output interfaces, and features high integration and simple, convenient operation. In practice, the system has proved adaptable, reliable, and stable.
Keywords: goaf; laser scanning; visualization; integration system
1 Introduction
The goaf formed through underground mining of mineral resources is one of the main disaster sources threatening mine safety production [1, 2]. Effective implementation of goaf detection and accurate acquisition of its spatial characteristics, including the three-dimensional morphology, the spatial position, and the actual boundary and volume, are an important basis for analyzing, predicting, and controlling disasters caused by goaf. In recent years, three-dimensional laser scanning technology has been effectively applied in goaf detection [3, 4]. Large quantities of point cloud data acquired for goaf by means of the three-dimensional laser scanning system are processed with relevant engineering software to generate a three-dimensional model of the goaf. Then, a general modeling analysis and processing instrument is introduced to perform subsequent three-dimensional analysis and calculation [5, 6]. Moreover, related development is also carried out in fields such as three-dimensional detection and visualization of hazardous goaf, detection and analysis of unstable failures in goaf, extraction boundary acquisition in stope, visualized computation of damage index, aided design for pillar recovery and three-dimensional detection
Effects of the initiation position on the damage and fracture characteristics of linear-charge blasting in rock (cited by 4)
9
Authors: Chenxi Ding, Renshu Yang, Xiao Guo, Zhe Sui, Chenglong Xiao, Liyun Yang. International Journal of Minerals, Metallurgy and Materials (SCIE, EI, CAS, CSCD), 2024, Issue 3, pp. 443-451 (9 pages)
To study the effects of the initiation position on the damage and fracture characteristics of linear-charge blasting, blasting model experiments were conducted in this study using computed tomography scanning and three-dimensional reconstruction methods. The fractal damage theory was used to quantify the crack distribution and damage degree of sandstone specimens after blasting. The results showed that, regardless of inverse or top initiation, the plugging medium of the borehole is effective owing to compression deformation and sliding frictional resistance. The energy of the explosive gas near the top of the borehole is consumed. This affects the effective crushing of rocks near the top of the borehole, where the extent of damage to Sections Ⅰ and Ⅱ is less than that of Sections Ⅲ and Ⅳ. In addition, the analysis revealed that under conditions of top initiation, the reflected tensile damage of the rock at the free face of the top of the borehole and the compression deformation of the plug and friction consume more blasting energy, resulting in lower blasting energy efficiency for top initiation. As a result, the overall damage degree of the specimens in the top-initiation group was significantly smaller than that in the inverse-initiation group. Under conditions of inverse initiation, the blasting energy efficiency is greater, causing the specimen to experience greater damage. Therefore, in the engineering practice of rock tunnel cut blasting, the inverse-initiation method is recommended to use blasting energy effectively and enhance rock fragmentation.
In addition, in three-dimensional (3D) rock blasting, the bottom of the borehole has obvious end effects under the conditions of inverse initiation, and the crack distribution at the bottom of the borehole is trumpet-shaped. The occurrence of an end effect in the 3D linear-charge blasting model experiment is related to the initiation position and the blocking condition.
Keywords: blasting; linear charge; initiation position; computed tomography; three-dimensional reconstruction; damage
PCATNet: Position-Class Awareness Transformer for Image Captioning
10
Authors: Ziwei Tang, Yaohua Yi, Changhui Yu, Aiguo Yin. Computers, Materials & Continua (SCIE, EI), 2023, Issue 6, pp. 6007-6022 (16 pages)
Existing image captioning models usually build the relation between visual information and words to generate captions, and therefore lack spatial information and object classes. To address the issue, we propose a novel Position-Class Awareness Transformer (PCAT) network, which serves as a bridge between the visual features and captions by embedding spatial information and awareness of object classes. We construct the PCAT network by proposing a novel Grid Mapping Position Encoding (GMPE) method and refining the encoder-decoder framework. First, GMPE maps the regions of objects to grids, calculates the relative distance among objects, and applies quantization. Meanwhile, we also adapt the self-attention to GMPE. Then, we propose a Classes Semantic Quantization strategy to extract semantic information from the object classes, which is used to facilitate embedding features and refine the encoder-decoder framework. To capture the interaction between multi-modal features, we propose Object Classes Awareness (OCA) to refine the encoder and decoder, namely OCAE and OCAD, respectively. Finally, we apply GMPE, OCAE, and OCAD in various combinations to complete the entire PCAT. We use the MSCOCO dataset to evaluate the performance of our method. The results demonstrate that PCAT outperforms the other competitive methods.
Keywords: image captioning; relative position encoding; object classes awareness
Three-dimensional Information Decoupling System Based on PSD and Deviation Correction
11
Authors: YAN Chao-chao, LU Jin, YANG Hai-ma. International English Education Research, 2015, Issue 3, pp. 101-107 (7 pages)
A three-dimensional information decoupling system based on a position sensitive detector (PSD) was designed in LabVIEW to meet the precision, timeliness, and reliability requirements of the PSD used in the ATP system of satellite-Earth quantum communication. Firstly, a laser light source was driven by a stepper motor to scan the PSD photosensitive surface, and the voltage values were collected and used to calculate the spot position. After analyzing the cause of the nonlinearity, a mathematical model between the actual and measured values was built using the binary quadratic polynomial method, which yields the PSD nonlinear correction function. Then, the micro-displacement and angle offset of the object were measured by combining the optical triangulation method, and the error of the measurement results was corrected. Experimental results showed that, after correction, the measuring deviation was significantly reduced, the PSD performance calibration requirements were met, and the efficiency of the system was greatly improved by using LabVIEW.
Keywords: position sensitive detector (PSD); three-dimensional information decoupling system; binary quadratic polynomial method; micro-displacement measurement; angle measurement
Network Intrusion Detection Method Based on a Bidirectional Temporal Window Transformer
12
Authors: 王长浩, 王明阳, 丁磊, 刘凯. 《计算机应用研究》 (PKU Core), 2026, Issue 1, pp. 271-279 (9 pages)
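In recent years, highly dynamic and stealthy network attacks have posed a serious threat to the security and stability of the Internet. To address the insufficient accuracy of local temporal modeling and the poor recognition of minority classes in multi-class settings in existing network intrusion detection methods, this paper proposes a network anomalous-traffic detection method based on a bidirectional temporal sliding-window Transformer. The method converts network traffic data into three-dimensional sequence data that highlights temporal relationships, and introduces learnable embedding encoding and contextual position encoding to strengthen the representation of sequence features, improving the accuracy and stability of anomalous-traffic detection. It is validated on the public UNSW-NB15 and CIC-IDS-2017 datasets. Experimental results show that the proposed method performs well on both: detection accuracy reaches 99.79% and 99.77% in binary classification, and 98.48% and 99.76% in multi-class classification, significantly higher than other advanced deep learning models. In summary, the method effectively improves the accuracy of anomalous-traffic detection and the recognition of minority-class attacks, providing a new technical means for network security protection.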
Keywords: intrusion detection; network traffic; bidirectional temporal window; contextual position encoding
Low-Overlap Point Cloud Registration Network Fusing Position Encoding and Overlap Masks
13
Authors: 喇孝伟, 胡立华, 胡建华, 姚晓玲, 王欣波. 《计算机应用》 (PKU Core), 2026, Issue 2, pp. 536-545 (10 pages)
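To address the insufficient feature description of keypoints and the small overlapping region in point cloud registration methods under low-overlap scenes, which lead to a high mismatch rate and low registration accuracy, a low-overlap point cloud registration network fusing position encoding and overlap masks is designed to reduce the mismatch rate and improve registration accuracy. First, a PointNet point-wise feature encoder extracts keypoints from the point clouds, and the keypoints' feature information, coordinate information, and position encoding are fused to generate more discriminative keypoint features. Second, the fused features are fed into self-attention and cross-attention modules to strengthen the descriptive power of the point cloud features and the exchange of contextual information between point clouds, addressing the insufficient keypoint description. Third, an overlap mask module is introduced after the attention modules; it learns an overlap mask that removes keypoints in non-overlapping regions, reducing the mismatch rate. Finally, the Sinkhorn algorithm is used for optimal matching, and the Iterative Closest Point (ICP) algorithm refines the result to improve registration accuracy. Comparative experiments against several existing low-overlap point cloud registration methods on the CODD and KITTI datasets show that the network performs better after ICP refinement. In particular, on the CODD dataset, compared with the current advanced low-overlap method CoFiI2P (Coarse-to-Fine correspondences for Image-to-Point cloud registration), the Relative Translation Error (RTE) and Relative Rotation Error (RRE) are reduced by 53.29% and 42.72%, respectively, and the Registration Recall (RR) is improved by 0.2 percentage points. The network can therefore fully extract descriptive keypoint features and effectively improve registration accuracy in low-overlap scenes.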
Keywords: low-overlap scene; point cloud registration; position encoding; overlap mask; self-attention; cross-attention
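The Sinkhorn matching step named in the entry above can be illustrated with plain Sinkhorn normalization. This is a minimal sketch that assumes a square matrix of strictly positive similarity scores; the network in the paper presumably works on learned scores and may include extra handling for unmatched points, which is not shown here.

```python
def sinkhorn(scores, iterations=50):
    """Alternately normalize rows and columns of a positive square matrix.

    The result approaches a doubly stochastic matrix whose large entries
    can be read as soft correspondences between two keypoint sets.
    """
    m = [row[:] for row in scores]  # work on a copy
    n = len(m)
    for _ in range(iterations):
        # Row step: every source keypoint distributes a total mass of 1.
        for r in range(n):
            total = sum(m[r])
            m[r] = [v / total for v in m[r]]
        # Column step: every target keypoint receives a total mass of 1.
        for c in range(n):
            total = sum(m[r][c] for r in range(n))
            for r in range(n):
                m[r][c] /= total
    return m
```

Taking the largest entry in each row of the returned matrix gives a hard assignment that can then be refined by a step such as ICP.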
Aspect Sentiment Triplet Extraction Based on Dual Encoders and Knowledge Enhancement
14
Authors: 邓飞, 韩虎, 穆一茹, 徐学锋. 《计算机工程与科学》 (PKU Core), 2026, Issue 2, pp. 299-308 (10 pages)
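Aspect sentiment triplet extraction aims to identify aspect terms, opinion terms, and the corresponding sentiment polarity from a sentence. Existing studies do not fully consider the association between aspect terms and opinion terms, extract semantic information insufficiently, and use background knowledge incompletely. To address these problems, an aspect sentiment triplet extraction model based on dual encoders and knowledge enhancement is proposed. First, BERT and Bi-LSTM are used together as dual encoders to mine semantic information in the sentence at different levels, and external sentiment knowledge is fused in to enhance the sentiment expression of the text. Second, interactive attention with position embedding is used to learn the relationship between aspects and opinions through iterative interaction. Finally, a boundary-driven table-filling method predicts the triplets. Experimental results show that, compared with the mainstream model GTS-BERT, the model improves the F1 score on four public datasets by 9.43, 7.32, 7.43, and 4.78 percentage points, respectively, and can extract triplets accurately.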
Keywords: aspect sentiment triplet extraction; dual encoders; interactive attention; sentiment knowledge; position embedding
A Text-to-Image Generation Method Based on an Improved Diffusion Transformer
15
Authors: 童至慧, 孙立洋, 董雪, 许硕贵. 《计算机与现代化》, 2026, Issue 1, pp. 76-82 (7 pages)
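Diffusion Transformer models are currently the mainstream models for text-to-image generation, but they often depict image structure poorly and are costly to train on high-resolution images. To address these problems, this paper proposes an improved diffusion Transformer model, U-CrossDiT. Considering the two-dimensional positional nature of image data, the paper first introduces two-dimensional rotary position encoding to strengthen the model's understanding of two-dimensional structural position information in images; it also improves the attention mechanism and network structure to further increase model capability. To reduce the training cost of high-resolution image generation, the paper introduces position interpolation, which exploits the position extrapolation capability of two-dimensional rotary position encoding to significantly reduce the computational cost of going from low-resolution to high-resolution training. On the MS-COCO data, the model is clearly better than other text-to-image generation models in both automatic evaluation metrics and human evaluation, and ablation experiments verify the effectiveness of the two-dimensional rotary position encoding and the network structure improvements. The results show that the model has a clear advantage in image quality, depiction of subject structure, and relevance between the image and the text input.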
Keywords: text-to-image generation; diffusion model; Transformer model; position encoding; deep learning
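Two-dimensional rotary position encoding, as mentioned in the entry above, can be sketched in its commonly used axial form: half of the channels are rotated by the row index and the other half by the column index. The abstract does not give U-CrossDiT's exact formulation, so this is an assumption, and the function names are hypothetical. The channel count is assumed to be a multiple of 4.

```python
import math


def rotate_pairs(vec, position, base=10000.0):
    """1D rotary encoding: rotate consecutive channel pairs by position-dependent angles."""
    out = []
    half = len(vec) // 2
    for k in range(half):
        theta = position / (base ** (k / half))
        x, y = vec[2 * k], vec[2 * k + 1]
        out.append(x * math.cos(theta) - y * math.sin(theta))
        out.append(x * math.sin(theta) + y * math.cos(theta))
    return out


def rope_2d(vec, row, col):
    """Rotate the first half of the channels by the row index, the second half by the column index."""
    mid = len(vec) // 2
    return rotate_pairs(vec[:mid], row) + rotate_pairs(vec[mid:], col)


def dot(a, b):
    """Plain dot product, standing in for one attention logit."""
    return sum(x * y for x, y in zip(a, b))
```

With this construction the attention logit between a rotated query and a rotated key depends only on their row and column offsets, not on their absolute grid positions. Under the same sketch, position interpolation would amount to scaling `row` and `col` by a factor below 1 so that a larger grid reuses the angle range seen at low resolution; whether the paper does exactly this is not stated in the abstract.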
Alzheimer's Disease Diagnosis Fusing Multimodal Information and Position Encoding
16
Authors: 刘蓉, 刘汝璇, 李广昶, 柴新宇, 谭桂梅, 唐奇伶. 《中南民族大学学报(自然科学版)》, 2026, Issue 2, pp. 212-220 (9 pages)
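Alzheimer's disease (AD) is a fatal neurodegenerative disease, and its early diagnosis and the precise prediction of pathological regions are extremely important for slowing disease progression and improving patient prognosis. Although past research has advanced automated diagnosis and existing methods already achieve good diagnostic accuracy, model interpretability remains the biggest problem troubling clinical research. Against this background, an AD diagnosis model combining three-dimensional position encoding with multimodal information is proposed. The model integrates three-dimensional position encoding, the Transformer self-attention mechanism, and a fully convolutional network (FCN). It automatically extracts effective features from three-dimensional medical imaging data, generates a high-resolution disease probability map representing the whole brain, and fuses this probability map with objective clinical information through a multimodal attention mechanism. This achieves accurate predictive diagnosis of AD while making the model's decision process more interpretable.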
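Keywords: Alzheimer's disease; magnetic resonance imaging; fully convolutional network; three-dimensional position encoding; multimodal attention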
BEV Lane Detection Based on Causal Intervention
17
Authors: 李睿豪, 于红绯. 《电子测量技术》 (PKU Core), 2026, Issue 1, pp. 226-236 (11 pages)
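To address feature blurring and false detections in bird's-eye-view (BEV) lane detection caused by environmental interference such as sudden illumination changes and extreme weather, this paper proposes a BEV lane detection framework based on causal intervention. First, to improve feature representation during the BEV space transformation, a composite position encoding is designed and fused into the front-view features to maintain spatial continuity and consistency. Second, after the BEV features are obtained, a causal intervention module is built; it generates counterfactual features to explicitly decouple lane features from environmental interference, improving the model's stability in extreme environments. Finally, a feature fusion module performs dynamic calibration of multi-scale features and interference suppression, and a global attention mechanism enhances the BEV features. Experimental results show that on the three subsets of the Apollo dataset, the F1 score is 0.8%, 1%, and 3% higher than that of the second-best model; on challenging OpenLane scenes including extreme weather, night, and intersections, the F1 score is also the best. The method explicitly decouples lane features from environmental interference and provides a highly robust solution for autonomous driving perception in complex environments.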
Keywords: computer vision; lane detection; causal intervention; position encoding; bird's-eye view
Human-Object Interaction Detection Based on Multi-Dimensional Frequency-Domain Feature Fusion
18
Authors: 樊跃波, 陈明轩, 汤显, 高永彬, 李文超. 《计算机应用》 (PKU Core), 2026, Issue 2, pp. 580-586 (7 pages)
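The goal of human-object interaction (HOI) detection is to detect all interactions between humans and objects in an image. Most current research uses an encoder-decoder structure trained end to end, relies on absolute position encoding (APE), and performs poorly in complex multi-object interaction scenes. Because existing methods rely on APE, they struggle to capture the relative spatial relationships between humans and objects, and they integrate local and global information insufficiently in complex multi-object scenes. To address these problems, an HOI detection model combining cross-dimensional interaction feature extraction with frequency-domain feature fusion is proposed. First, the traditional Transformer encoder is improved by additionally introducing a relative position encoding (RPE) and fusing RPE with APE, so that the model can model the relative relationships between humans and objects. Second, a new feature extraction module is introduced to strengthen the integration of image information: cross-dimensional interaction captures interaction features across the channel, spatial, and feature dimensions of the image to improve representational power, while the discrete cosine transform (DCT) extracts frequency-domain features to capture richer local and global information. Finally, the Wise-IoU loss function is used to improve detection accuracy and class discrimination, so that the model handles targets of different classes more flexibly. Experiments on the public HICO-DET and V-COCO datasets show that, compared with the GEN-VLKT (Guided Embedding Network Visual-Linguistic Knowledge Transfer) model, the proposed model improves mean average precision (mAP) on the full set of HICO-DET categories by 0.95 percentage points and AP on V-COCO Scenario 1 by 0.9 percentage points.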
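Keywords: human-object interaction detection; object detection; relative position encoding; frequency-domain features; discrete cosine transform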
Research on a 3D Point Cloud Upsampling Method Based on Localization Refinement
19
Authors: 何龙, 吴智赫, 刘吴杰, 匡振中. 《软件工程》, 2026, Issue 1, pp. 32-37, 54 (7 pages)
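To address the loss of detail in current point cloud sampling methods, a multi-layer 3D point cloud upsampling method based on localization refinement is proposed; it takes a sparse point cloud as input and produces a high-quality upsampled point cloud. The method consists of two parts: coarse point cloud generation and localization refinement. In the coarse generation stage, an autoencoder architecture based on plane folding coarsely reconstructs the input sparse 3D point cloud. In the localization refinement stage, refinement units and an offset regression mechanism correct the positions of the coarse point cloud, and local and global refinement modules are introduced to learn finer geometric structure. Compared with existing methods of the same kind, the proposed method not only achieves better results on the shape sampling task but also performs well on the shape completion task.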
Keywords: point cloud upsampling; plane-folding encoder; localization refinement; offset regression
Bidirectional Transformer with absolute-position aware relative position encoding for encoding sentences (cited by 2)
20
Authors: Le QI, Yu ZHANG, Ting LIU. Frontiers of Computer Science (SCIE, EI, CSCD), 2023, Issue 1, pp. 63-71 (9 pages)
Transformers have been widely studied in many natural language processing (NLP) tasks. They can capture dependencies across the whole sentence with high parallelizability thanks to multi-head attention and the position-wise feed-forward network. However, these two components are position-independent, which makes transformers weak at modeling sentence structures. Existing studies commonly use positional encoding or mask strategies to capture the structural information of sentences. In this paper, we aim to strengthen the ability of transformers to model the linear structure of sentences in three respects: the absolute position of tokens, the relative distance between tokens, and the direction between tokens. We propose a novel bidirectional Transformer with absolute-position aware relative position encoding (BiAR-Transformer) that combines the positional encoding and the mask strategy. We model the relative distance between tokens along with the absolute position of tokens by a novel absolute-position aware relative position encoding. Meanwhile, we apply a bidirectional mask strategy to model the direction between tokens. Experimental results on natural language inference, paraphrase identification, sentiment classification, and machine translation tasks show that BiAR-Transformer outperforms other strong baselines.
Keywords: Transformer; relative position encoding; bidirectional mask strategy; sentence encoder