39,619 articles found
Relevant Visual Semantic Context-Aware Attention-Based Dialog
1
Authors: Eugene Tan Boon Hong, Yung-Wey Chong, Tat-Chee Wan, Kok-Lim Alvin Yau. 《Computers, Materials & Continua》, SCIE, EI, 2023, Issue 8, pp. 2337-2354 (18 pages)
The existing dataset for visual dialog comprises multiple rounds of questions and a diverse range of image contents. However, it faces challenges in overcoming visual semantic limitations, particularly in obtaining sufficient context from visual and textual aspects of images. This paper proposes a new visual dialog dataset called Diverse History-Dialog (DS-Dialog) to address the visual semantic limitations faced by the existing dataset. DS-Dialog groups relevant histories based on their respective Microsoft Common Objects in Context (MSCOCO) image categories and consolidates them for each image. Specifically, each MSCOCO image category consists of top relevant histories extracted based on the semantic relationships between the original image caption and historical context. These relevant histories are consolidated for each image, and DS-Dialog enhances the current dataset by adding new context-aware relevant history to provide more visual semantic context for each image. The new dataset is generated through several stages, including image semantic feature extraction, keyphrase extraction, relevant question extraction, and relevant history dialog generation. The DS-Dialog dataset contains about 2.6 million question-answer pairs, where 1.3 million pairs correspond to existing VisDial question-answer pairs, and the remaining 1.3 million pairs include a maximum of 5 image features for each VisDial image, with each image comprising 10-round relevant question-answer pairs. Moreover, a novel adaptive relevant history selection is proposed to resolve missing visual semantic information for each image. DS-Dialog is used to benchmark the performance of previous visual dialog models and achieves better performance than previous models. Specifically, the proposed DS-Dialog model achieves an 8% higher mean reciprocal rank (MRR), 11% higher R@1, 6% higher R@5, 5% higher R@10, and 8% higher normalized discounted cumulative gain (NDCG) compared to LF. DS-Dialog also achieves approximately 1 point improvement on R@k, mean, MRR, and NDCG compared to the original RVA, and 2 points improvement compared to LF and DualVD. These results demonstrate the importance of the relevant semantic historical context in enhancing the visual semantic relationship between textual and visual representations of the images and questions.
Keywords: Visual dialog; context-aware; relevant history; computer vision; natural language processing
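The ranking metrics quoted in the abstract above (MRR and R@k) can be illustrated with a minimal Python sketch. The function names and the example ranks are hypothetical and are not taken from the paper; the sketch only assumes that each question yields the 1-based rank of its ground-truth answer among the candidates.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all questions (ranks are 1-based)."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def recall_at_k(ranks, k):
    """Fraction of questions whose ground-truth answer is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical ranks of the ground-truth answer for four questions.
example_ranks = [1, 2, 4, 10]
mrr = mean_reciprocal_rank(example_ranks)
r_at_1 = recall_at_k(example_ranks, 1)
r_at_5 = recall_at_k(example_ranks, 5)
```

A higher MRR rewards placing the correct answer near the top of the list, while R@k only checks whether it appears within the first k candidates.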
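Airborne Integrated Navigation Algorithm Based on a CNN-BiLSTM-Attention Model under GNSS Outage
2
Authors: 赵桂玲, 汪远, 石茜宇, 周彤. 《中国惯性技术学报》, PKU Core, 2026, Issue 1, pp. 60-66, 72 (8 pages)
To address the error divergence of inertial navigation system (INS)/GNSS integrated navigation systems caused by loss of lock on global navigation satellite system (GNSS) signals, an airborne integrated navigation algorithm based on a CNN-BiLSTM-Attention model is proposed. An attention mechanism is introduced into CNN-BiLSTM to build the CNN-BiLSTM-Attention model, which is trained on inertial measurement unit outputs, INS attitude information, and GNSS navigation information collected while the GNSS signal is normal. The trained model predicts GNSS navigation information during signal outages, which resolves the missing-information problem and improves flight trajectory prediction accuracy. Experimental results show that, when the GNSS signal is lost and the flight trajectory changes abruptly, the positioning accuracy of the integrated navigation system based on the CNN-BiLSTM-Attention model is better than that of the BiLSTM and CNN-BiLSTM models: compared with the BiLSTM model, velocity accuracy improves by 26.74% to 72.97% and position accuracy by 28.67% to 65.22%; compared with the CNN-BiLSTM model, velocity accuracy improves by 3.33% to 28.57% and position accuracy by 2.88% to 32.03%.
Keywords: GNSS signal outage; INS/GNSS integrated navigation system; CNN-BiLSTM-Attention model; abrupt trajectory change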
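Ultra-Short-Term Photovoltaic Power Generation Forecasting Method Based on a TCN-BiLSTM-Attention Model
3
Authors: 刘凯伦, 孙广玲, 陆小锋. 《工业控制计算机》, 2026, Issue 1, pp. 122-124 (3 pages)
As the share of photovoltaic (PV) generation in the global energy system continues to rise, ultra-short-term forecasting of PV power generation is crucial for power system dispatch and safe operation. However, PV generation is affected by many factors and shows pronounced randomness and volatility. To this end, an ultra-short-term PV generation forecasting method based on a TCN-BiLSTM-Attention model is proposed. First, key features are selected through Pearson correlation analysis, outliers are detected with the isolation forest algorithm, and data preprocessing is completed with linear interpolation and standardization. Then, a Temporal Convolutional Network (TCN) extracts temporal features, a Bidirectional Long Short-Term Memory (BiLSTM) network captures forward and backward temporal dependencies, and an attention mechanism at the output focuses on the features of key time steps. Finally, comparative experiments on the Desert Knowledge Australia Solar Centre (DKASC) dataset show that, compared with conventional LSTM and BiLSTM models, the proposed TCN-BiLSTM-Attention model shows some advantage in prediction accuracy and stability.
Keywords: TCN; BiLSTM; attention; ultra-short-term power generation forecasting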
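The first preprocessing step named in the abstract above, Pearson-correlation feature screening, might look roughly like the sketch below. The feature names, the sample values, and the 0.5 threshold are all hypothetical; the abstract does not state which features or cut-off the authors used.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def select_features(features, target, threshold=0.5):
    """Keep features whose absolute correlation with the target meets the threshold."""
    return [name for name, values in features.items()
            if abs(pearson(values, target)) >= threshold]


# Hypothetical PV power samples and two candidate input features.
power = [0.0, 1.0, 2.0, 3.0, 4.0]
features = {
    "irradiance": [0.0, 2.0, 4.0, 6.0, 8.0],   # moves in step with power
    "humidity": [1.0, -1.0, 0.0, -1.0, 1.0],   # uncorrelated with power here
}
selected = select_features(features, power)
```

In a fuller pipeline the surviving features would then go through outlier detection, interpolation, and standardization before reaching the network.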
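Dam Deformation Prediction Method and Outlier Identification Based on an LSTM Improved with RF and Self-Attention
4
Authors: 都旭煌, 田振宇, 齐智勇, 毛延翩, 汤正阳, 王波, 远近, 牟猷. 《水电能源科学》, PKU Core, 2026, Issue 2, pp. 167-173 (7 pages)
Deformation is a direct physical quantity reflecting the structural behavior of a dam, and improving deformation prediction accuracy is key to ensuring safe and stable dam operation. Deformation influencing factors are extracted based on a statistical deformation model, random forest (RF) is used to select the best factors, and a self-attention mechanism is used to optimize a long short-term memory (LSTM) neural network, yielding a new deformation prediction model. First, an initial factor set is built from the influencing factors contained in the statistical model. Second, RF selects the factors with a strong influence on deformation for predictive modeling, which reduces model complexity and improves prediction accuracy. Finally, a self-attention strategy is introduced on top of the LSTM algorithm to strengthen its ability to mine temporal relationships in deformation, completing the RF-LSTM/Self-attention deformation prediction model. Case results show that the proposed method predicts deformation more accurately than the comparison methods, with maximum improvement ratios of 57.81%, 59.59%, and 5.94% in root mean square error, mean absolute error, and coefficient of determination, respectively, verifying the effectiveness of the RF-LSTM/Self-attention model for dam deformation prediction. Applied to deformation anomaly identification, the proposed method can effectively identify anomalous data in deformation records, verifying its extensibility.
Keywords: dam deformation prediction; random forest; factor selection; self-attention mechanism; LSTM; outlier identification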
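Time-Series Prediction Method for Deep Foundation Pit Deformation Based on an LSTM-Attention Model
5
Authors: 许嘉辉. 《山西建筑》, 2026, Issue 7, pp. 94-97, 102 (5 pages)
Foundation pit monitoring data have pronounced time-series characteristics, and the gating mechanism of LSTM can effectively model their nonlinear dynamics. On this basis, the paper proposes a time-series prediction method for deep foundation pit deformation based on an LSTM-Attention model. The method builds a neural network consisting of an input layer, an LSTM layer, an attention layer, a fully connected layer, and an output layer to capture long-term dependencies and time-series features in deep foundation pit deformation data. Monitoring data from a deep foundation pit construction site, including displacement, settlement, and earth pressure, are processed and used to train the proposed model and make predictions, achieving accurate prediction of the deformation trend. Experimental results show that the model performs well in prediction accuracy and goodness of fit, with good generalization ability and practical value.
Keywords: attention mechanism; deep foundation pit deformation; time-series analysis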
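The attention layer that sits between the LSTM layer and the fully connected layer in the abstract above can be sketched as a softmax-weighted sum over per-time-step hidden states. This is only an illustration: the abstract does not specify the scoring function, so dot-product scoring against a query vector is an assumption, and the hidden states and query below are made-up numbers rather than learned values.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Score each time step against the query, then return weights and weighted sum."""
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return weights, context


# Hypothetical LSTM outputs for three time steps (hidden size 2) and a query vector.
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]
weights, context = attention_pool(hidden_states, query)
```

The third time step scores highest against the query, so it receives the largest weight and dominates the context vector passed on to the fully connected layer.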
Speech Emotion Recognition Based on the Adaptive Acoustic Enhancement and Refined Attention Mechanism
6
Authors: Jun Li, Chunyan Liang, Zhiguo Liu, Fengpei Ge. 《Computers, Materials & Continua》, 2026, Issue 3, pp. 2015-2039 (25 pages)
To enhance speech emotion recognition capability, this study constructs a speech emotion recognition model integrating the adaptive acoustic mixup (AAM) and improved coordinate and shuffle attention (ICASA) methods. The AAM method optimizes data augmentation by combining a sample selection strategy and dynamic interpolation coefficients, thus enabling information fusion of speech data with different emotions at the acoustic level. The ICASA method enhances feature extraction capability through dynamic fusion of the improved coordinate attention (ICA) and shuffle attention (SA) techniques. The ICA technique reduces computational overhead by employing depth-separable convolution and an h-swish activation function and captures long-range dependencies of multi-scale time-frequency features using the attention weights. The SA technique promotes feature interaction through channel shuffling, which helps the model learn richer and more discriminative emotional features. Experimental results demonstrate that, compared to the baseline model, the proposed model improves the weighted accuracy by 5.42% and 4.54%, and the unweighted accuracy by 3.37% and 3.85% on the IEMOCAP and RAVDESS datasets, respectively. These improvements were confirmed to be statistically significant by independent samples t-tests, further supporting the practical reliability and applicability of the proposed model in real-world emotion-aware speech systems.
Keywords: Speech emotion recognition; adaptive acoustic mixup enhancement; improved coordinate attention; shuffle attention; attention mechanism; deep learning
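The abstract above reports both weighted and unweighted accuracy. Under the usual convention in speech emotion recognition (assumed here, since the abstract does not define the terms), weighted accuracy is the overall fraction of correct predictions and unweighted accuracy is the mean of per-class recalls. The labels in this Python sketch are invented for illustration.

```python
from collections import defaultdict


def weighted_accuracy(y_true, y_pred):
    """Overall fraction of correctly classified utterances."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recalls, so every emotion class counts equally."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


# Hypothetical imbalanced test set: a classifier that always answers "angry".
y_true = ["angry", "angry", "angry", "sad"]
y_pred = ["angry", "angry", "angry", "angry"]
wa = weighted_accuracy(y_true, y_pred)
ua = unweighted_accuracy(y_true, y_pred)
```

The gap between the two values on this toy set shows why both are reported for imbalanced emotion corpora: a model can score well overall while missing a minority class entirely.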
A Hierarchical Attention Framework for Business Information Systems: Theoretical Foundation and Proof-of-Concept Implementation
7
Authors: Sabina-Cristiana Necula, Napoleon-Alexandru Sireteanu. 《Computers, Materials & Continua》, 2026, Issue 2, pp. 2055-2088 (34 pages)
Modern business information systems face significant challenges in managing heterogeneous data sources, integrating disparate systems, and providing real-time decision support in complex enterprise environments. Contemporary enterprises typically operate 200+ interconnected systems, with research indicating that 52% of organizations manage three or more enterprise content management systems, creating information silos that reduce operational efficiency by up to 35%. While attention mechanisms have demonstrated remarkable success in natural language processing and computer vision, their systematic application to business information systems remains largely unexplored. This paper presents the theoretical foundation for a Hierarchical Attention-Based Business Information System (HABIS) framework that applies multi-level attention mechanisms to enterprise environments. We provide a comprehensive mathematical formulation of the framework, analyze its computational complexity, and present a proof-of-concept implementation with simulation-based validation that demonstrates a 42% reduction in cross-system query latency compared to legacy ERP modules and a 70% improvement in prediction accuracy over baseline methods. The theoretical framework introduces four hierarchical attention levels: system-level attention for dynamic weighting of business systems, process-level attention for business process prioritization, data-level attention for critical information selection, and temporal attention for time-sensitive pattern recognition. Our complexity analysis demonstrates that the framework achieves O(n log n) computational complexity for attention computation, making it scalable to large enterprise environments, including retail supply chains with 200+ system-scale deployments. The proof-of-concept implementation validates the theoretical framework's feasibility with an MSE loss of 0.439 and response times of 0.000120 s per query, demonstrating its potential for addressing key challenges in business information systems. This work establishes a foundation for future empirical research and practical implementation of attention-driven enterprise systems.
Keywords: attention mechanisms; business information systems; theoretical framework; enterprise architecture; complex systems; hierarchical attention
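Short-Term Load Curve Forecasting Method Based on a CNN-BiLSTM-Cross Attention Dynamic Ensemble Model
8
Authors: 杨菁, 李丹, 王佳秋, 张闯. 《电工技术》, 2026, Issue 2, pp. 75-79 (5 pages)
Electricity market reform and rapid economic development have made power generators and supply companies more dependent on accurate short-term load forecasting for effective market operation and profit planning. However, traditional models struggle to extract and represent key features of high-dimensional load curves, such as load characteristics, meteorological conditions, and calendar periodicity, and they perform especially poorly when handling interactions among multiple variables. To address this, a short-term load forecasting model based on CNN-BiLSTM-Cross Attention is proposed to forecast load curves for the next several days. The model uses a CNN to extract local features from load curves, a BiLSTM to capture long-term dependencies, and a cross-attention mechanism to deeply fuse multimodal information such as load characteristics, meteorological features, and holiday effects. Experimental results show that, compared with traditional methods, the proposed model achieves significant improvements in both prediction accuracy and computational efficiency, and performs especially well on dynamic power systems that include renewable energy.
Keywords: short-term load curve forecasting; CNN-BiLSTM-Cross Attention; multimodal information; load characteristics; meteorological features; holiday effects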
SwinHCAD: A Robust Multi-Modality Segmentation Model for Brain Tumors Using Transformer and Channel-Wise Attention
9
Authors: Seyong Jin, Muhammad Fayaz, L. Minh Dang, Hyoung-Kyu Song, Hyeonjoon Moon. 《Computers, Materials & Continua》, 2026, Issue 1, pp. 511-533 (23 pages)
Brain tumors require precise segmentation for diagnosis and treatment plans due to their complex morphology and heterogeneous characteristics. While MRI-based automatic brain tumor segmentation technology reduces the burden on medical staff and provides quantitative information, existing methodologies and recent models still struggle to accurately capture and classify the fine boundaries and diverse morphologies of tumors. In order to address these challenges and maximize the performance of brain tumor segmentation, this research introduces a novel SwinUNETR-based model by integrating a new decoder block, the Hierarchical Channel-wise Attention Decoder (HCAD), into a powerful SwinUNETR encoder. The HCAD decoder block utilizes hierarchical features and channel-specific attention mechanisms to further fuse information at different scales transmitted from the encoder and preserve spatial details throughout the reconstruction phase. Rigorous evaluations on the recent BraTS GLI datasets demonstrate that the proposed SwinHCAD model achieved improved segmentation accuracy on both the Dice score and HD95 metrics across all tumor subregions (WT, TC, and ET) compared to baseline models. In particular, the rationale and contribution of the model design were clarified through ablation studies to verify the effectiveness of the proposed HCAD decoder block. The results of this study are expected to greatly contribute to enhancing the efficiency of clinical diagnosis and treatment planning by increasing the precision of automated brain tumor segmentation.
Keywords: attention mechanism; brain tumor segmentation; channel-wise attention decoder; deep learning; medical imaging; MRI; Transformer; U-Net
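The Dice score used as an evaluation metric in the abstract above can be shown with a short Python sketch on flattened binary masks. The masks are toy values, and returning 1.0 when both masks are empty is a common convention assumed here, not something stated in the abstract. HD95 is not covered by this sketch.

```python
def dice_score(pred, target):
    """Dice coefficient for two flat binary masks (lists of 0/1)."""
    intersection = sum(p * t for p, t in zip(pred, target))
    denom = sum(pred) + sum(target)
    if denom == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / denom


# Hypothetical 4-voxel prediction and ground truth for one tumor subregion.
pred_mask = [1, 1, 0, 0]
true_mask = [1, 0, 1, 0]
dice = dice_score(pred_mask, true_mask)
empty_dice = dice_score([0, 0], [0, 0])
```

In practice the score would be computed separately for each subregion (WT, TC, ET) and averaged over cases.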
Global context-aware multi-scale feature iterative refinement for aviation-road traffic semantic segmentation
10
Authors: Mengyue ZHANG, Shichun YANG, Xinjie FENG, Yaoguang CAO. 《Chinese Journal of Aeronautics》, 2026, Issue 2, pp. 429-441 (13 pages)
Semantic segmentation for mixed scenes of aerial remote sensing and road traffic is one of the key technologies for visual perception of flying cars. The State-of-the-Art (SOTA) semantic segmentation methods have made remarkable achievements in both fine-grained segmentation and real-time performance. However, when faced with the huge differences in scale and semantic categories brought about by the mixed scenes of aerial remote sensing and road traffic, they still face great challenges, and there is little related research. To address this issue, this paper proposes a semantic segmentation model specifically for mixed datasets of aerial remote sensing and road traffic scenes. First, a novel decoding-recoding multi-scale feature iterative refinement structure is proposed, which utilizes the re-integration and continuous enhancement of multi-scale information to effectively deal with the huge scale differences between cross-domain scenes, while using a fully convolutional structure to meet the lightweight and real-time requirements. Second, a well-designed cross-window attention mechanism combined with a global information integration decoding block forms an enhanced global context perception, which can effectively capture the long-range dependencies and multi-scale global context information of different scenes, thereby achieving fine-grained semantic segmentation. The proposed method is tested on a large-scale mixed dataset of aerial remote sensing and road traffic scenes. The results confirm that it can effectively deal with the problem of large scale differences in cross-domain scenes. Its segmentation accuracy surpasses that of the SOTA methods while meeting the real-time requirements.
Keywords: Aviation-road traffic; Flying cars; Global context-aware; Multi-scale feature iterative refinement; Semantic segmentation
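Short-Term Wind Power Output Forecasting Based on Feature-Similar Days and a CNN-BiLSTM-Attention Model
11
Authors: 栾福明, 张衡, 陈海平. 《动力工程学报》, PKU Core, 2026, Issue 2, pp. 80-88 (9 pages)
Short-term wind power forecasting is crucial for real-time dispatch of power systems; reliable forecasts not only safeguard secure system operation but also improve grid operating efficiency. To obtain more accurate and reliable forecasts, and in view of the nonlinear and temporal characteristics of wind power data, a short-term wind power forecasting method combining feature-similar days with CNN-BiLSTM-Attention is proposed. First, taking full account of the influence of meteorological factors on wind power output data, the Spearman correlation coefficient is used to select the meteorological factors most correlated with wind power output as model inputs. Second, a Gaussian mixture model (GMM) is used to cluster the wind power data, the elbow method determines the optimal number of clusters, and feature similarity combined with a cosine similarity entropy method identifies the cluster type in the historical data most relevant to the day to be forecast. Finally, the CNN-BiLSTM-Attention model is trained to mine the temporal features of wind power in depth and obtain more accurate forecasts. A simulation analysis using actual wind power data from the Xinjiang region shows that the method achieves high prediction accuracy and can provide strong support for power system planning and stable operation.
Keywords: wind power; convolutional neural network; bidirectional long short-term memory network; attention mechanism; power system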
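The similar-day idea in the abstract above can be illustrated, in a much simplified form, by picking the historical day whose feature vector has the highest cosine similarity to the day being forecast. This sketch deliberately swaps in plain cosine similarity for the paper's combination of GMM clustering, feature similarity, and cosine similarity entropy, whose details the abstract does not give. The day labels and feature values are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def most_similar_day(target, history):
    """Return the key of the historical day most similar to the target day."""
    return max(history, key=lambda day: cosine_similarity(target, history[day]))


# Hypothetical meteorological feature vectors (e.g. scaled wind speed, temperature).
history = {"day_a": [1.0, 0.0], "day_b": [1.0, 1.0]}
target_day = [2.0, 2.0]
best = most_similar_day(target_day, history)
```

Training data drawn from the selected day, or from its cluster, would then be fed to the forecasting model.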
Context-Aware Relational Learning for Cooperative UAV Formation
12
Authors: Zhuxun Li, Haoxian Jiang, Rui Zhou. 《Journal of Beijing Institute of Technology》, 2026, Issue 1, pp. 44-52 (9 pages)
Robust cooperative unmanned aerial vehicle (UAV) formation in complex 3D environments is hampered by reward sparsity and inefficient collaboration. To address this, we propose context-aware relational agent learning (CORAL), a novel multi-agent deep reinforcement learning framework. CORAL synergistically integrates two modules: (1) a novelty-based intrinsic reward module to drive efficient exploration and (2) an explicit relational learning module that allows agents to predict peer intentions and enhance coordination. Built on a multi-agent Actor-Critic architecture, CORAL enables agents to balance self-interest with group objectives. Comprehensive evaluations in a high-fidelity simulation show that our method significantly outperforms state-of-the-art baselines like multi-agent deep deterministic policy gradient (MADDPG) and monotonic value function factorisation for deep multi-agent reinforcement learning (QMIX) in path planning efficiency, collision avoidance, and scalability.
Keywords: multi-agent reinforcement learning; UAV swarm; cooperative formation control; path planning; context-aware exploration; relational learning
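A novelty-based intrinsic reward, as mentioned in the abstract above, is often illustrated with a count-based bonus that shrinks as a state is revisited. The Python sketch below uses that generic technique as a stand-in: the abstract does not say how CORAL measures novelty, so the `NoveltyBonus` class, the `beta` scale, and the grid-cell state encoding are all hypothetical.

```python
import math
from collections import Counter


class NoveltyBonus:
    """Count-based exploration bonus: beta / sqrt(visit count of the state)."""

    def __init__(self, beta=0.1):
        self.beta = beta
        self.counts = Counter()

    def reward(self, state, extrinsic):
        """Record a visit and return extrinsic reward plus the novelty bonus."""
        self.counts[state] += 1
        return extrinsic + self.beta / math.sqrt(self.counts[state])


# Hypothetical discretized 3D positions visited by one UAV agent.
bonus = NoveltyBonus(beta=1.0)
first_visit = bonus.reward((0, 0, 0), extrinsic=0.0)
second_visit = bonus.reward((0, 0, 0), extrinsic=0.0)
new_cell = bonus.reward((1, 0, 0), extrinsic=0.0)
```

Because the bonus decays with repeated visits, the agent is nudged toward unexplored cells even when the environment's own reward is sparse.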
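Research on Photovoltaic Power Generation Forecasting Based on CNN-BiLSTM-Attention
13
Authors: 朱峻嬉, 郑淑娴, 金典, 孙世康, 冯靖瑶, 陈仕军. 《四川电力技术》, 2026, Issue 1, pp. 14-21, 95 (9 pages)
To address the fluctuating and intermittent nature of photovoltaic (PV) power output, a hybrid forecasting model is proposed that combines a convolutional neural network (CNN) with a bidirectional long short-term memory (BiLSTM) network and an attention mechanism. First, the local outlier factor (LOF) algorithm is used to detect and remove anomalous power data, and a horizontal normalization method is applied to eliminate differences in dimension. Next, the forecasting model is built with the CNN capturing local spatial features and the BiLSTM capturing long-term temporal dependencies. Finally, attention is introduced in the optimization stage to dynamically assign weights to key time steps. To test the model, PV power generation data from a provincial power grid over the past three years are used as a case study. The results show that the proposed CNN-BiLSTM-Attention model has a mean absolute error, root mean square error, and mean relative error of 0.02, 0.04, and 0.06, respectively, achieving high-accuracy PV power forecasting, which is of practical value for optimizing power dispatch and accommodating renewable energy.
Keywords: PV power generation forecasting; data normalization; LOF anomaly detection; CNN-BiLSTM-Attention hybrid model; attention mechanism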
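Two of the error metrics reported in the abstract above, mean absolute error and root mean square error, can be computed as in the following Python sketch. The sample values are invented to show how the two metrics respond differently to a single large error; they are unrelated to the paper's data.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical normalized PV power values: three exact predictions, one large miss.
actual = [1.0, 2.0, 3.0, 4.0]
forecast = [1.0, 2.0, 3.0, 8.0]
mae_value = mae(actual, forecast)
rmse_value = rmse(actual, forecast)
```

RMSE comes out larger than MAE here because squaring penalizes the single large miss more heavily, which is why the two are usually reported together.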
GFL-SAR: Graph Federated Collaborative Learning Framework Based on Structural Amplification and Attention Refinement
14
Authors: Hefei Wang, Ruichun Gu, Jingyu Wang, Xiaolin Zhang, Hui Wei. 《Computers, Materials & Continua》, 2026, Issue 1, pp. 1683-1702 (20 pages)
Graph Federated Learning (GFL) has shown great potential in privacy protection and distributed intelligence through distributed collaborative training of graph-structured data without sharing raw information. However, existing GFL approaches often lack the capability for comprehensive feature extraction and adaptive optimization, particularly in non-independent and identically distributed (non-IID) scenarios where balancing global structural understanding and local node-level detail remains a challenge. To this end, this paper proposes a novel framework called GFL-SAR (Graph Federated Collaborative Learning Framework Based on Structural Amplification and Attention Refinement), which enhances the representation learning capability of graph data through a dual-branch collaborative design. Specifically, we propose the Structural Insight Amplifier (SIA), which utilizes an improved Graph Convolutional Network (GCN) to strengthen structural awareness and improve modeling of topological patterns. In parallel, we propose the Attentive Relational Refiner (ARR), which employs an enhanced Graph Attention Network (GAT) to perform fine-grained modeling of node relationships and neighborhood features, thereby improving the expressiveness of local interactions and preserving critical contextual information. GFL-SAR effectively integrates multi-scale features from every branch via feature fusion and federated optimization, thereby addressing existing GFL limitations in structural modeling and feature representation. Experiments on standard benchmark datasets including Cora, Citeseer, Polblogs, and Cora_ML demonstrate that GFL-SAR achieves superior performance in classification accuracy, convergence speed, and robustness compared to existing methods, confirming its effectiveness and generalizability in GFL tasks.
Keywords: Graph federated learning; GCN; GNNs; attention mechanism
DAUNet: Unsupervised Neural Network Based on Dual Attention for Clock Synchronization in Multi-Agent Wireless Ad Hoc Networks
15
Authors: Haihao He, Xianzhou Dong, Shuangshuang Wang, Chengzhang Zhu, Xiaotong Zhao. 《Computers, Materials & Continua》, 2026, Issue 1, pp. 847-869 (23 pages)
Clock synchronization has important applications in multi-agent collaboration (such as drone light shows, intelligent transportation systems, and game AI), group decision-making, and emergency rescue operations. Synchronization methods based on pulse-coupled oscillators (PCOs) provide an effective solution for clock synchronization in wireless networks. However, the existing clock synchronization algorithms in multi-agent ad hoc networks struggle to meet the requirements of high precision and high stability of the synchronization clock in group cooperation. Hence, this paper constructs a network model, named DAUNet (unsupervised neural network based on dual attention), to enhance clock synchronization accuracy in multi-agent wireless ad hoc networks. Specifically, we design an unsupervised distributed neural network framework as the backbone, building upon classical PCO-based synchronization methods. This framework resolves issues such as prolonged time synchronization message exchange between nodes, difficulties in centralized node coordination, and challenges in distributed training. Furthermore, we introduce a dual-attention mechanism as the core module of DAUNet. By integrating a Multi-Head Attention module and a Gated Attention module, the model significantly improves information extraction capabilities while reducing computational complexity, effectively mitigating synchronization inaccuracies and instability in multi-agent ad hoc networks. To evaluate the effectiveness of the proposed model, comparative experiments and ablation studies were conducted against classical methods and existing deep learning models. The research results show that, compared with the deep learning networks based on DASA and LSTM, DAUNet can reduce the mean normalized phase difference (NPD) by 1 to 2 orders of magnitude. Compared with the attention models based on additive attention and self-attention mechanisms, the performance of DAUNet has improved by more than ten times. This study demonstrates DAUNet's potential in advancing multi-agent ad hoc networking technologies.
Keywords: Clock synchronization; deep learning; dual attention mechanism; pulse-coupled oscillator
Enhanced BEV Scene Segmentation: De-Noise Channel Attention for Resource-Constrained Environments
16
Authors: Argho Dey, Yunfei Yin, Zheng Yuan, Zhiwen Zeng, Xianjian Bao, Md Minhazul Islam. 《Computers, Materials & Continua》, 2026, Issue 4, pp. 2161-2180 (20 pages)
Autonomous vehicles rely heavily on accurate and efficient scene segmentation for safe navigation and efficient operations. Traditional Bird's Eye View (BEV) methods for semantic scene segmentation, which leverage multimodal sensor fusion, often struggle with noisy data and demand high-performance GPUs, leading to sensor misalignment and performance degradation. This paper introduces Enhanced Channel Attention BEV (ECABEV), a novel approach designed to address these challenges under insufficient GPU memory conditions. ECABEV integrates camera and radar data through a de-noise enhanced channel attention mechanism, which utilizes global average and max pooling to effectively filter out noise while preserving discriminative features. Furthermore, an improved fusion approach is proposed to efficiently merge categorical data across modalities. To reduce computational overhead, a bilinear interpolation layer normalization method is devised to ensure spatial feature fidelity. Moreover, a scalable cross-entropy loss function is designed to handle imbalanced classes with little sacrifice in computational efficiency. Extensive experiments on the nuScenes dataset demonstrate that ECABEV achieves state-of-the-art performance with an IoU of 39.961, using a lightweight ViT-B/14 backbone and lower resolution (224×224). Our approach highlights its cost-effectiveness and practical applicability, even on low-end devices. The code is publicly available at: https://github.com/YYF-CQU/ECABEV.git.
Keywords: Autonomous vehicle; BEV; attention mechanism; sensor fusion; scene segmentation
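The abstract above describes a channel attention mechanism built on global average and max pooling. The Python sketch below shows only the general pattern: pool each channel both ways, combine the two statistics, squash them into a weight, and rescale the channel. It omits any learned layers and the paper's de-noising design, which the abstract does not detail. Summing the two pooled values before a sigmoid is an assumption made for brevity, and the feature map is a toy example.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feature_map):
    """Weight each channel by sigmoid(global average + global max) of its activations.

    feature_map: list of channels, each a flat list of spatial activations.
    """
    weights = []
    for channel in feature_map:
        avg_pool = sum(channel) / len(channel)
        max_pool = max(channel)
        weights.append(sigmoid(avg_pool + max_pool))
    scaled = [[w * v for v in channel] for w, channel in zip(weights, feature_map)]
    return weights, scaled


# Hypothetical 2-channel map: one silent channel, one strongly activated channel.
feature_map = [[0.0, 0.0, 0.0, 0.0], [1.0, 3.0, 1.0, 3.0]]
weights, scaled = channel_attention(feature_map)
```

A learned version would pass the pooled statistics through small trainable layers before the sigmoid, so that the network decides which channels to emphasize.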
An Attention-Based 6D Pose Estimation Network for Weakly Textured Industrial Parts
17
Authors: Song Xu, Liang Xuan, Yifeng Li, Qiang Zhang. 《Computers, Materials & Continua》, 2026, Issue 2, pp. 2148-2166 (19 pages)
The 6D pose estimation of objects is of great significance for the intelligent assembly and sorting of industrial parts. In industrial robot production scenarios, the 6D pose estimation of industrial parts mainly faces two challenges: one is the loss of information and interference caused by occlusion and stacking in the sorting scenario; the other is the difficulty of feature extraction due to the weak texture of industrial parts. To address these problems, this paper proposes an attention-based pixel-level voting network for 6D pose estimation of weakly textured industrial parts, namely CB-PVNet. On the one hand, the voting scheme can predict the keypoints of affected pixels, which improves the accuracy of keypoint localization even in scenarios such as weak texture and partial occlusion. On the other hand, the attention mechanism can extract features of interest from the object while suppressing useless features of the surroundings. Extensive comparative experiments were conducted on both public datasets (including the LINEMOD, Occlusion LINEMOD, and T-LESS datasets) and self-made datasets. The experimental results indicate that the proposed network CB-PVNet can achieve ADD(-S) accuracy comparable to the state of the art using only RGB images while ensuring real-time performance. Additionally, we conducted robot grasping experiments in the real world. The balance between accuracy and computational efficiency makes the method well-suited for applications in industrial automation.
Keywords: Industrial robots; pose estimation; industrial parts; attention mechanism; weak texture
BAID: A Lightweight Super-Resolution Network with Binary Attention-Guided Frequency-Aware Information Distillation
18
Authors: Jiajia Liu, Junyi Lin, Wenxiang Dong, Xuan Zhao, Jianhua Liu, Huiru Li. 《Computers, Materials & Continua》, 2026, Issue 2, pp. 1190-1208 (19 pages)
Single Image Super-Resolution (SISR) seeks to reconstruct high-resolution (HR) images from low-resolution (LR) inputs, thereby enhancing visual fidelity and the perception of fine details. While Transformer-based models (such as SwinIR, Restormer, and HAT) have recently achieved impressive results in super-resolution tasks by capturing global contextual information, these methods often suffer from substantial computational and memory overhead, which limits their deployment on resource-constrained edge devices. To address these challenges, we propose a novel lightweight super-resolution network, termed Binary Attention-Guided Information Distillation (BAID), which integrates frequency-aware modeling with a binary attention mechanism to significantly reduce computational complexity and parameter count while maintaining strong reconstruction performance. The network combines a high-low frequency decoupling strategy with a local-global attention sharing mechanism, enabling efficient compression of redundant computations through binary attention guidance. At the core of the architecture lies the Attention-Guided Distillation Block (AGDB), which retains the strengths of the information distillation framework while introducing a sparse binary attention module to enhance both inference efficiency and feature representation. Extensive ×4 super-resolution experiments on four standard benchmarks (Set5, Set14, BSD100, and Urban100) demonstrate that BAID achieves Peak Signal-to-Noise Ratio (PSNR) values of 32.13, 28.51, 27.47, and 26.15, respectively, with only 1.22 million parameters and 26.1 G Floating-Point Operations (FLOPs), outperforming other state-of-the-art lightweight methods such as Information Multi-Distillation Network (IMDN) and Residual Feature Distillation Network (RFDN). These results highlight the proposed model's ability to deliver high-quality image reconstruction while offering strong deployment efficiency, making it well-suited for image restoration tasks in resource-limited environments.
Keywords: Single image super-resolution; lightweight network; binary attention; information distillation
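The PSNR figures quoted in the abstract above follow the standard definition, 10·log10(MAX² / MSE), which the Python sketch below computes for flattened pixel lists. The pixel values are made up, and the sketch assumes 8-bit images with a peak value of 255; benchmark protocols such as which color channel is evaluated or how borders are cropped are not covered.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two flat pixel lists."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 4-pixel patch where every pixel is off by 25.5 intensity levels.
reference = [100.0, 100.0, 100.0, 100.0]
reconstructed = [125.5, 125.5, 125.5, 125.5]
psnr_db = psnr(reference, reconstructed)
perfect = psnr(reference, reference)
```

Since PSNR is logarithmic, a gain of a few tenths of a dB between lightweight models corresponds to a small but consistent reduction in mean squared error.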
Semantic-Guided Stereo Matching Network Based on Parallax Attention Mechanism and SegFormer
19
Authors: Zeyuan Chen, Yafei Xie, Jinkun Li, Song Wang, Yingqiang Ding. 《Computers, Materials & Continua》, 2026, Issue 4, pp. 1322-1340 (19 pages)
Stereo matching is a pivotal task in computer vision, enabling precise depth estimation from stereo image pairs, yet it encounters challenges in regions with reflections, repetitive textures, or fine structures. In this paper, we propose a Semantic-Guided Parallax Attention Stereo Matching Network (SGPASMnet) that can be trained in an unsupervised manner, building upon the Parallax Attention Stereo Matching Network (PASMnet). Our approach leverages unsupervised learning to address the scarcity of ground truth disparity in stereo matching datasets, facilitating robust training across diverse scene-specific datasets and enhancing generalization. SGPASMnet incorporates two novel components: a Cross-Scale Feature Interaction (CSFI) block and semantic feature augmentation using a pre-trained semantic segmentation model, SegFormer, seamlessly embedded into the parallax attention mechanism. The CSFI block enables effective fusion of multi-scale features, integrating coarse and fine details to enhance disparity estimation accuracy. Semantic features, extracted by SegFormer, enrich the parallax attention mechanism by providing high-level scene context, significantly improving performance in ambiguous regions. Our model unifies these enhancements within a cohesive architecture, comprising semantic feature extraction, an hourglass network, a semantic-guided cascaded parallax attention module, an output module, and a disparity refinement network. Evaluations on the KITTI2015 dataset demonstrate that our unsupervised method achieves a lower error rate compared to the original PASMnet, highlighting the effectiveness of our enhancements in handling complex scenes. By harnessing unsupervised learning without requiring ground truth disparity, SGPASMnet offers a scalable and robust solution for accurate stereo matching, with superior generalization across varied real-world applications.
Keywords: Stereo matching; parallax attention; unsupervised learning; convolutional neural network; stereo correspondence
YOLO-SPDNet: Multi-Scale Sequence and Attention-Based Tomato Leaf Disease Detection Model
20
Authors: Meng Wang, Jinghan Cai, Wenzheng Liu, Xue Yang, Jingjing Zhang, Qiangmin Zhou, Fanzhen Wang, Hang Zhang, Tonghai Liu. 《Phyton-International Journal of Experimental Botany》, 2026, Issue 1, pp. 290-308 (19 pages)
Tomato is a major economic crop worldwide, and diseases on tomato leaves can significantly reduce both yield and quality. Traditional manual inspection is inefficient and highly subjective, making it difficult to meet the requirements of early disease identification in complex natural environments. To address this issue, this study proposes an improved YOLO11-based model, YOLO-SPDNet (Scale Sequence Fusion, Position-Channel Attention, and Dual Enhancement Network). The model integrates the SEAM (Self-Ensembling Attention Mechanism) semantic enhancement module, the MLCA (Mixed Local Channel Attention) lightweight attention mechanism, and the SPA (Scale-Position-Detail Awareness) module composed of SSFF (Scale Sequence Feature Fusion), TFE (Triple Feature Encoding), and CPAM (Channel and Position Attention Mechanism). These enhancements strengthen fine-grained lesion detection while keeping the model lightweight. Experimental results show that YOLO-SPDNet achieves an accuracy of 91.8%, a recall of 86.5%, and an mAP@0.5 of 90.6% on the test set, with a computational complexity of 12.5 GFLOPs. Furthermore, the model reaches a real-time inference speed of 987 FPS, making it suitable for deployment on mobile agricultural terminals and online monitoring systems. Comparative analysis and ablation studies further validate the reliability and practical applicability of the proposed model in complex natural scenes.
Keywords: Tomato disease detection; YOLO; multi-scale feature fusion; attention mechanism; lightweight model