373 articles found
Joint Feature Encoding and Task Alignment Mechanism for Emotion-Cause Pair Extraction
1
Authors: Shi Li, Didi Sun. Source: Computers, Materials & Continua (SCIE, EI), 2025, No. 1, pp. 1069-1086, 18 pages.
With the rapid expansion of social media, analyzing emotions and their causes in texts has gained significant importance. Emotion-cause pair extraction enables the identification of causal relationships between emotions and their triggers within a text, facilitating a deeper understanding of expressed sentiments and their underlying reasons. This comprehension is crucial for making informed strategic decisions in various business and societal contexts. However, recent approaches that employ multi-task learning frameworks often face challenges such as the inability to simultaneously model extracted features and their interactions, or inconsistencies in label prediction between emotion-cause pair extraction and independent auxiliary tasks such as emotion extraction and cause extraction. To address these issues, this study proposes an emotion-cause pair extraction methodology that incorporates joint feature encoding and task alignment mechanisms. The model consists of two primary components. First, joint feature encoding simultaneously generates features for emotion-cause pairs and clauses, enhancing feature interactions between emotion clauses, cause clauses, and emotion-cause pairs. Second, the task alignment technique is applied to reduce the labeling distance between emotion-cause pair extraction and the two auxiliary tasks, capturing deep semantic information interactions among tasks. The proposed method is evaluated on a Chinese benchmark corpus using 10-fold cross-validation, assessing key performance metrics such as precision, recall, and F1 score. Experimental results demonstrate that the model achieves an F1 score of 76.05%, surpassing the state of the art by 1.03%. The proposed model exhibits significant improvements in emotion-cause pair extraction (ECPE) and cause extraction (CE) compared to existing methods, validating its effectiveness. This research introduces a novel approach based on joint feature encoding and task alignment mechanisms, contributing to advancements in emotion-cause pair extraction. However, the study's limitation lies in the data sources, potentially restricting the generalizability of the findings.
Keywords: emotion-cause pair extraction; interactive information enhancement; joint feature encoding; label consistency; task alignment mechanisms
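The precision, recall, and F1 metrics named in this abstract can be illustrated with a minimal sketch. It assumes that predictions and gold annotations are sets of (emotion clause index, cause clause index) pairs; the function name `pair_prf` and the sample data are hypothetical and are not taken from the paper.

```python
def pair_prf(predicted, gold):
    """Precision, recall, and F1 over sets of (emotion_clause, cause_clause) pairs."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)  # pairs that match a gold pair exactly
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical document: two gold pairs, four predicted pairs.
gold_pairs = [(3, 2), (5, 5)]
predicted_pairs = [(3, 2), (4, 1), (5, 5), (6, 5)]
p, r, f1 = pair_prf(predicted_pairs, gold_pairs)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

A pair counts as correct only when both clause indices match, which is why over-predicting cause clauses lowers precision even when recall is perfect.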
Self-FAGCFN:Graph-Convolution Fusion Network Based on Feature Fusion and Self-Supervised Feature Alignment for Pneumonia and Tuberculosis Diagnosis
2
Authors: Junding Sun, Wenhao Tang, Lei Zhao, Chaosheng Tang, Xiaosheng Wu, Zhaozhao Xu, Bin Pu, Yudong Zhang. Source: Journal of Bionic Engineering, 2025, No. 4, pp. 2012-2029, 18 pages.
Feature fusion is an important technique in medical image classification that can improve diagnostic accuracy by integrating complementary information from multiple sources. Recently, Deep Learning (DL) has been widely used in the diagnosis of pulmonary diseases such as pneumonia and tuberculosis. However, traditional feature fusion methods often suffer from feature disparity, information loss, redundancy, and increased complexity, hindering the further extension of DL algorithms. To solve this problem, we propose a Graph-Convolution Fusion Network with Self-Supervised Feature Alignment (Self-FAGCFN) for deep learning-based medical image classification of respiratory diseases such as pneumonia and tuberculosis. The network integrates Convolutional Neural Networks (CNNs) for robust feature extraction from two-dimensional grid structures and Graph Convolutional Networks (GCNs) within a Graph Neural Network branch to capture features based on graph structure, focusing on significant node representations. Additionally, an Attention-Embedding Ensemble Block is included to capture critical features from GCN outputs. To ensure effective feature alignment between pre- and post-fusion stages, we introduce a feature alignment loss that minimizes disparities. Moreover, to address the limitations of the proposed methods, such as inappropriate centroid discrepancies during feature alignment and class imbalance in the dataset, we develop a Feature-Centroid Fusion (FCF) strategy and a Multi-Level Feature-Centroid Update (MLFCU) algorithm, respectively. Extensive experiments on the public datasets LungVision and Chest-Xray demonstrate that the Self-FAGCFN model significantly outperforms existing methods in diagnosing pneumonia and tuberculosis, highlighting its potential for practical medical applications.
Keywords: feature fusion; self-supervised feature alignment; convolutional neural networks; graph convolutional networks; class imbalance; feature-centroid fusion
Hierarchical Optimization Method for Federated Learning with Feature Alignment and Decision Fusion
3
Authors: Ke Li, Xiaofeng Wang, Hu Wang. Source: Computers, Materials & Continua (SCIE, EI), 2024, No. 10, pp. 1391-1407, 17 pages.
In the realm of data privacy protection, federated learning aims to collaboratively train a global model. However, heterogeneous data between clients presents challenges, often resulting in slow convergence and inadequate accuracy of the global model. Utilizing shared feature representations alongside customized classifiers for individual clients emerges as a promising personalized solution. Nonetheless, previous research has frequently neglected the integration of global knowledge into local representation learning and the synergy between global and local classifiers, thereby limiting model performance. To tackle these issues, this study proposes a hierarchical optimization method for federated learning with feature alignment and the fusion of classification decisions (FedFCD). FedFCD regularizes the relationship between global and local feature representations to achieve alignment and incorporates decision information from the global classifier, facilitating the late fusion of decision outputs from both global and local classifiers. Additionally, FedFCD employs a hierarchical optimization strategy to flexibly optimize model parameters. Through experiments on the Fashion-MNIST, CIFAR-10 and CIFAR-100 datasets, we demonstrate the effectiveness and superiority of FedFCD. For instance, on the CIFAR-100 dataset, FedFCD improved average test accuracy by 6.83% compared to four outstanding personalized federated learning approaches. Furthermore, extended experiments confirm the robustness of FedFCD across various hyperparameter values.
Keywords: federated learning; data heterogeneity; feature alignment; decision fusion; hierarchical optimization
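The "late fusion of decision outputs" mentioned in this abstract can be pictured with a small sketch. The abstract does not give the fusion rule, so the weighted average of softmax probabilities below is purely an assumption, and the names `softmax`, `late_fusion`, and `weight` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(global_logits, local_logits, weight=0.5):
    """Assumed rule: convex combination of the two classifiers' probabilities."""
    g = softmax(global_logits)
    l = softmax(local_logits)
    return [weight * a + (1.0 - weight) * b for a, b in zip(g, l)]


# Hypothetical logits for a three-class problem.
fused = late_fusion([2.0, 0.0, 0.0], [0.0, 1.0, 0.0], weight=0.5)
predicted_class = max(range(len(fused)), key=fused.__getitem__)
print(fused, predicted_class)
```

Because each input is a probability distribution and the weights sum to one, the fused output is also a valid distribution.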
Feature pyramid attention network for audio-visual scene classification (cited by: 1)
4
Authors: Liguang Zhou, Yuhongze Zhou, Xiaonan Qi, Junjie Hu, Tin Lun Lam, Yangsheng Xu. Source: CAAI Transactions on Intelligence Technology, 2025, No. 2, pp. 359-374, 16 pages.
Audio-visual scene classification (AVSC) poses a formidable challenge owing to the intricate spatial-temporal relationships exhibited by audio-visual signals, coupled with the complex spatial patterns of objects and textures found in visual images. Recent studies have focused predominantly on extracting features from diverse neural network structures, neglecting the acquisition of semantically meaningful regions and crucial components within audio-visual data. The authors present a feature pyramid attention network (FPANet) for audio-visual scene understanding, which extracts semantically significant characteristics from audio-visual data. The authors' approach builds multi-scale hierarchical features of sound spectrograms and visual images using a feature pyramid representation and localises the semantically relevant regions with a feature pyramid attention module (FPAM). A dimension alignment (DA) strategy is employed to align feature maps from multiple layers, a pyramid spatial attention (PSA) to spatially locate essential regions, and a pyramid channel attention (PCA) to pinpoint significant temporal frames. Experiments on visual scene classification (VSC), audio scene classification (ASC), and AVSC tasks demonstrate that FPANet achieves performance on par with state-of-the-art (SOTA) approaches, with a 95.9 F1-score on the ADVANCE dataset and a relative improvement of 28.8%. Visualisation results show that FPANet can prioritise semantically meaningful areas in audio-visual signals.
Keywords: dimension alignment; feature pyramid attention network; pyramid channel attention; pyramid spatial attention; semantically relevant regions
VTAN: A Novel Video Transformer Attention-Based Network for Dynamic Sign Language Recognition
5
Authors: Ziyang Deng, Weidong Min, Qing Han, Mengxue Liu, Longfei Li. Source: Computers, Materials & Continua, 2025, No. 2, pp. 2793-2812, 20 pages.
Dynamic sign language recognition holds significant importance, particularly with the application of deep learning to address its complexity. However, existing methods face several challenges. Firstly, recognizing dynamic sign language requires identifying keyframes that best represent the signs, and missing these keyframes reduces accuracy. Secondly, some methods do not focus enough on hand regions, which are small within the overall frame, leading to information loss. To address these challenges, we propose a novel Video Transformer Attention-based Network (VTAN) for dynamic sign language recognition. Our approach prioritizes informative frames and hand regions effectively. To tackle the first issue, we designed a keyframe extraction module enhanced by a convolutional autoencoder, which focuses on selecting information-rich frames and eliminating redundant ones from the video sequences. For the second issue, we developed a soft attention-based transformer module that emphasizes extracting features from hand regions, ensuring that the network pays more attention to hand information within sequences. This dual-focus approach improves dynamic sign language recognition by addressing the key challenges of identifying critical frames and emphasizing hand regions. Experimental results on two public benchmark datasets demonstrate the effectiveness of our network, which outperforms most typical methods in sign language recognition tasks.
Keywords: dynamic sign language recognition; transformer; soft attention; attention-based visual feature aggregation
Multi-Modal Pre-Synergistic Fusion Entity Alignment Based on Mutual Information Strategy Optimization
6
Authors: Huayu Li, Xinxin Chen, Lizhuang Tan, Konstantin I. Kostromitin, Athanasios V. Vasilakos, Peiying Zhang. Source: Computers, Materials & Continua, 2025, No. 11, pp. 4133-4153, 21 pages.
To address the challenge of missing modal information in entity alignment and to mitigate information loss or bias arising from modal heterogeneity during fusion, while also capturing shared information across modalities, this paper proposes a Multi-modal Pre-synergistic Entity Alignment model based on Cross-modal Mutual Information Strategy Optimization (MPSEA). The model first employs independent encoders to process multi-modal features, including text, images, and numerical values. Next, a multi-modal pre-synergistic fusion mechanism integrates graph structural and visual modal features into the textual modality as preparatory information. This pre-fusion strategy enables unified perception of heterogeneous modalities at the model's initial stage, reducing discrepancies during the fusion process. Finally, using cross-modal deep perception reinforcement learning, the model achieves adaptive multilevel feature fusion between modalities, supporting the learning of more effective alignment strategies. Extensive experiments on multiple public datasets show that the MPSEA method achieves gains of up to 7% in Hits@1 and 8.2% in MRR on the FBDB15K dataset, and up to 9.1% in Hits@1 and 7.7% in MRR on the FBYG15K dataset, compared to existing state-of-the-art methods. These results confirm the effectiveness of the proposed model.
Keywords: knowledge graph; multi-modal; entity alignment; feature fusion; pre-synergistic fusion
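Hits@1 and MRR, the two metrics reported in this abstract, are standard ranking measures. The sketch below shows how they are usually computed from the 1-based rank of the correct counterpart entity for each query; the rank list is hypothetical sample data, not results from the paper.

```python
def hits_at_k(ranks, k):
    """Fraction of queries whose correct entity is ranked within the top k."""
    return sum(1 for rank in ranks if rank <= k) / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all queries (ranks are 1-based)."""
    return sum(1.0 / rank for rank in ranks) / len(ranks)


# Hypothetical ranks of the true aligned entity for five source entities.
sample_ranks = [1, 3, 1, 2, 10]
print("Hits@1:", hits_at_k(sample_ranks, 1))
print("MRR:", mean_reciprocal_rank(sample_ranks))
```

Hits@1 rewards only exact top-1 matches, while MRR also gives partial credit to near misses, so the two can move by different amounts on the same dataset.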
A Dual Stream Multimodal Alignment and Fusion Network for Classifying Short Videos
7
Authors: ZHOU Ming, WANG Tong. Source: Journal of Donghua University (English Edition), 2025, No. 1, pp. 88-95, 8 pages.
Video classification is an important task in video understanding and plays a pivotal role in intelligent monitoring of information content. Most existing methods do not consider the multimodal nature of video, and their modality fusion tends to be too simple, often neglecting modality alignment before fusion. This research introduces a novel dual stream multimodal alignment and fusion network named DMAFNet for classifying short videos. The network uses two unimodal encoder modules to extract features within modalities and a multimodal encoder module to learn the interaction between modalities. To solve the modality alignment problem, contrastive learning is introduced between the two unimodal encoder modules. Additionally, masked language modeling (MLM) and video text matching (VTM) auxiliary tasks are introduced to improve the interaction between the video frame and text modalities through backpropagation of their loss functions. Diverse experiments demonstrate the efficiency of DMAFNet in multimodal video classification tasks. Compared with two other mainstream baselines, DMAFNet achieves the best results on the 2022 WeChat Big Data Challenge dataset.
Keywords: video classification; multimodal fusion; feature alignment
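Contrastive learning between two unimodal encoders, as described in this abstract, is commonly implemented with a symmetric InfoNCE loss over matched pairs. The abstract does not state which contrastive objective DMAFNet uses, so the sketch below is an assumption; `info_nce`, the temperature value, and the toy embeddings are all hypothetical.

```python
import math


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _normalize(v):
    norm = math.sqrt(_dot(v, v))  # assumes a non-zero vector
    return [x / norm for x in v]


def _cross_entropy_on_diagonal(rows):
    """Mean of -log softmax(row)[i], treating entry i of row i as the positive."""
    total = 0.0
    for i, row in enumerate(rows):
        peak = max(row)
        log_z = peak + math.log(sum(math.exp(s - peak) for s in row))
        total += log_z - row[i]
    return total / len(rows)


def info_nce(video_embs, text_embs, temperature=0.07):
    """Symmetric InfoNCE: video_embs[i] and text_embs[i] form a matched pair."""
    v = [_normalize(x) for x in video_embs]
    t = [_normalize(x) for x in text_embs]
    n = len(v)
    sim = [[_dot(v[i], t[j]) / temperature for j in range(n)] for i in range(n)]
    sim_t = [[sim[i][j] for i in range(n)] for j in range(n)]  # text-to-video view
    return 0.5 * (_cross_entropy_on_diagonal(sim) + _cross_entropy_on_diagonal(sim_t))


# Toy embeddings: the aligned batch should give a much lower loss.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned, shuffled)
```

Minimizing this loss pulls matched video and text embeddings together and pushes mismatched ones apart, which is the alignment-before-fusion effect the abstract refers to.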
Feature Extraction of Kernel Regress Reconstruction for Fault Diagnosis Based on Self-organizing Manifold Learning (cited by: 3)
8
Authors: CHEN Xiaoguang, LIANG Lin, XU Guanghua, LIU Dan. Source: Chinese Journal of Mechanical Engineering (SCIE, EI, CAS, CSCD), 2013, No. 5, pp. 1041-1049, 9 pages.
The feature space extracted from vibration signals with various faults is often nonlinear and high-dimensional. Nonlinear dimensionality reduction methods such as manifold learning are currently available for extracting low-dimensional embeddings. However, these methods all rely on manual intervention, which leads to shortcomings in stability and in suppressing disturbance noise. To extract features automatically, a manifold learning method with self-organizing mapping is introduced for the first time. Under the non-uniform sample distribution reconstructed by the phase space, the expectation maximization (EM) iteration algorithm is used to divide the local neighborhoods adaptively without manual intervention. After that, the local tangent space alignment (LTSA) algorithm is adopted to compress the high-dimensional phase space into a more faithful low-dimensional representation. Finally, the signal is reconstructed by kernel regression. Several typical cases, including the Lorenz system, an engine fault with a piston pin defect, and a bearing fault with an outer-race defect, are analyzed. Compared with LTSA and the continuous wavelet transform, the results show that the background noise is fully suppressed and the entire periodic repetition of impact components is well separated and identified. A new way to automatically and precisely extract the impulsive components from mechanical signals is proposed.
Keywords: feature extraction; manifold learning; self-organizing mapping; kernel regression; local tangent space alignment
Class conditional distribution alignment for domain adaptation (cited by: 2)
9
Authors: Kai CAO, Zhipeng TU, Yang MING. Source: Control Theory and Technology (EI, CSCD), 2020, No. 1, pp. 72-80, 9 pages.
In this paper, we study the problem of domain adaptation, a crucial ingredient of transfer learning with two domains: the source domain with labeled data and the target domain with few or no labels. Domain adaptation aims to extract knowledge from the source domain to improve the performance of the learning task in the target domain. A popular approach to this problem is adversarial training, which is explained by the HΔH-distance theory. However, traditional adversarial network architectures align only the marginal feature distribution in the feature space; alignment of the class-conditional distribution is not guaranteed. Therefore, we propose a novel method based on pseudo labels and the cluster assumption to avoid incorrect class alignment in the feature space. The experiments demonstrate that our framework improves accuracy on typical transfer learning tasks.
Keywords: domain adaptation; distribution alignment; feature cluster
A Power Data Anomaly Detection Model Based on Deep Learning with Adaptive Feature Fusion
10
Authors: Xiu Liu, Liang Gu, Xin Gong, Long An, Xurui Gao, Juying Wu. Source: Computers, Materials & Continua (SCIE, EI), 2024, No. 6, pp. 4045-4061, 17 pages.
With the popularisation of intelligent power, power devices come in different shapes, numbers and specifications. As a result, power data exhibits distributional variability and the model learning process cannot extract data features sufficiently, which seriously affects the accuracy and performance of anomaly detection. Therefore, this paper proposes a deep learning-based anomaly detection model for power data, which integrates a data alignment enhancement technique based on random sampling and an adaptive feature fusion method leveraging dimension reduction. To address the distributional variability of power data, this paper develops a sliding window-based data adjustment method for the model, which solves the problems of high-dimensional feature noise and low-dimensional missing data. To address the problem of insufficient feature fusion, an adaptive feature fusion method based on feature dimension reduction and dictionary learning is proposed to improve the model's anomaly detection accuracy. To verify the effectiveness of the proposed method, we conducted comparisons through ablation experiments. The experimental results show that, compared with traditional anomaly detection methods, the proposed method not only has an advantage in model accuracy but also reduces the amount of parameter computation in the feature matching process and improves detection speed.
Keywords: data alignment; dimension reduction; feature fusion; data anomaly detection; deep learning
Multimodal Gait Recognition Based on SMPL Modality Decomposition and Embedding Fusion
11
Authors: 吴越, 梁铮, 高巍, 杨茂达, 赵培森, 邓红霞, 常媛媛. Source: 《浙江大学学报(工学版)》 (PKU Core), 2026, No. 1, pp. 52-60, 9 pages.
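To address the limited recognition performance in real-world scenes caused by insufficient mining of gait information and inadequate cross-modal feature alignment in existing gait recognition research, a multimodal gait recognition method based on skinned multi-person linear (SMPL) modality decomposition and embedding fusion is proposed. The SMPL model is decomposed into a shape branch and a pose branch to comprehensively extract static body-shape features and dynamic motion features. An adaptive frame-joint attention module is constructed to adaptively focus on key frames and important joints, strengthening the representation of pose features. A modality embedding fusion module is designed to project features of different modalities into a unified semantic space, and a modality consistency loss function is constructed to optimize cross-modal feature alignment and improve fusion. Experimental results on the Gait3D dataset show that, compared with six silhouette-based methods, two skeleton-based methods, and five multimodal methods based on silhouettes combined with skeletons or SMPL models, the proposed method achieves a Rank-1 accuracy of 70.4% and shows higher robustness in complex real-world scenes, verifying its effectiveness in modality feature extraction and cross-modal feature alignment.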
Keywords: gait recognition; SMPL model; adaptive attention; feature alignment; modality fusion
Protein Residue Contact Prediction Based on Deep Learning and Massive Statistical Features from Multi-Sequence Alignment
12
Authors: Huiling Zhang, Min Hao, Hao Wu, Hing-Fung Ting, Yihong Tang, Wenhui Xi, Yanjie Wei. Source: Tsinghua Science and Technology (SCIE, EI, CAS, CSCD), 2022, No. 5, pp. 843-854, 12 pages.
Sequence-based protein tertiary structure prediction is of fundamental importance because the function of a protein ultimately depends on its 3D structure. An accurate residue-residue contact map is one of the essential elements of current ab initio protocols for 3D structure prediction. Recently, with the combination of deep learning and direct coupling techniques, residue contact prediction has made significant progress. However, a considerable number of current Deep-Learning (DL)-based prediction methods are time-consuming, mainly because they rely on different categories of data types and third-party programs. In this research, we transformed the complex biological problem into a pure computational problem through statistics and artificial intelligence. We accordingly propose a feature extraction method that obtains various categories of statistical information from the multi-sequence alignment alone, followed by training a DL model for residue-residue contact prediction on this massive statistical information. The proposed method is robust across different test sets, shows high reliability in its model confidence score, attains high computational efficiency, and achieves prediction precision comparable to DL methods that rely on multi-source inputs.
Keywords: multi-sequence alignment; residue-residue contact prediction; feature extraction; statistical information; Deep Learning (DL); high computational efficiency
A Guidance-Alignment Emotion Reasoning Method for Multimodal Data
13
Authors: 张艳, 夏雨琪, 丁凯, 刘阳炀, 王年. Source: 《东南大学学报(自然科学版)》 (PKU Core), 2025, No. 2, pp. 585-592, 8 pages.
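Micro-level emotional cues such as facial expressions and voice must be captured through close-range interaction. To use posture information, which spans a large spatial scale and is easy to acquire, as a carrier of emotional expression, an emotion reasoning method based on guidance and alignment modules is proposed. The guidance module uses facial keypoints to guide the extraction of posture features through a two-stage screening of frame images: frames containing both facial keypoints and human posture are extracted first, and a Euclidean metric computed for each frame is then used to retain the posture frames that meet the requirements, so that facial features guide the extraction of posture features. The posture alignment module is implemented through logarithmic normalization of features. Posture features, facial features, and environmental features together form the visual features, which are fused with text features and speech features in a multimodal feature fusion. Experimental results show that the method achieves a Micro-F1 of 48.86% on the MEmoR dataset, improving multimodal emotion reasoning ability to some extent.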
Keywords: emotion reasoning; multimodal; logarithmic feature alignment; guided features
Effects of Expressway Longitudinal Slope and Driving Experience on Drivers' Physiological and Psychological Responses
14
Authors: 朱顺应, 周知星, 吴景安, 陈秋成, 王红, 杨辰宇. Source: 《重庆理工大学学报(自然科学)》 (PKU Core), 2025, No. 6, pp. 238-244, 7 pages.
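To study the effects of expressway longitudinal slope and driving experience on drivers' physiological and psychological responses, drivers were divided by years of driving into an inexperienced group and an experienced group, and data were collected in real-vehicle tests on sections of the 鄂咸 Expressway with different gradients (0, -1%, -1.5%, -2.5%). From the recorded eye-movement parameters and heart-rate data, pupil area, mean fixation duration, and heart-rate growth rate were selected as indicators of drivers' physiological and psychological responses. Two-way analysis of variance was used to identify the main factors affecting these responses, and a trend surface model was used to further explore the relationship between physiological and psychological responses. The results show that section gradient and driving experience have significant effects on the selected indicators (P<0.05); as the gradient increases, pupil area, mean fixation duration, and heart-rate growth rate increase gradually in both groups. At the same gradient, experienced drivers generally show lower values of these indicators than the inexperienced group, with smaller variability. The trend surface fitting results indicate that drivers' physiological and psychological responses are closely related when driving on longitudinal-slope sections.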
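Keywords: traffic engineering; expressway; road alignment; driving experience; physiological characteristics; psychological characteristics; trend surface fitting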
Cross-Domain Recommendation with Simultaneous Feature Alignment and Joint Deep Matrix Factorization
15
Authors: 胡建华, 谢雯, 宋燕, 宇振盛. Source: 《小型微型计算机系统》 (PKU Core), 2025, No. 11, pp. 2617-2624, 8 pages.
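Cross-domain recommendation effectively alleviates the data sparsity and cold-start problems in recommender systems, but it also faces challenges from the heterogeneity of user preferences across domains and from domain discrepancy. How to model user preferences, mine the latent features of each domain, and effectively transfer shared knowledge has therefore become an important topic for improving recommendation quality. For the scenario of partially overlapping users, this paper proposes a deep latent factor cross-domain recommendation model based on feature alignment (DLFCDR), which performs feature alignment and joint matrix factorization simultaneously. The model uses a block-structured user factor matrix to capture the features of overlapping and non-overlapping users; it also subdivides the item feature space from a category-subcategory hierarchical perspective to learn deep feature representations of items. Domain adaptation is achieved by mapping and aligning the features of each item layer in the source and target domains. In addition, the model uses collaborative filtering in the form of joint matrix factorization to share knowledge. An adaptive alternating projected gradient algorithm is used to update the variables, and experiments on three tasks are conducted on real-world datasets. The results show that the new model improves on the comparison models by at least 7.46%, verifying its effectiveness.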
Keywords: cross-domain recommendation; domain adaptation; partially overlapping users; latent factors; feature alignment
Traffic Sign Detection Based on LiteTS-YOLO (cited by: 1)
16
Authors: 李冰, 朱孝峰, 管嘉俊, 王艳芳. Source: 《自动化与仪表》, 2025, No. 1, pp. 82-89, 94, 9 pages.
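To address the low detection accuracy, high missed- and false-detection rates, and large size of traditional models in traffic sign detection, the LiteTS-YOLO algorithm is proposed. A C2f_FA module is built that combines FasterNet to reduce parameter count and computational complexity and introduces an efficient multi-scale attention (EMA) mechanism to preserve small-object features. The feature extraction and fusion network is redesigned, and the detection-layer architecture is optimized to reduce parameters and strengthen information integration. A SAPD Head detection head is designed that integrates advanced task decomposition and a dynamic alignment mechanism, effectively lowering false and missed detections while further reducing parameters. Experimental results show that, on a self-built TTT100K dataset, LiteTS-YOLO improves mAP@0.5 by 7.9%, reduces the parameter count by 66.4%, and reduces model size by 65%, achieving significant gains in both detection accuracy and model compactness.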
Keywords: YOLOv8s; traffic sign detection; dynamic feature alignment; efficient multi-scale attention
Registration Method for Coal Flow Contour Point Clouds of Scraper Conveyors in Fully Mechanized Mining Faces
17
Authors: 汪卫兵, 李开放, 赵栓峰, 王渊, 路正雄, 李赖, 郭帅. Source: 《现代电子技术》 (PKU Core), 2025, No. 16, pp. 81-87, 7 pages.
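The coal flow contour point clouds of scraper conveyors in fully mechanized mining faces contain many noise points and have complex contour structure, and existing point cloud registration algorithms cannot register them quickly and accurately. The traditional iterative closest point registration algorithm is therefore improved. Principal component analysis is introduced to perform an initial axial alignment of the point clouds to be registered; the scale-invariant feature transform algorithm is used to extract their feature points, and fast point feature histograms are constructed, which ensures that the principal axes of the two point clouds are not reversed and improves the efficiency of coarse registration. A random sample consensus initial registration algorithm searches for corresponding point pairs and computes an initial rigid transformation matrix, achieving a preliminary registration of the two point clouds and providing a good starting position for the subsequent fine registration. On the basis of this coarse registration, a K-D tree data structure accelerates the search for corresponding points, and a point-to-plane minimum distance method improves the accuracy of the correspondences. The random sample consensus algorithm iteratively rejects incorrect point pairs to improve registration accuracy. Finally, the rigid transformation matrix is computed from the accurate corresponding point pairs, achieving fine registration of the coal flow point cloud data. Experimental results show that, compared with other point cloud registration methods, the improved algorithm increases both the matching accuracy and the matching efficiency for scraper conveyor coal flow contour point clouds, which is of great significance for computing the volume of coal flow contour point clouds.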
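Keywords: scraper conveyor; coal flow contour point cloud; point cloud registration; principal component analysis; scale-invariant feature transform; random sample consensus algorithm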
3D Two-Hand Mesh Reconstruction Method with Feature Interaction Adaptation
18
Authors: 刘佳, 张家辉, 陈大鹏. Source: 《信号处理》 (PKU Core), 2025, No. 7, pp. 1291-1302, 12 pages.
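Reconstructing interacting 3D meshes of two hands from a single RGB image is a highly challenging task. Mutual occlusion between the hands and high similarity of local appearance make some feature extraction inaccurate, which loses the interaction information between the hands and causes the reconstructed hand meshes to be misaligned with the input image. To address these problems, this paper first proposes a feature interaction adaptation module with two parts. The first part, feature interaction, generates two new feature representations while retaining the separated left- and right-hand features, and captures the interaction features of the two hands through an interaction attention module. The second part, feature adaptation, uses the interaction attention module to adapt these interaction features to each hand, injecting global context into the left- and right-hand features. Second, a three-layer graph convolution refinement network is introduced to accurately regress the mesh vertices of both hands, and an attention-based feature alignment module strengthens the alignment between vertex features and image features, thereby improving the alignment between the reconstructed hand meshes and the input image. A new multilayer perceptron structure is also proposed that learns multi-scale feature information through downsampling and upsampling operations. Finally, a relative offset loss function is designed to constrain the spatial relationship between the two hands. Quantitative and qualitative experiments on the InterHand2.6M dataset show that, compared with existing strong methods, the proposed method significantly improves model performance, reducing the mean per joint position error (MPJPE) and mean per vertex position error (MPVPE) to 7.19 mm and 7.33 mm, respectively. In addition, generalization experiments on the RGB2Hands and EgoHands datasets show qualitatively that the proposed method generalizes well and adapts to hand mesh reconstruction against different backgrounds.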
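Keywords: two-hand reconstruction; attention mechanism; feature interaction adaptation; feature alignment; graph convolutional network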
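MPJPE, reported in this abstract, is the mean Euclidean distance between predicted and ground-truth joints. The sketch below computes it for joints given as (x, y, z) tuples in millimetres; the coordinates are hypothetical, and any root-relative alignment that an evaluation protocol may apply beforehand is omitted here as an assumption.

```python
import math


def mpjpe(predicted_joints, ground_truth_joints):
    """Mean per joint position error: average Euclidean distance over joints."""
    distances = [
        math.dist(p, g) for p, g in zip(predicted_joints, ground_truth_joints)
    ]
    return sum(distances) / len(distances)


# Hypothetical two-joint example, coordinates in millimetres.
predicted = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
ground_truth = [(3.0, 4.0, 0.0), (10.0, 0.0, 0.0)]
print(mpjpe(predicted, ground_truth))
```

MPVPE follows the same formula with mesh vertices in place of joints, so the same function can be reused by passing vertex lists.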
Text Image Super-Resolution with Structure-Aware Enhancement and Cross-Modal Fusion
19
Authors: 朱仲杰, 张磊, 李沛, 屠仁伟, 白永强, 王玉儿. Source: 《中国图象图形学报》 (PKU Core), 2025, No. 5, pp. 1364-1376, 13 pages.
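Objective: Scene text image super-resolution is an emerging visual enhancement technique that raises the resolution of low-resolution text images and thereby improves text readability. However, existing methods cannot effectively extract the dynamic features of text structure, so the resulting semantic priors cannot be effectively aligned and fused with image features, which degrades image reconstruction quality and makes text recognition difficult. A cross-modal fusion super-resolution method based on dynamic perception of text structure is therefore proposed to improve text image quality and text readability. Method: First, a text-structure dynamic perception module is built; through a direction-aware layer and a context association unit, it extracts multi-scale directional features of the text and parses the contextual links between neighboring characters, accurately capturing the dynamic structural features of text images. Second, a semantic space alignment module is designed, which uses text mask information to promote the generation of fine-grained text semantic priors and aligns the semantic priors with image features through an affine transformation. Finally, a cross-modal fusion module combines the text semantic priors with image features, promoting cross-modal interaction and fusion through adaptive weight allocation, and outputs a high-resolution text image. Results: In comparisons with several mainstream methods on the real-world TextZoom dataset, the proposed method achieves an average recognition accuracy of 62.4% across three text recognizers, ASTER (attentional scene text recognizer), CRNN (convolutional recurrent neural network), and MORAN (multiobject rectified attention network), an improvement of 2.8% over the second-best method. Its peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) are 21.9 dB and 0.789, ranking first and second, respectively, ahead of most methods. Conclusion: By accurately capturing the dynamic features of text structure to guide the generation of high-level text semantic priors, the proposed method promotes the alignment and fusion of the text and image modalities and effectively improves image reconstruction quality and text readability.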
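Keywords: scene text image super-resolution (STISR); dynamic features of text structure; multi-scale directional features; semantic space alignment; cross-modal fusion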
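The PSNR figure quoted in this abstract follows the standard definition PSNR = 10 log10(MAX^2 / MSE). The sketch below applies it to small grayscale images stored as nested lists; the 8-bit peak value of 255 and the pixel data are assumptions for illustration only.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images."""
    squared_error = 0.0
    pixel_count = 0
    for ref_row, rec_row in zip(reference, reconstructed):
        for ref_px, rec_px in zip(ref_row, rec_row):
            squared_error += (ref_px - rec_px) ** 2
            pixel_count += 1
    mse = squared_error / pixel_count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 2x2 images: two pixels differ by 10 grey levels.
high_res = [[100, 100], [100, 100]]
super_resolved = [[110, 90], [100, 100]]
print(f"{psnr(high_res, super_resolved):.2f} dB")
```

Higher values mean the reconstruction is closer to the reference, and every 10 dB corresponds to a tenfold drop in mean squared error.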
Semi-Supervised Recognition of Miners' Unsafe Behavior Based on a Dynamic Exponential Moving Average
20
Authors: 王媛彬, 刘佳, 贺文卿, 王旭, 闫昭旭. Source: 《煤炭学报》 (PKU Core), 2025, No. 8, pp. 4123-4134, 12 pages.
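Miners' unsafe behavior is one of the main factors affecting safe production in underground coal mines, and recognizing it is essential for intelligent underground monitoring. Current deep-learning-based methods for recognizing miners' unsafe behavior require large amounts of labeled data for training, and labeling consumes considerable human resources. Semi-supervised recognition methods can effectively reduce the cost of labeling miner images, but most mainstream semi-supervised methods update the teacher model conservatively with an exponential moving average (EMA), which gives the teacher model a low learning rate early in training, lowers the quality of the generated pseudo-labels, and harms training. A semi-supervised method for recognizing miners' unsafe behavior based on a dynamic EMA is therefore designed: drawing on the idea of exponential decay, the weight parameter in the EMA is made to vary dynamically with the training batch so that it suits different stages of training. In addition, the mine environment is dim and blurry, which makes miner information hard to extract and aggravates the inconsistency between the classification and localization tasks of the recognition model, reducing accuracy. To address this, efficient local attention (ELA) is integrated into the feature pyramid network to build an efficient local attention feature pyramid network module (ELA-FPN), which increases the saliency of miner information. To resolve the inconsistency between classification and localization, a feature alignment head (FA-Head) is designed that maps localization features onto classification features, improving recognition of miner behavior. Experiments show that, using 10% labeled data from the miners' unsafe behavior dataset, the proposed algorithm reaches a recognition accuracy of 71.008% without increasing model complexity, which is 5.33%, 1.76%, 2.08%, 1.24%, and 0.40% higher than the mainstream Unbiased teacher v1, Unbiased teacher v2, Consistent teacher, Dense teacher, and ARSL algorithms, respectively, and it outperforms the compared algorithms at all supervision ratios. The proposed algorithm thus outperforms current mainstream semi-supervised learning methods on this task, lowering labeling cost while retaining good recognition performance.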
Keywords: semi-supervised; miners' unsafe behavior; dynamic exponential moving average; feature alignment; efficient local attention
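The idea of an EMA weight that changes with the training batch can be sketched as follows. The abstract does not give the schedule, so the exponential form `alpha_max * (1 - exp(-step / tau))` and the names `dynamic_alpha`, `alpha_max`, and `tau` are hypothetical choices made only to illustrate the mechanism: a small weight early on lets the teacher follow the student quickly, and a weight near `alpha_max` later gives the usual conservative update.

```python
import math


def dynamic_alpha(step, alpha_max=0.999, tau=500.0):
    """Hypothetical schedule: EMA weight grows from 0 toward alpha_max with the step."""
    return alpha_max * (1.0 - math.exp(-step / tau))


def ema_update(teacher_params, student_params, alpha):
    """Standard EMA teacher update: teacher = alpha * teacher + (1 - alpha) * student."""
    return [
        alpha * t + (1.0 - alpha) * s
        for t, s in zip(teacher_params, student_params)
    ]


# Toy parameters: the teacher tracks a fixed student over a few steps.
teacher = [0.0, 0.0]
student = [1.0, -1.0]
for step in range(5):
    teacher = ema_update(teacher, student, dynamic_alpha(step))
print(teacher)
```

With a fixed weight close to 1, the teacher would barely move during the first batches; the step-dependent weight is one way to avoid that slow start.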