Journal Articles
4 articles found
1. Robust Audio-Visual Fusion for Emotion Recognition Based on Cross-Modal Learning under Noisy Conditions
Authors: A-Seong Moon, Seungyeon Jeong, Donghee Kim, Mohd Asyraf Zulkifley, Bong-Soo Sohn, Jaesung Lee · Computers, Materials & Continua, 2025, Issue 11, pp. 2851-2872 (22 pages)
Emotion recognition under uncontrolled and noisy environments presents persistent challenges in the design of emotionally responsive systems. The current study introduces an audio-visual recognition framework designed to address performance degradation caused by environmental interference, such as background noise, overlapping speech, and visual obstructions. The proposed framework employs a structured fusion approach, combining early-stage feature-level integration with decision-level coordination guided by temporal attention mechanisms. Audio data are transformed into mel-spectrogram representations, and visual data are represented as raw frame sequences. Spatial and temporal features are extracted through convolutional and transformer-based encoders, allowing the framework to capture complementary and hierarchical information from both sources. A cross-modal attention module enables selective emphasis on relevant signals while suppressing modality-specific noise. Performance is validated on a modified version of the AFEW dataset, in which controlled noise is introduced to emulate realistic conditions. The framework achieves higher classification accuracy than comparative baselines, confirming increased robustness under conditions of cross-modal disruption. This result demonstrates the suitability of the proposed method for deployment in practical emotion-aware technologies operating outside controlled environments. The study also contributes a systematic approach to fusion design and supports further exploration in the direction of resilient multimodal emotion analysis frameworks. The source code is publicly available at https://github.com/asmoon002/AVER (accessed on 18 August 2025).
Keywords: multimodal learning, emotion recognition, cross-modal attention, robust representation learning
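The cross-modal attention step described in the abstract, in which one modality selectively emphasizes relevant signals from the other, can be illustrated with a minimal scaled dot-product sketch. This is a generic NumPy illustration of the mechanism, not the authors' implementation; the function name and feature dimensions are hypothetical:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(audio_feats, visual_feats):
    """Audio time steps (queries) attend over video frames (keys/values),
    returning audio-aligned summaries of the visual stream."""
    d = audio_feats.shape[-1]
    scores = audio_feats @ visual_feats.T / np.sqrt(d)  # (T_audio, T_video)
    weights = softmax(scores, axis=-1)                  # rows sum to 1
    return weights @ visual_feats                       # (T_audio, d)

# Toy example: 4 audio steps, 6 video frames, 8-dim features.
rng = np.random.default_rng(0)
audio = rng.normal(size=(4, 8))
video = rng.normal(size=(6, 8))
attended = cross_modal_attention(audio, video)
print(attended.shape)  # (4, 8)
```

In a full model these features would come from the convolutional and transformer encoders the abstract mentions, with learned query/key/value projections.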
2. On the generalized risk measures (cited by 1)
Authors: ZHANG Ai-li, WANG Wen-yuan, HU Yi-jun · Applied Mathematics (A Journal of Chinese Universities), SCIE CSCD, 2012, Issue 3, pp. 281-289 (9 pages)
In this paper, new risk measures are introduced, and the corresponding representation results are also given. These newly introduced risk measures are extensions of those introduced by Song and Yan (2009) and Karoui (2009).
Keywords: risk measure, distortion, cash subadditivity, robust representation
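For context on the keywords, the standard textbook forms of the two notions the paper builds on (not necessarily the exact generalizations it studies) are the distortion risk measure and the cash-subadditivity axiom:

```latex
% Distortion risk measure: for a distortion function g : [0,1] \to [0,1],
% increasing with g(0) = 0 and g(1) = 1, applied to the survival
% function S_X(x) = P(X > x):
\rho_g(X) = \int_0^{\infty} g\bigl(S_X(x)\bigr)\,dx
          + \int_{-\infty}^{0} \Bigl[ g\bigl(S_X(x)\bigr) - 1 \Bigr]\,dx

% Cash subadditivity (a weakening of cash additivity
% \rho(X + m) = \rho(X) - m):
\rho(X + m) \ge \rho(X) - m \qquad \text{for all } m \ge 0
```

Representation results of the kind the abstract mentions typically express such measures as suprema of expectations over a family of test probabilities or distortions.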
3. RViT: Robust Fusion Vision Transformer with Variational Hierarchical Denoising Process for Image Classification
Authors: Zhenghong Lin, Yuze Wu, Jiawei Chen, Shiping Wang · Guidance, Navigation and Control, 2024, Issue 3, pp. 191-217 (27 pages)
Transformers, originally designed for natural language processing, have recently been explored for computer vision. Various Vision Transformers (ViTs) play an increasingly important role in image tasks such as computer vision, multimodal fusion, and multimedia analysis. However, to obtain promising performance, most existing ViTs rely on artificially filtered high-quality images and may suffer from inherent noise risk. Generally, such well-constructed images are not always available. To this end, we propose a Robust ViT (RViT) that focuses on relevant and robust representation learning for image classification tasks. Specifically, we first develop a novel Denoising VTUnet module, in which we conceptualize the non-robust noise as uncertainty under variational conditions. Furthermore, we design a fusion transformer backbone with a tailored fusion attention mechanism to perform image classification effectively on the extracted robust representations. To demonstrate the superiority of our model, comparative experiments are conducted on several popular datasets. Benefiting from the sequence regularity of the Transformer and the captured robust features, the proposed method outperforms compared Transformer-based models in visual tasks.
Keywords: image classification, vision transformer, robust representation learning, variational inference, fusion attention
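The idea of treating non-robust noise as latent uncertainty under variational conditions is usually realized with the standard reparameterization trick: a feature is modeled as a Gaussian whose variance absorbs the noise, and a KL term regularizes it toward a standard normal. The sketch below illustrates that general trick in NumPy; it is not the paper's Denoising VTUnet, and the function names are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)

def reparameterize(mu, log_var):
    """Sample z = mu + sigma * eps, so gradients can flow through
    mu and log_var while eps carries the stochastic noise."""
    eps = rng.normal(size=mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, 1) ), summed over dimensions;
    penalizes latent features that stray from the prior."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

mu = np.zeros(16)       # encoder-predicted mean (toy values)
log_var = np.zeros(16)  # encoder-predicted log-variance
z = reparameterize(mu, log_var)
print(z.shape)                             # (16,)
print(kl_to_standard_normal(mu, log_var))  # 0.0 at the prior
```

In a denoising setting, the decoder reconstructs a clean feature from z while the KL term keeps the uncertainty estimate calibrated.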
4. Local-Tetra-Patterns for Face Recognition Encoded on Spatial Pyramid Matching
Authors: Khuram Nawaz Khayam, Zahid Mehmood, Hassan Nazeer Chaudhry, Muhammad Usman Ashraf, Usman Tariq, Mohammed Nawaf Altouri, Khalid Alsubhi · Computers, Materials & Continua, SCIE EI, 2022, Issue 3, pp. 5039-5058 (20 pages)
Face recognition remains a major research challenge owing to problems such as misalignment, illumination changes, pose variations, occlusion, and expressions. Providing a single solution to all these problems at once is a challenging task. We address these issues by introducing a face recognition model based on local tetra patterns and spatial pyramid matching. The input image is passed through an algorithm that extracts local features using spatial pyramid matching and max-pooling. Finally, the input image is recognized with a robust kernel representation method applied to the extracted features. Qualitative and quantitative analyses of the proposed method are carried out on benchmark image datasets. Experimental results show that the proposed method performs better, in terms of standard performance evaluation parameters, than state-of-the-art methods on the AR, ORL, LFW, and FERET face recognition datasets.
Keywords: face recognition, local tetra patterns, spatial pyramid matching, robust kernel representation, max-pooling
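The spatial-pyramid-matching-with-max-pooling step in the abstract can be sketched as pooling a local-pattern response map over progressively finer grids and concatenating the cell maxima into one fixed-length descriptor. This is a generic illustration of the technique, not the authors' pipeline; the pyramid levels and function name are hypothetical:

```python
import numpy as np

def spatial_pyramid_max_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a 2-D response map over pyramid grids (1x1, 2x2, 4x4
    cells) and concatenate the cell maxima into one descriptor, so
    coarse global and fine local structure are both encoded."""
    h, w = feature_map.shape
    pooled = []
    for n in levels:
        for i in range(n):
            for j in range(n):
                cell = feature_map[i * h // n:(i + 1) * h // n,
                                   j * w // n:(j + 1) * w // n]
                pooled.append(cell.max())
    return np.array(pooled)

# Toy 8x8 "local tetra pattern" response map.
fmap = np.arange(64, dtype=float).reshape(8, 8)
desc = spatial_pyramid_max_pool(fmap)
print(desc.shape)  # (21,) = 1 + 4 + 16 cells
print(desc[0])     # 63.0, the global maximum
```

In the full method, one such descriptor per pattern channel would feed the robust kernel representation classifier.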