2 articles found
A Novelty Framework in Image-Captioning with Visual Attention-Based Refined Visual Features
1
Authors: Alaa Thobhani, Beiji Zou, Xiaoyan Kui, Amr Abdussalam, Muhammad Asim, Mohammed ELAffendi, Sajid Shah. Computers, Materials & Continua, 2025, Issue 3, pp. 3943-3964 (22 pages)
Image captioning, the task of generating descriptive sentences for images, has advanced significantly with the integration of semantic information. However, traditional models still rely on static visual features that do not evolve with the changing linguistic context, which can hinder the ability to form meaningful connections between the image and the generated captions. This limitation often leads to captions that are less accurate or descriptive. In this paper, we propose a novel approach to enhance image captioning by introducing dynamic interactions where visual features continuously adapt to the evolving linguistic context. Our model strengthens the alignment between visual and linguistic elements, resulting in more coherent and contextually appropriate captions. Specifically, we introduce two innovative modules: the Visual Weighting Module (VWM) and the Enhanced Features Attention Module (EFAM). The VWM adjusts visual features using partial attention, enabling dynamic reweighting of the visual inputs, while the EFAM further refines these features to improve their relevance to the generated caption. By continuously adjusting visual features in response to the linguistic context, our model bridges the gap between static visual features and dynamic language generation. We demonstrate the effectiveness of our approach through experiments on the MS-COCO dataset, where our method outperforms state-of-the-art techniques in terms of caption quality and contextual relevance. Our results show that dynamic visual-linguistic alignment significantly enhances image captioning performance.
Keywords: image-captioning; visual attention; deep learning; visual features
Optimizing Sentiment Integration in Image Captioning Using Transformer-Based Fusion Strategies
2
Authors: Komal Rani Narejo, Hongying Zan, Kheem Parkash Dharmani, Orken Mamyrbayev, Ainur Akhmediyarova, Zhibek Alibiyeva, Janna Alimkulova. Computers, Materials & Continua, 2025, Issue 8, pp. 3407-3429 (23 pages)
While automatic image captioning systems have made notable progress in the past few years, generating captions that fully convey sentiment remains a considerable challenge. Although existing models achieve strong performance in visual recognition and factual description, they often fail to account for the emotional context that is naturally present in human-generated captions. To address this gap, we propose the Sentiment-Driven Caption Generator (SDCG), which combines transformer-based visual and textual processing with multi-level fusion. RoBERTa is used for extracting sentiment from textual input, while visual features are handled by the Vision Transformer (ViT). These features are fused using several fusion approaches, including Concatenation, Attention, Visual-Sentiment Co-Attention (VSCA), and Cross-Attention. Our experiments demonstrate that SDCG significantly outperforms baseline models in sentiment accuracy, including the Generalized Image Transformer (GIT), which achieves 82.01%, and Bootstrapping Language-Image Pre-training (BLIP), which achieves 83.07%. SDCG achieves 94.52% sentiment accuracy and also improves BLEU and ROUGE-L scores, demonstrating clear advantages. More importantly, the captions are more natural, as they incorporate emotional cues and contextual awareness, making them resemble those written by a human.
Keywords: image-captioning; sentiment analysis; deep learning; fusion methods