Journal Articles
1 article found
A Position-Aware Transformer for Image Captioning (Cited: 3)
Authors: Zelin Deng, Bo Zhou, Pei He, Jianfeng Huang, Osama Alfarraj, Amr Tolba. Computers, Materials & Continua (SCIE, EI), 2022, Issue 1, pp. 2065-2081 (17 pages).
Abstract: Image captioning aims to generate a corresponding description of an image. In recent years, neural encoder-decoder models have been the dominant approaches, in which the Convolutional Neural Network (CNN) and Long Short Term Memory (LSTM) are used to translate an image into a natural language description. Among these approaches, visual attention mechanisms are widely used to enable deeper image understanding through fine-grained analysis and even multiple steps of reasoning. However, most conventional visual attention mechanisms are based on high-level image features, ignoring the effects of other image features, and giving insufficient consideration to the relative positions between image features. In this work, we propose a Position-Aware Transformer model with image-feature attention and position-aware attention mechanisms to address these problems. The image-feature attention first extracts multi-level features using a Feature Pyramid Network (FPN), then applies scaled dot-product attention to fuse these features, which enables our model to detect objects of different scales in the image more effectively without increasing the number of parameters. In the position-aware attention mechanism, the relative positions between image features are obtained first, and then incorporated into the original image features to generate captions more accurately. Experiments are carried out on the MSCOCO dataset, and our approach achieves competitive BLEU-4, METEOR, ROUGE-L, and CIDEr scores compared with some state-of-the-art approaches, demonstrating the effectiveness of our approach.
Keywords: deep learning, image captioning, Transformer, attention, position-aware
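The abstract describes two attention mechanisms only at a high level. The sketch below is a minimal, hypothetical illustration (not the authors' implementation) of the two ideas: fusing multi-level FPN features with scaled dot-product attention, and biasing attention scores with relative positions between image regions. All module names, tensor shapes, and the choice of a learned scalar position bias are assumptions made purely for illustration.

# Hypothetical sketch of the two mechanisms named in the abstract.
# Not the paper's code; shapes, names, and the position encoding are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureFusionAttention(nn.Module):
    """Fuse multi-level (e.g., FPN) features via scaled dot-product attention."""
    def __init__(self, dim):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        self.scale = dim ** -0.5

    def forward(self, high_level, multi_level):
        # high_level:  (B, N, D) features from the top pyramid level (queries)
        # multi_level: (B, M, D) features gathered from all pyramid levels (keys/values)
        q, k, v = self.q(high_level), self.k(multi_level), self.v(multi_level)
        attn = F.softmax(q @ k.transpose(-2, -1) * self.scale, dim=-1)
        return attn @ v  # (B, N, D) fused features

class PositionAwareAttention(nn.Module):
    """Bias attention scores with relative positions between image regions."""
    def __init__(self, dim, pos_dim=4):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        self.pos_proj = nn.Linear(pos_dim, 1)  # maps a relative-position vector to a scalar bias
        self.scale = dim ** -0.5

    def forward(self, feats, boxes):
        # feats: (B, N, D) region features; boxes: (B, N, 4) normalized (cx, cy, w, h)
        q, k, v = self.q(feats), self.k(feats), self.v(feats)
        # Pairwise relative positions between every pair of regions: (B, N, N, 4)
        rel = boxes.unsqueeze(2) - boxes.unsqueeze(1)
        pos_bias = self.pos_proj(rel).squeeze(-1)  # (B, N, N) additive bias on the scores
        attn = F.softmax(q @ k.transpose(-2, -1) * self.scale + pos_bias, dim=-1)
        return attn @ v

if __name__ == "__main__":
    B, N, M, D = 2, 36, 100, 512
    fused = FeatureFusionAttention(D)(torch.randn(B, N, D), torch.randn(B, M, D))
    out = PositionAwareAttention(D)(fused, torch.rand(B, N, 4))
    print(out.shape)  # torch.Size([2, 36, 512])

Under these assumptions, the fusion module adds no extra parameters per pyramid level (a single set of projections serves all levels), and the position bias is added to the raw attention logits before the softmax, which is one common way to make attention sensitive to relative geometry.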