Image Captioning with Similar Temporal Attention Mechanism
    Abstract:

    Attention mechanisms are now widely used in computer vision, notably within the encoder-decoder framework commonly used for image captioning. However, current decoding frameworks do not explicitly model the correlation between image features and the hidden states of the Long Short-Term Memory (LSTM) network, which leads to cumulative errors. In this study, we propose a Similar Temporal Attention Network (STAN) that extends conventional attention mechanisms to strengthen the correlation between attention results and hidden states at different time steps. STAN first applies attention to the hidden state and feature vector at the current time step, and then feeds the attention results of two adjacent LSTM segments into the recurrent LSTM network at the next time step through an Attention Fusion Slot (AFS). We also design a Hidden State Switch (HSS) to guide word generation; combined with the AFS, it reduces cumulative errors. Extensive experiments on the public benchmark dataset Microsoft COCO, under several evaluation metrics, show that our algorithm outperforms the baseline model and produces more competitive attention results.
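
    The two steps named in the abstract (attention over the current hidden state and feature vectors, then fusion of the attention results from adjacent time steps) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the additive scoring function, the sigmoid gate used for fusion, and every name in it (soft_attention, fuse_adjacent, W_f, W_h, w, W_g) are assumptions made for illustration, and the weights are random rather than learned.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def soft_attention(features, hidden, W_f, W_h, w):
    # Additive attention (an assumed scoring form, not taken from the paper).
    # features: (k, d) region features, hidden: (h,) LSTM hidden state.
    scores = np.tanh(features @ W_f + hidden @ W_h) @ w  # (k,)
    alpha = softmax(scores)                              # attention weights
    context = alpha @ features                           # (d,) weighted sum
    return context, alpha

def fuse_adjacent(context_prev, context_curr, W_g):
    # Hypothetical stand-in for the fusion step: a sigmoid gate that mixes
    # the attention results of two adjacent time steps element-wise.
    z = np.concatenate([context_prev, context_curr]) @ W_g  # (d,)
    gate = 1.0 / (1.0 + np.exp(-z))
    return gate * context_curr + (1.0 - gate) * context_prev

# Toy dimensions and random weights, chosen only to make the sketch run.
rng = np.random.default_rng(0)
k, d, h, a = 5, 8, 6, 4
features = rng.normal(size=(k, d))
W_f = rng.normal(size=(d, a))
W_h = rng.normal(size=(h, a))
w = rng.normal(size=a)
W_g = rng.normal(size=(2 * d, d))

hidden_prev = rng.normal(size=h)  # hidden state at step t-1
hidden_curr = rng.normal(size=h)  # hidden state at step t

ctx_prev, _ = soft_attention(features, hidden_prev, W_f, W_h, w)
ctx_curr, alpha = soft_attention(features, hidden_curr, W_f, W_h, w)

# The fused vector would be fed to the LSTM at step t+1.
fused = fuse_adjacent(ctx_prev, ctx_curr, W_g)
print(alpha.shape, fused.shape)
```

    In this sketch the fused vector is a convex combination of the two attention results, so each of its components lies between the corresponding components of the previous and current contexts. The Hidden State Switch is not modeled here because the abstract does not describe its form.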

Get Citation

Duan Hailong, Wu Chunlei, Wang Leiquan. Image Captioning with Similar Temporal Attention Mechanism. Computer Systems & Applications, 2021, 30(7): 232-238

History
  • Received: November 01, 2020
  • Revised: December 02, 2020
  • Online: July 02, 2021
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-3