Received: January 02, 2018 Revised: February 01, 2018
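Abstract (translated from Chinese): Image captioning is an important research area in machine learning and computer vision, but existing methods have not sufficiently explored the semantic correlations between visual features and model architectures. This paper proposes an attention model architecture based on user tags and visual features that effectively combines social image features with the user tags attached to an image to generate more accurate descriptions. We ran experiments on the MSCOCO dataset to verify the algorithm's performance; the results show that the proposed attention model based on user tags and visual features is clearly superior to traditional methods.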
Abstract: Image captioning has attracted much attention in machine learning and computer vision. It is both an important practical application and a challenge for image understanding. However, existing methods simply rely on a few different visual features and model architectures, and the correlation between visual features and user tags has not been fully explored. This study proposes a multifaceted attention model based on user tags and visual features. The model automatically attends to the more significant image features or to the semantic information carried by user tags. Experiments on the MSCOCO dataset show that the proposed algorithm outperforms previous methods.
Funding: Special Project on Innovation Methods of the Ministry of Science and Technology of China (2015IM010300)
Cite this article:
褚晓亮,朱连章,吴春雷.基于用户注意力与视觉注意力的社交图像描述.计算机系统应用,2018,27(8):209-213
CHU Xiao-Liang, ZHU Lian-Zhang, WU Chun-Lei. Social Image Caption with Visual Attention and User Attention. COMPUTER SYSTEMS APPLICATIONS, 2018, 27(8): 209-213