Image Captioning Based on Dual Refined Attention
Author: CONG Lu-Wen

    Abstract:

    Image captioning is an important task that connects computer vision and natural language processing, two major fields of artificial intelligence. In recent years, encoder-decoder frameworks integrated with attention mechanisms have made significant progress in captioning. However, many attention-based methods use only a spatial attention mechanism. In this study, we propose a novel dual refined attention model for image captioning. The proposed model uses not only spatial attention but also channel-wise attention, and then applies a refinement module to further refine the attended image features, filtering out redundant and irrelevant features. We validate the proposed model on the MS COCO dataset with various evaluation metrics, and the results demonstrate its effectiveness compared with conventional methods.
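The abstract describes an architecture combining spatial attention (weighting image regions), channel-wise attention (weighting feature channels), and a refinement module that gates out redundant features. The following is only a minimal NumPy sketch of that general idea, not the paper's actual formulation: the shapes, the sigmoid gates, and the projection matrices `W_s`, `W_c`, `W_g` are all illustrative assumptions.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def dual_refined_attention(features, hidden, W_s, W_c, W_g):
    """Sketch of dual (spatial + channel-wise) attention with a refine gate.

    features: (k, d) array - k spatial regions, each a d-channel CNN feature
    hidden:   (d,) array   - decoder hidden state (e.g. from an LSTM)
    W_s, W_c, W_g: (d, d)  - hypothetical projection matrices (assumptions)
    """
    # Spatial attention: one softmax weight per image region.
    alpha = softmax(features @ (W_s @ hidden))          # (k,)

    # Channel-wise attention: one sigmoid weight per feature channel,
    # conditioned on the mean region feature and the hidden state.
    beta = sigmoid(features.mean(axis=0) * (W_c @ hidden))  # (d,)

    # Attend: mix regions by alpha, then rescale channels by beta.
    attended = (alpha[:, None] * features).sum(axis=0) * beta  # (d,)

    # Refine: a sigmoid gate suppresses redundant/irrelevant components
    # of the attended feature before it is fed to the decoder.
    gate = sigmoid(W_g @ hidden)
    return gate * attended
```

In this sketch the spatial branch decides *where* to look and the channel branch decides *which* semantic channels matter, while the final gate acts as the "refinement" filter the abstract mentions; in a real model the projections would be learned jointly with the decoder.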

Cite this article

Cong LW. Image captioning based on dual refined attention. Computer Systems & Applications, 2020, 29(5): 245-251.
History
  • Received: 2019-10-07
  • Revised: 2019-11-07
  • Published online: 2020-05-07
  • Published: 2020-05-15