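Weakly Supervised Fine-Grained Image Classification Based on Attention Mechanism

Funding: National Key R&D Program of the Ministry of Science and Technology of China (2018YFB1004901); Key Project of the Zhejiang Provincial Department of Technology (2019C25014); Zhejiang Provincial Fund (LY17C090011)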
    Abstract:

    Fine-grained image classification is difficult because the discriminative objects in an image are hard to learn effectively. To address this, this study proposes a weakly supervised fine-grained image classification algorithm based on the attention mechanism. The algorithm locates and identifies the semantically sensitive features in fine-grained images. First, building on a classic convolutional neural network, it obtains a representation of the object as a whole through linear fusion of features. Then, a visual attention mechanism extracts the discriminative details from those features to produce a more complete fine-grained feature representation. The proposed algorithm combines linear fusion with the attention mechanism and can be regarded as a network model in which multiple network branches are trained cooperatively and optimized jointly, so that the model represents both the overall and the local information better. Experiments on three publicly available fine-grained recognition datasets show that the proposed method outperforms the baseline method on all of them and reaches a classification level comparable to current advanced methods.
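The abstract describes two stages, linear fusion for a whole-object representation and visual attention for discriminative details, but it does not state which layers are fused, how the fusion is weighted, or which form of attention is used. The following is only a minimal sketch of that two-branch idea under stated assumptions: feature maps are plain lists of shape channels x positions, fusion is an element-wise weighted sum, and attention is a softmax over spatial positions. The names `linear_fusion`, `spatial_attention`, `describe`, and the parameter `alpha` are hypothetical and do not come from the paper.

```python
import math


def linear_fusion(feat_a, feat_b, alpha=0.5):
    """Element-wise weighted sum of two feature maps (channels x positions).

    Assumption: both maps have the same shape; alpha is a hypothetical mixing weight.
    """
    return [
        [alpha * a + (1.0 - alpha) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(feat_a, feat_b)
    ]


def spatial_attention(feat):
    """Softmax over spatial positions of the channel-averaged activation.

    Assumption: this simple attention form is illustrative only.
    """
    n_channels = len(feat)
    n_positions = len(feat[0])
    mean_act = [
        sum(feat[c][p] for c in range(n_channels)) / n_channels
        for p in range(n_positions)
    ]
    peak = max(mean_act)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in mean_act]
    total = sum(exps)
    return [e / total for e in exps]


def describe(feat_a, feat_b, alpha=0.5):
    """Concatenate a global (average-pooled) and a local (attention-pooled) descriptor."""
    fused = linear_fusion(feat_a, feat_b, alpha)
    weights = spatial_attention(fused)
    global_desc = [sum(row) / len(row) for row in fused]
    local_desc = [sum(w * v for w, v in zip(weights, row)) for row in fused]
    return global_desc + local_desc, weights


# Toy input: 2 channels, 3 spatial positions, with the strongest response at the last position.
features = [[1.0, 2.0, 3.0], [0.0, 0.0, 6.0]]
descriptor, attention = describe(features, features)
```

In this toy setup the global half of the descriptor treats every position equally, while the local half is dominated by the positions that receive the highest attention weight. In a trained network both branches would be learned jointly rather than computed by fixed rules as here.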

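Cite this article

Li Wenshu (李文书), Wang Zhixiao (王志骁), Li Shenhao (李绅皓), Zhao Peng (赵朋). Weakly supervised fine-grained image classification based on attention mechanism. Computer Systems & Applications (计算机系统应用), 2021, 30(10): 232-239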

History
  • Received: 2020-12-31
  • Last revised: 2021-01-29
  • Published online: 2021-10-08