Image Classification Based on Multi-Model Feature and Reduced Attention Fusion
Author:
Funding: Special Project of the China Association of Higher Education (2020JXD01); Guangdong Provincial Key Platform and Major Research Project for Universities (Major Research Project, Characteristic Innovation Category) (2017KTSCX048); Research Project of the Guangdong Provincial Administration of Traditional Chinese Medicine (20191411); Guangdong Provincial Special Project in the Key Field of "Artificial Intelligence" for Universities (2019KZDZX1027); Key Program of the National Natural Science Foundation of China (U1811263); Guangzhou Key Laboratory of Big Data and Intelligent Education (201905010009); Guangdong Province Public Welfare Research and Capacity Building Project (2018B070714018)

    Abstract:

    To improve image classification performance, this paper proposes an image classification algorithm based on the fusion of Multi-model Features and Reduced Attention (MFRA). Through multi-model feature fusion, the network learns features of the input image at different levels, increasing feature complementarity and strengthening feature extraction. An added attention module directs the network toward target regions and suppresses irrelevant background interference. The effectiveness of the algorithm is verified by extensive comparative experiments on three public datasets: Cifar-10, Cifar-100, and Caltech-101. Compared with existing algorithms, the proposed algorithm achieves a clear improvement in classification performance.
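    The two ideas in the abstract can be illustrated in a few lines: features from different backbone models are fused by channel concatenation, and a squeeze-and-excitation-style channel gate then re-weights the fused channels so that target-related channels are emphasised and background channels are suppressed. This is a minimal NumPy sketch, not the paper's actual MFRA implementation; the shapes, the random stand-in weights, and the function names are illustrative assumptions.

```python
import numpy as np

def fuse_multi_model(feature_maps):
    """Fuse feature maps from different backbone models by channel concatenation."""
    return np.concatenate(feature_maps, axis=-1)

def channel_attention(features, reduction=4):
    """Squeeze-and-excitation-style channel attention (illustrative only).

    features: (H, W, C) feature map. A gate in (0, 1) per channel is computed
    from globally pooled channel statistics and multiplied back in.
    """
    h, w, c = features.shape
    squeezed = features.mean(axis=(0, 1))            # global average pool -> (C,)
    # Two-layer bottleneck "excitation"; random weights stand in for learned ones.
    rng = np.random.default_rng(0)
    w1 = rng.standard_normal((c, c // reduction)) * 0.1
    w2 = rng.standard_normal((c // reduction, c)) * 0.1
    hidden = np.maximum(squeezed @ w1, 0.0)          # ReLU
    gate = 1.0 / (1.0 + np.exp(-(hidden @ w2)))      # sigmoid -> (C,)
    return features * gate                           # broadcast over H and W

# Two hypothetical backbones yield 8x8 maps with 16 and 32 channels.
f_a = np.random.rand(8, 8, 16)
f_b = np.random.rand(8, 8, 32)
fused = fuse_multi_model([f_a, f_b])     # shape (8, 8, 48)
attended = channel_attention(fused)      # same shape, channels re-weighted
```

    Because the gate is strictly between 0 and 1, attention can only attenuate channels here; in the trained network the preceding convolutions compensate, so the net effect is a relative emphasis on target regions.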

Cite this article:

宋东情, 朱定局, 贺超. Image Classification Based on Multi-Model Feature and Reduced Attention Fusion. 计算机系统应用 (Computer Systems & Applications), 2021, 30(11): 210–216.
History
  • Received: 2021-01-26
  • Revised: 2021-02-24
  • Published online: 2021-10-22
Copyright: Institute of Software, Chinese Academy of Sciences