Image Classification Based on Multi-Model Feature and Reduced Attention Fusion
Author:
Funding: Special Project of the China Association of Higher Education (2020JXD01); Guangdong Provincial Key Platform and Major Research Project for Universities (Major Research Project, Characteristic Innovation Category) (2017KTSCX048); Research Project of the Guangdong Provincial Administration of Traditional Chinese Medicine (20191411); Guangdong Provincial Special Project in the Key Field of "Artificial Intelligence" for Universities (2019KZDZX1027); Key Program of the National Natural Science Foundation of China (U1811263); Guangzhou Key Laboratory of Big Data and Intelligent Education (201905010009); Guangdong Province Public Welfare Research and Capacity Building Project (2018B070714018)

    Abstract:

    To improve image classification performance, this paper proposes an image classification algorithm based on the fusion of Multi-model Features and Reduced Attention (MFRA). Through multi-model feature fusion, the network learns features of the input image at different levels, increasing feature complementarity and strengthening feature extraction. An added attention module directs the network toward target regions and suppresses irrelevant background interference. The effectiveness of the algorithm is verified by extensive comparative experiments on three public datasets: Cifar-10, Cifar-100, and Caltech-101. Compared with existing algorithms, the proposed algorithm achieves a clear improvement in classification performance.
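    The two ideas in the abstract can be illustrated in a few lines: features from different backbone models are fused by channel concatenation, and a squeeze-and-excitation-style channel gate then re-weights the fused channels so that target-related channels are emphasised and background channels are suppressed. This is a minimal NumPy sketch, not the paper's actual MFRA implementation; the shapes, the random stand-in weights, and the function names are illustrative assumptions.

```python
import numpy as np

def fuse_multi_model(feature_maps):
    """Fuse feature maps from different backbone models by channel concatenation."""
    return np.concatenate(feature_maps, axis=-1)

def channel_attention(features, reduction=4):
    """Squeeze-and-excitation-style channel attention (illustrative only).

    features: (H, W, C) feature map. A gate in (0, 1) per channel is computed
    from globally pooled channel statistics and multiplied back in.
    """
    h, w, c = features.shape
    squeezed = features.mean(axis=(0, 1))            # global average pool -> (C,)
    # Two-layer bottleneck "excitation"; random weights stand in for learned ones.
    rng = np.random.default_rng(0)
    w1 = rng.standard_normal((c, c // reduction)) * 0.1
    w2 = rng.standard_normal((c // reduction, c)) * 0.1
    hidden = np.maximum(squeezed @ w1, 0.0)          # ReLU
    gate = 1.0 / (1.0 + np.exp(-(hidden @ w2)))      # sigmoid -> (C,)
    return features * gate                           # broadcast over H and W

# Two hypothetical backbones yield 8x8 maps with 16 and 32 channels.
f_a = np.random.rand(8, 8, 16)
f_b = np.random.rand(8, 8, 32)
fused = fuse_multi_model([f_a, f_b])     # shape (8, 8, 48)
attended = channel_attention(fused)      # same shape, channels re-weighted
```

    Because the gate is strictly between 0 and 1, attention can only attenuate channels here; in the trained network the preceding convolutions compensate, so the net effect is a relative emphasis on target regions.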

Cite this article:

宋东情, 朱定局, 贺超. Image Classification Based on Multi-Model Feature and Reduced Attention Fusion. 计算机系统应用 (Computer Systems & Applications), 2021, 30(11): 210–216.
History
  • Received: 2021-01-26
  • Revised: 2021-02-24
  • Published online: 2021-10-22
Copyright: Institute of Software, Chinese Academy of Sciences