Computer Systems & Applications, 2024, 33(3): 126-133
Cross Layer Collaborative Attention and Channel Group Attention for Fine-grained Image Classification
(School of Information Science and Engineering, Hunan Institute of Science and Technology, Yueyang 414006, China)
Received: September 12, 2023    Revised: October 9, 2023
Abstract: The main challenge of fine-grained image classification lies in high inter-class similarity and large intra-class variation. Most existing methods rely on deep features and ignore shallow detail, yet deep semantic features tend to lose much of that detail after repeated convolution and pooling operations. To better integrate shallow and deep information, this study proposes a fine-grained image classification method based on cross-layer collaborative attention and channel group attention. A ResNet50 initialized with pre-trained weights serves as the backbone network, and the features extracted by its last three stages are output as three branches. Each branch computes collaborative attention with the features of the other two branches in a cross-layer manner, and the results are interactively fused; the features of the last stage additionally pass through a channel group attention module to strengthen the learning of semantic features. The model is trained efficiently end to end, without bounding boxes or annotations. Experimental results on three common fine-grained image datasets, CUB-200-2011, Stanford Cars, and FGVC-Aircraft, show accuracies of 89.5%, 94.8%, and 94.7%, respectively.
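The abstract names two mechanisms without giving their formulas, so the following is only a toy NumPy sketch of what they might look like, not the authors' implementation. It assumes that the three branch features have already been projected to a common channel width `d` (for example by 1x1 convolutions), that collaborative attention is a scaled dot-product affinity followed by a softmax, and that channel group attention is a sigmoid gate computed per channel group. The function names, shapes, and the additive fusion are all hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def co_attention(query, context):
    """Attend from query positions (Nq, d) to context positions (Nc, d).

    Assumed form: scaled dot-product affinity, softmax over context positions.
    """
    d = query.shape[1]
    affinity = query @ context.T / np.sqrt(d)   # (Nq, Nc)
    weights = softmax(affinity, axis=1)         # each row sums to 1
    return weights @ context                    # (Nq, d)

def cross_layer_fusion(branches):
    """Fuse each branch with context attended from the other branches.

    Assumed fusion: residual sum of the branch and its attended contexts.
    """
    fused = []
    for i, q in enumerate(branches):
        attended = sum(co_attention(q, c)
                       for j, c in enumerate(branches) if j != i)
        fused.append(q + attended)
    return fused

def channel_group_attention(feat, groups):
    """Gate a (C, H, W) feature map with one sigmoid weight per channel group."""
    c, h, w = feat.shape
    assert c % groups == 0, "channels must divide evenly into groups"
    grouped = feat.reshape(groups, c // groups, h, w)
    descriptor = grouped.mean(axis=(1, 2, 3))   # one scalar per group
    gate = 1.0 / (1.0 + np.exp(-descriptor))    # sigmoid, values in (0, 1)
    return (grouped * gate[:, None, None, None]).reshape(c, h, w)

rng = np.random.default_rng(0)
d = 8                                           # shared channel width (assumed)
shallow = rng.standard_normal((64, d))          # 8x8 positions, shallower stage
middle = rng.standard_normal((16, d))           # 4x4 positions, middle stage
deep_map = rng.standard_normal((d, 2, 2))       # last stage as (C, H, W)

# Only the last-stage features go through channel group attention.
gated = channel_group_attention(deep_map, groups=4)
deep = gated.reshape(d, -1).T                   # (4, d): positions x channels

fused = cross_layer_fusion([shallow, middle, deep])
print([f.shape for f in fused])
```

In this sketch each branch keeps its own spatial resolution after fusion, so a classifier head could be attached to every branch separately; whether the paper does so is not stated in the abstract.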
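Funding: National Natural Science Foundation of China (62271200); Open Fund Project of the Innovation Platform of Hunan Provincial Universities (20K062); Outstanding Youth Project of the Hunan Provincial Department of Education (21B0590)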
Citation:
HE Zhi-Xiang, QI Qi, HE Wei, GUO Long-Yuan. Cross Layer Collaborative Attention and Channel Group Attention for Fine-grained Image Classification. Computer Systems & Applications, 2024, 33(3): 126-133