Computer Systems Applications, 2024, 33(5): 15-27
Aerial Scene Classification by Fusion of Dual-branch Attention and FasterNet
(Software College, Liaoning University of Engineering and Technology, Huludao 125105, China)
Received: November 30, 2023    Revised: December 29, 2023
Abstract: High-resolution aerial images contain many scene categories with high inter-class similarity. Classic deep-learning-based classification methods run inefficiently because their feature extraction incurs redundant floating-point operations; FasterNet improves efficiency through partial convolution, but this weakens feature extraction and therefore lowers classification accuracy. To address these problems, this study proposes a hybrid classification method that integrates FasterNet with an attention mechanism. First, a "cross-shaped convolution module" partially extracts scene features to improve operational efficiency. Then, a dual-branch attention mechanism that fuses coordinate attention and channel attention strengthens the model's feature extraction. Finally, a residual connection between the cross-shaped convolution module and the dual-branch attention module lets the network learn more task-related features, improving classification accuracy while reducing the running cost. Experimental results show that, compared with existing deep-learning-based classification models, the proposed method achieves both short inference time and high accuracy: it has 19M parameters and an average inference time of 7.1 ms per image, and its classification accuracy on the public NWPU-RESISC45, EuroSAT, VArcGIS (10%), and VArcGIS (20%) datasets is 96.12%, 98.64%, 95.42%, and 97.87%, respectively, which is 2.06%, 0.77%, 1.34%, and 0.65% higher than that of FasterNet.
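The abstract describes three building blocks: a FasterNet-style partial ("cross-shaped") convolution for cheap feature extraction, a dual-branch attention that fuses coordinate attention with channel attention, and a residual connection joining the two. Below is a minimal PyTorch sketch of how such a block could be wired together; it is not the authors' released code. The class names, the 1/4 partial ratio, the reduction factor, the SE-style channel branch, and the additive fusion of the two attention branches are illustrative assumptions, and the cross-shaped convolution module is approximated here by a plain FasterNet-style partial convolution.

```python
# Illustrative sketch only: a hybrid block combining a FasterNet-style partial
# convolution with a dual-branch (coordinate + channel) attention and a residual
# shortcut, as outlined in the abstract. Names and hyperparameters are assumptions.
import torch
import torch.nn as nn


class PartialConv(nn.Module):
    """FasterNet-style partial convolution: convolve only a fraction of the channels."""
    def __init__(self, channels: int, partial_ratio: float = 0.25):
        super().__init__()
        self.conv_channels = int(channels * partial_ratio)
        self.conv = nn.Conv2d(self.conv_channels, self.conv_channels, 3, padding=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        xc, xi = torch.split(x, [self.conv_channels, x.size(1) - self.conv_channels], dim=1)
        return torch.cat([self.conv(xc), xi], dim=1)  # remaining channels pass through untouched


class DualBranchAttention(nn.Module):
    """Coordinate attention (H/W branch) fused with SE-style channel attention."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        mid = max(8, channels // reduction)
        # Coordinate branch: pool along one spatial axis at a time.
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))   # (B, C, H, 1)
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))   # (B, C, 1, W)
        self.coord_mix = nn.Sequential(
            nn.Conv2d(channels, mid, 1, bias=False), nn.BatchNorm2d(mid), nn.ReLU(inplace=True))
        self.attn_h = nn.Conv2d(mid, channels, 1)
        self.attn_w = nn.Conv2d(mid, channels, 1)
        # Channel branch: global squeeze-and-excitation.
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, mid, 1), nn.ReLU(inplace=True),
            nn.Conv2d(mid, channels, 1), nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Coordinate attention: encode position along H and W separately.
        y = torch.cat([self.pool_h(x), self.pool_w(x).permute(0, 1, 3, 2)], dim=2)  # (B, C, H+W, 1)
        y = self.coord_mix(y)
        yh, yw = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.attn_h(yh))                      # (B, C, H, 1)
        a_w = torch.sigmoid(self.attn_w(yw.permute(0, 1, 3, 2)))  # (B, C, 1, W)
        coord_out = x * a_h * a_w
        # Channel attention, then additive fusion of the two branches (an assumption).
        chan_out = x * self.channel(x)
        return coord_out + chan_out


class HybridBlock(nn.Module):
    """Partial convolution -> dual-branch attention, with a residual shortcut."""
    def __init__(self, channels: int):
        super().__init__()
        self.pconv = PartialConv(channels)
        self.attn = DualBranchAttention(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.attn(self.pconv(x))  # residual connection described in the abstract


if __name__ == "__main__":
    block = HybridBlock(64)
    print(block(torch.randn(2, 64, 56, 56)).shape)  # torch.Size([2, 64, 56, 56])
```

In this arrangement the attention branches only refine the partially convolved features, and the identity shortcut preserves the untouched channels, which is consistent with the abstract's claim of higher accuracy at little extra running cost.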
Foundation items: National Natural Science Foundation of China (62173171); Youth Fund of the National Natural Science Foundation of China (41801368)
Citation:
YANG Ben-Chen, QU Ye-Tian, JIN Hai-Bo. Aerial Scene Classification by Fusion of Dual-branch Attention and FasterNet. Computer Systems Applications, 2024, 33(5): 15-27.