Multi-label Image Classification Based on Multi-head Graph Attention Network and Graph Model

doi:10.15888/j.cnki.csa.009148


2023, Vol. 32, No. 6, pp. 286-292

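Authors:
SHI Xiu-Yun (石琇赟), School of Mathematical Sciences, Shanxi University, Taiyuan 030006, China
LI Shun-Yong (李顺勇), School of Mathematical Sciences, Shanxi University, Taiyuan 030006, China
HAN Xiang (韩翔), School of Mathematical Sciences, Shanxi University, Taiyuan 030006, China

Funding: National Natural Science Foundation of China (82274360, 61976128); 2022 Shanxi Province Graduate Education and Teaching Reform Project (2022YJJG010); Shanxi Province Higher Education Teaching Reform and Innovation Project (J2021059); Project of the Center for Research and Development of University Mathematics Teaching in Higher Education Institutions (CMC20210315)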



Abstract:

Multi-label image classification is an active research topic in multi-label data classification. Existing methods learn only the visual representation of an image and ignore both the correlations among image labels and the correspondence between label semantics and image features. To address these problems, this paper proposes a multi-label image classification model based on a multi-head graph attention network and a graph model (ML-M-GAT). The model builds a graph from label co-occurrence relations and label attribute information, uses a multi-head attention mechanism to learn attention weights for the labels, and uses these label weights to fuse label semantic features with image features, thereby integrating label correlation and label semantic information into the classifier. To verify the effectiveness of the proposed model, experiments are conducted on the public datasets VOC-2007 and COCO-2014. The results show that ML-M-GAT achieves a mean average precision (mAP) of 94% and 82.2% on the two datasets, respectively, outperforming the CNN-RNN, ResNet101, MLIR, and MIC-FLC models and exceeding ResNet101 by 4.2% and 3.9%, respectively. The proposed ML-M-GAT model can therefore use image label information to improve multi-label image classification performance.

Keywords: image classification; residual neural network; multi-head attention; graph model
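The pipeline summarized in the abstract (a label graph built from co-occurrence, multi-head attention over label nodes, and fusion of label features with image features) can be illustrated with a minimal NumPy sketch. This is not the authors' ML-M-GAT implementation. The co-occurrence threshold, the random weights, the feature sizes, and the dot-product fusion step are all assumptions chosen for illustration, and every function name below is hypothetical.

```python
import numpy as np

def cooccurrence_adjacency(Y, threshold=0.4):
    """Label graph from a multi-hot label matrix Y of shape (n_images, n_labels).
    Edge i -> j is kept when P(label j | label i) >= threshold (assumed rule)."""
    Y = np.asarray(Y, dtype=float)
    counts = Y.T @ Y                               # counts[i, j]: images with both labels
    occurrences = np.maximum(np.diag(counts), 1.0) # images containing label i
    cond_prob = counts / occurrences[:, None]      # P(j | i)
    A = (cond_prob >= threshold).astype(float)
    np.fill_diagonal(A, 1.0)                       # self-loops keep every row non-empty
    return A

def gat_head(H, A, W, a, slope=0.2):
    """One graph-attention head. H: (n_labels, d_in), W: (d_in, d_out), a: (2 * d_out,)."""
    Z = H @ W
    d = Z.shape[1]
    # e[i, j] = LeakyReLU(a^T [z_i || z_j]), split into two dot products
    e = (Z @ a[:d])[:, None] + (Z @ a[d:])[None, :]
    e = np.where(e > 0, e, slope * e)
    e = np.where(A > 0, e, -np.inf)                # attend only along graph edges
    e = e - e.max(axis=1, keepdims=True)           # numerically stable softmax
    alpha = np.exp(e)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    return alpha @ Z, alpha

def multi_head_gat(H, A, weights, attn_vectors):
    """Concatenate the outputs of several independent attention heads."""
    outputs = [gat_head(H, A, W, a)[0] for W, a in zip(weights, attn_vectors)]
    return np.concatenate(outputs, axis=1)

def fuse_and_classify(image_feature, label_features):
    """Assumed fusion: one sigmoid score per label from a dot product."""
    logits = label_features @ image_feature
    return 1.0 / (1.0 + np.exp(-logits))

# Toy data: 4 images, 3 labels, 4-dimensional label embeddings, 2 heads of size 3.
rng = np.random.default_rng(0)
Y = np.array([[1, 1, 0], [1, 1, 0], [0, 0, 1], [1, 0, 1]])
A = cooccurrence_adjacency(Y)
H = rng.normal(size=(3, 4))                        # stand-in for label word embeddings
weights = [rng.normal(size=(4, 3)) for _ in range(2)]
attn_vectors = [rng.normal(size=6) for _ in range(2)]
label_features = multi_head_gat(H, A, weights, attn_vectors)
image_feature = rng.normal(size=6)                 # stand-in for a CNN image feature
probs = fuse_and_classify(image_feature, label_features)
_, alpha = gat_head(H, A, weights[0], attn_vectors[0])
```

In a real model the image feature would come from a CNN backbone such as ResNet101 and all weights would be trained end to end with a multi-label loss; here they are random so that only the data flow is shown.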

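Cite this article

SHI Xiu-Yun, LI Shun-Yong, HAN Xiang. Multi-label image classification based on multi-head graph attention network and graph model. Computer Systems & Applications (计算机系统应用), 2023, 32(6): 286-292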


History

Received: 2022-12-06
Last revised: 2023-01-17
Published online: 2023-04-25
