Crowd Counting Algorithm Based on Dual Attention Mechanism

doi:10.15888/j.cnki.csa.008892

Computer Systems & Applications, 2023, Vol. 32, No. 1, pp. 241-248. DOI: 10.15888/j.cnki.csa.008892

Fund Project: Natural Science Foundation of Shandong Province (ZR2021MF092)

Crowd Counting Algorithm Based on Dual Attention Mechanism

Authors: XU Xiao-Chen, GE Yan, DU Jun-Wei, CHEN Zhuo

Affiliation: School of Information Science and Technology, Qingdao University of Science and Technology, Qingdao 266061, China

Abstract:

To address common problems in crowd counting, such as complex backgrounds, occlusion, and uneven crowd distribution, a joint loss-based space-channel dual attention network (JL-SCDANet) is proposed. The front end of the network extracts coarse-grained image features. Spatial and channel attention mechanisms are inserted in the middle to highlight the key regions of the image. The back end uses dilated convolution, which enlarges the receptive field without reducing image resolution, to extract deep two-dimensional features. In addition, the model is trained with a joint loss function to improve its robustness. To verify the improvement, comparative experiments are carried out on three public datasets: ShanghaiTech Part B, mall, and UCF_CC_50. The mean absolute error (MAE) and mean square error (MSE) reach 8.13 and 13.13 on ShanghaiTech Part B, 1.78 and 2.28 on mall, and 182.12 and 210.24 on UCF_CC_50. The experimental results demonstrate the effectiveness of the network in improving crowd counting accuracy.
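
The MAE and MSE figures above are computed over per-image head counts. The abstract does not give the formulas, so the following is only a minimal sketch of how such scores are typically obtained. It assumes the usual density-map convention, in which an image's predicted count is the sum of its density map. The helper names and the counts are hypothetical and are not taken from the paper. Much of the crowd counting literature reports the square root of the mean squared count error under the label "MSE"; the abstract does not say which convention applies, so both variants are shown.

```python
import math


def count_from_density_map(density_map):
    """Predicted head count: the sum over all pixels of a 2D density map."""
    return sum(sum(row) for row in density_map)


def mae(predicted, actual):
    """Mean absolute error between predicted and ground-truth counts."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def mean_squared_error(predicted, actual):
    """Mean of the squared count errors (no square root)."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def root_mean_squared_error(predicted, actual):
    """Square root of the mean squared count error."""
    return math.sqrt(mean_squared_error(predicted, actual))


# Hypothetical per-image counts, for illustration only.
predicted_counts = [102.0, 48.5, 250.0, 13.0]
actual_counts = [100, 50, 240, 15]

print(mae(predicted_counts, actual_counts))
print(mean_squared_error(predicted_counts, actual_counts))
print(root_mean_squared_error(predicted_counts, actual_counts))
```

With these illustrative counts, MAE is 3.875, the mean squared error is 27.5625, and its square root is 5.25.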

Key words: crowd counting; crowd density map; convolutional neural network (CNN); attention mechanism; dilated convolution; deep learning
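
The abstract's point that dilated convolution enlarges the receptive field without reducing resolution can be illustrated in one dimension. The sketch below is not the paper's back end. It is a toy 1D cross-correlation with zero padding, and the function names are hypothetical. A kernel of size k with dilation d spans k + (k - 1)(d - 1) input positions, while the output keeps the input length.

```python
def effective_kernel_size(kernel_size, dilation):
    """Input span covered by one dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d_same(signal, kernel, dilation):
    """1D dilated cross-correlation, zero-padded so the output length
    equals the input length. Assumes an odd kernel length."""
    k = len(kernel)
    if k % 2 == 0:
        raise ValueError("kernel length must be odd for symmetric padding")
    center = k // 2
    output = []
    for i in range(len(signal)):
        total = 0.0
        for j, weight in enumerate(kernel):
            # Taps are spaced `dilation` samples apart around position i.
            idx = i + (j - center) * dilation
            if 0 <= idx < len(signal):  # out-of-range taps read zero padding
                total += weight * signal[idx]
        output.append(total)
    return output


# A unit impulse shows which input positions each kernel reaches.
signal = [0, 0, 0, 0, 1, 0, 0, 0, 0]
kernel = [1, 1, 1]

dense = dilated_conv1d_same(signal, kernel, dilation=1)
dilated = dilated_conv1d_same(signal, kernel, dilation=2)

print(effective_kernel_size(3, 1), dense)
print(effective_kernel_size(3, 2), dilated)
```

With dilation 1 the impulse response covers three adjacent positions. With dilation 2 the same three weights are spread over a span of five positions. Both outputs have the same length as the input.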

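Cite this article

XU Xiao-Chen, GE Yan, DU Jun-Wei, CHEN Zhuo. Crowd Counting Algorithm Based on Dual Attention Mechanism. Computer Systems & Applications, 2023, 32(1): 241-248.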

History

Received: 2022-05-16
Last revised: 2022-06-15
Published online: 2022-11-14
