Dense Crowd Counting Algorithm Based on Multi-scale Feature Weighted Fusion Attention

doi:10.15888/j.cnki.csa.009777


2025, Vol. 34, No. 3, pp. 210-219. DOI:10.15888/j.cnki.csa.009777

CSTR: 32024.14.csa.009777

Dense Crowd Counting Algorithm Based on Multi-scale Feature Weighted Fusion Attention


Abstract:

Crowd counting must cope with non-uniform head sizes, uneven crowd density distribution, and interference from complex backgrounds. To address these problems, this paper proposes a convolutional neural network that handles multi-scale variation and strengthens attention on crowd regions: the multi-scale feature weighted fusion attention convolutional neural network (MSFANet). The front end of the network uses an improved VGG-16 model to perform a first, coarse-grained feature extraction pass over the input crowd image. A multi-scale feature extraction module in the middle of the network then extracts multi-scale feature information from the image, and an attention module weights these multi-scale features. The back end uses a sawtooth-shaped dilated convolution module to enlarge the receptive field, extract detailed image features, and generate high-quality crowd density maps. The model is evaluated on three public datasets. On the ShanghaiTech Part B dataset, it reaches a mean absolute error (MAE) of 7.8 and a mean squared error (MSE) of 12.5. On the ShanghaiTech Part A dataset, it reaches an MAE of 64.9 and an MSE of 108.4. On the UCF_CC_50 dataset, it reaches an MAE of 185.1 and an MSE of 249.8. These results indicate that the model achieves good accuracy and robustness.
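To make the reported metrics concrete, the following is a minimal sketch of how a count is typically read off a density map and how MAE and MSE are computed over a test set. It is not the authors' evaluation code. Two points are assumptions: the predicted count is taken as the sum of the density map, and "MSE" follows the common crowd-counting convention of taking the square root of the mean squared count error, which the abstract does not confirm. All function names and the toy data are hypothetical.

```python
import math

def count_from_density(density_map):
    """Estimated head count: the sum of all cells of a 2-D density map."""
    return sum(sum(row) for row in density_map)

def mae(pred_counts, true_counts):
    """Mean absolute error between predicted and ground-truth counts."""
    n = len(true_counts)
    return sum(abs(p - t) for p, t in zip(pred_counts, true_counts)) / n

def rooted_mse(pred_counts, true_counts):
    """Square root of the mean squared count error.

    Assumption: this is the quantity crowd-counting papers usually report
    as "MSE"; the abstract itself does not state which form is used.
    """
    n = len(true_counts)
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred_counts, true_counts)) / n)

# Toy 2x2 density maps standing in for network outputs (hypothetical data).
toy_maps = [
    [[0.5, 0.5], [1.0, 0.0]],  # cells sum to 2.0
    [[1.0, 1.0], [1.0, 2.0]],  # cells sum to 5.0
]
true_counts = [3, 2]

pred_counts = [count_from_density(m) for m in toy_maps]
print("MAE:", mae(pred_counts, true_counts))
print("MSE (rooted):", rooted_mse(pred_counts, true_counts))
```

Because the rooted form squares each error before averaging, it penalizes a few large miscounts more heavily than MAE does, which is why it is often read as a robustness indicator alongside MAE as an accuracy indicator.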


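Cite this article

Shi Dongliang (时东亮), Ge Yan (葛艳), Xu Mujun (徐慕君). Dense Crowd Counting Algorithm Based on Multi-scale Feature Weighted Fusion Attention. Computer Systems & Applications (计算机系统应用), 2025, 34(3): 210-219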


History

Received: 2024-08-06
Last revised: 2024-08-27
Published online: 2024-12-09
