Human Behavior Detection Based on Improved YOLOv3

doi:10.15888/j.cnki.csa.007507


Published in 2021, Vol. 30, No. 6, pp. 197-202

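Authors:

LI Xiao-Tian (李啸天), School of Electrical Engineering, Southwest Jiaotong University, Chengdu 611756, China
HUANG Jin (黄进), School of Electrical Engineering, Southwest Jiaotong University, Chengdu 611756, China
LI Jian-Bo (李剑波), School of Information Science and Technology, Southwest Jiaotong University, Chengdu 611756, China
YANG Xu (杨旭), School of Electrical Engineering, Southwest Jiaotong University, Chengdu 611756, China
QIN Ze-Yu (秦泽宇), School of Electrical Engineering, Southwest Jiaotong University, Chengdu 611756, China
FU Guo-Dong (付国栋), School of Electrical Engineering, Southwest Jiaotong University, Chengdu 611756, China

Fund project: Chengdu Science and Technology Bureau Project (2018-YF05-01424-GX)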

Abstract:

Human behavior detection faces several problems: large variation within the same behavior, high similarity between different behaviors, changes in viewing angle, occlusion, and the inability to detect in real time. To address them, this study proposes Hierarchical Bilinear-YOLOv3, a network for human behavior detection. The network uses YOLOv3 to predict on three scales. Specific layers of the YOLOv3 feature pyramid extraction network are taken as inputs to a Hierarchical Bilinear module, which captures the local inter-layer feature relationships of the feature maps and also predicts on three scales. Finally, the YOLOv3 and Hierarchical Bilinear predictions are fused. Experimental results show that the improved model adds only a small number of parameters to the original network, improves the detection accuracy of the original algorithm while maintaining detection efficiency, and to some extent outperforms current behavior detection algorithms.

Key words: human behavior detection; YOLOv3 algorithm; Hierarchical Bilinear-YOLOv3 network; feature extraction
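
The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and everything the abstract leaves unspecified is an assumption made for illustration: the cross-layer interaction is modeled as a Hadamard product of 1x1-projected feature maps, three layers per scale are paired with each other, the two branches are fused by simple averaging, the input is 416x416 (giving 13, 26, and 52 grids), and the channel counts and `NUM_CLASSES` are hypothetical.

```python
import numpy as np

NUM_CLASSES = 4                      # hypothetical number of behavior classes
OUT_CH = 3 * (5 + NUM_CLASSES)       # 3 anchors x (4 box + 1 objectness + classes)


def project(feat, weight):
    """1x1 convolution: (C_in, H, W) with weight (C_out, C_in) -> (C_out, H, W)."""
    return np.einsum("oc,chw->ohw", weight, feat)


def cross_layer_interaction(feat_a, feat_b, w_a, w_b):
    """Assumed bilinear interaction between two same-resolution layers:
    project both to a shared dimension, then multiply elementwise so each
    spatial location keeps its own local inter-layer relation."""
    return project(feat_a, w_a) * project(feat_b, w_b)


def hb_branch(feats, proj_ws, head_w):
    """Pair up three layers, concatenate the interactions along the channel
    axis, and map them to a prediction tensor with a 1x1 head."""
    pairs = [(0, 1), (0, 2), (1, 2)]
    inter = [
        cross_layer_interaction(feats[i], feats[j], proj_ws[i], proj_ws[j])
        for i, j in pairs
    ]
    return project(np.concatenate(inter, axis=0), head_w)


def fuse(yolo_pred, hb_pred):
    """Assumed fusion rule: average the two branch predictions."""
    return 0.5 * (yolo_pred + hb_pred)


def demo(seed=0, in_ch=8, shared_dim=16):
    """Run both branches on random features at three scales."""
    rng = np.random.default_rng(seed)
    fused = []
    for grid in (13, 26, 52):        # strides 32, 16, 8 on a 416x416 input
        feats = [rng.standard_normal((in_ch, grid, grid)) for _ in range(3)]
        proj_ws = [rng.standard_normal((shared_dim, in_ch)) for _ in range(3)]
        head_w = rng.standard_normal((OUT_CH, 3 * shared_dim))
        yolo_w = rng.standard_normal((OUT_CH, in_ch))
        yolo_pred = project(feats[-1], yolo_w)   # stand-in for the YOLOv3 head
        hb_pred = hb_branch(feats, proj_ws, head_w)
        fused.append(fuse(yolo_pred, hb_pred))
    return fused


if __name__ == "__main__":
    for pred in demo():
        print(pred.shape)
```

Because the interaction is an elementwise product rather than a full outer product, the extra branch costs only a few 1x1 projections per scale, which is consistent with the abstract's claim that few parameters are added.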

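Cite this article

LI Xiao-Tian, HUANG Jin, LI Jian-Bo, YANG Xu, QIN Ze-Yu, FU Guo-Dong. Human behavior detection based on improved YOLOv3. Computer Systems & Applications, 2021, 30(6): 197-202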

History

Received: 2019-12-16
Revised: 2020-01-14
Published online: 2021-06-05
