Human Behavior Detection Based on Improved YOLOv3
Authors: Li Xiaotian, Huang Jin, Li Jianbo, Yang Xu, Qin Zeyu, Fu Guodong
Funding: Chengdu Science and Technology Bureau Project (2018-YF05-01424-GX)

    Abstract:

    Human behavior detection is difficult because instances of the same behavior vary widely, different behaviors can look alike, and viewing angle, occlusion, and the lack of real-time detection add further problems. To address these issues, this study proposes Hierarchical Bilinear-YOLOv3, a network for human behavior detection. The network uses YOLOv3 to predict at three scales. Selected layers of the YOLOv3 feature pyramid network are fed to a Hierarchical Bilinear module, which captures local feature relationships between layers of the feature maps and also predicts at three scales. Finally, the YOLOv3 and Hierarchical Bilinear predictions are fused. Experimental results show that the improved model adds only a small number of parameters to the original network, improves the detection accuracy of the original algorithm while maintaining detection efficiency, and to some extent outperforms current behavior detection algorithms.
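
The cross-layer interaction described in the abstract can be illustrated with a minimal NumPy sketch at a single scale. It is not the authors' implementation: the abstract does not state which layers are selected, how large the projections are, or how the two predictions are fused. The sketch therefore assumes three same-sized feature maps, a low-rank bilinear interaction computed as a Hadamard product of 1x1 projections, and a weighted average as the fusion rule. All names (`project`, `cross_layer_interaction`, `hierarchical_bilinear`, `fuse`) and all sizes are hypothetical.

```python
import numpy as np

def project(feat, weight):
    """1x1 convolution: map (C, H, W) features to (D, H, W) with a (D, C) weight."""
    return np.einsum("dc,chw->dhw", weight, feat)

def cross_layer_interaction(feat_a, feat_b, w_a, w_b):
    """Low-rank bilinear interaction of two same-sized feature maps.

    Both maps are projected to a shared D-dimensional space and multiplied
    element-wise (Hadamard product), so each grid cell keeps its own
    interaction vector instead of being pooled away.
    """
    return project(feat_a, w_a) * project(feat_b, w_b)

def hierarchical_bilinear(feats, weights):
    """Concatenate the interactions of every pair of layers along channels."""
    pairs = []
    for i in range(len(feats)):
        for j in range(i + 1, len(feats)):
            pairs.append(
                cross_layer_interaction(feats[i], feats[j], weights[i], weights[j])
            )
    return np.concatenate(pairs, axis=0)

def fuse(pred_yolo, pred_hb, alpha=0.5):
    """Assumed fusion rule (not from the paper): weighted average."""
    return alpha * pred_yolo + (1.0 - alpha) * pred_hb

rng = np.random.default_rng(0)
C, D, H, W = 8, 16, 13, 13  # hypothetical channel, projection, and grid sizes
feats = [rng.standard_normal((C, H, W)) for _ in range(3)]
weights = [rng.standard_normal((D, C)) for _ in range(3)]

hb_features = hierarchical_bilinear(feats, weights)
print(hb_features.shape)  # (48, 13, 13): 3 layer pairs x 16 channels

head = rng.standard_normal((5, hb_features.shape[0]))  # hypothetical prediction head
pred_hb = project(hb_features, head)
pred_yolo = rng.standard_normal(pred_hb.shape)  # stand-in for the YOLOv3 output
fused = fuse(pred_yolo, pred_hb)
print(fused.shape)  # (5, 13, 13)
```

In a full detector the same steps would be repeated at each of the three prediction scales, with learned rather than random weights.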

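Cite this article

Li Xiaotian, Huang Jin, Li Jianbo, Yang Xu, Qin Zeyu, Fu Guodong. Human behavior detection based on improved YOLOv3. Computer Systems & Applications (计算机系统应用), 2021, 30(6): 197-202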

History
  • Received: 2019-12-16
  • Last revised: 2020-01-14
  • Published online: 2021-06-05