Improved Lightweight Masked Face Detection Based on YOLOv5
Author: Ge Yunfei, Qi Yunsong, Meng Xiangyu
    Abstract:

    To address missed face detections, the limited computing power of mobile platforms, and the constrained hardware resources available to face recognition applications during epidemic prevention and control, this study proposes an improved lightweight model, based on YOLOv5, for detecting faces wearing masks. The C3 module of the original network is replaced with a lightweight C3Ghost module, which reduces both the computation of the convolution process and the size of the model. An attention mechanism is added to the backbone network to strengthen feature extraction, and the bounding-box regression loss function is modified to improve detection speed and accuracy. Experimental results show that the improved model reduces the amount of computation by 29.79% and the number of parameters by 33.33%, with a weight file of only 2.8 MB. The improved model depends less on the hardware environment, and its detection rate reaches 96.6%. Compared with existing models, it shows clear advantages and can be applied effectively to face recognition.
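
    The abstract does not spell out how a Ghost-style convolution saves computation, so the following is a minimal sketch of the parameter arithmetic rather than the authors' implementation. It assumes the commonly used layout in which a primary convolution produces half of the output channels and a cheap 5×5 depthwise convolution produces the other half; the function names, the 5×5 kernel, and the 64-to-128-channel example layer are illustrative assumptions, and bias and batch-norm parameters are ignored.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a 2-D convolution (bias and batch norm ignored)."""
    if c_in % groups or c_out % groups:
        raise ValueError("channels must be divisible by groups")
    return (c_in // groups) * c_out * k * k


def ghost_conv_params(c_in, c_out, k, cheap_k=5):
    """Weight count of a Ghost-style convolution (assumed layout).

    A primary k x k convolution produces half of the output channels;
    a depthwise cheap_k x cheap_k convolution derives the other half
    from those primary feature maps.
    """
    if c_out % 2:
        raise ValueError("c_out must be even for the half/half split")
    hidden = c_out // 2
    primary = conv_params(c_in, hidden, k)
    cheap = conv_params(hidden, hidden, cheap_k, groups=hidden)
    return primary + cheap


# Hypothetical layer: 64 input channels, 128 output channels, 3 x 3 kernel.
standard = conv_params(64, 128, 3)
ghost = ghost_conv_params(64, 128, 3)
saving = 1 - ghost / standard

print(f"standard conv weights: {standard}")
print(f"ghost conv weights:    {ghost}")
print(f"per-layer saving:      {saving:.1%}")
```

    For this single hypothetical layer the Ghost-style variant needs roughly half the weights. The model-level reductions reported in the abstract (29.79% and 33.33%) are smaller than that, which is expected when only part of the network (here, the C3 modules) is replaced.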

Get Citation

Ge Yunfei, Qi Yunsong, Meng Xiangyu. Improved Lightweight Masked Face Detection Based on YOLOv5. Computer Systems & Applications, 2023, 32(3): 195-201

History
  • Received: August 18, 2022
  • Revised: September 22, 2022
  • Online: December 16, 2022