Excavator Detection and Working State Discrimination Based on Yolopose
Authors: 黄健, 赵小飞, 王虎, 胡其胜
Funding: Key Research and Development Program of Shaanxi Province (2023-YBGY-255)

Abstract:

The surroundings of underground infrastructure such as optical cables and high-pressure oil and gas pipelines are vulnerable to intrusion by excavators. This study proposes an excavator detection and working state discrimination method that combines Yolopose and a multilayer perceptron (MLP). First, Yolopose-ex, a Yolopose-based network that extracts a six-keypoint pose of the excavator, is designed. Second, the Yolopose-ex model is used to extract changes in the excavator's working pose across video frames, and a working state feature vector (MSV) of the excavator is constructed from them. Finally, an MLP analyzes the working state of the excavator in the video. Experimental results show that the proposed method overcomes the difficulty of recognition against complex backgrounds, reaches 96.6% accuracy in identifying the excavator's working state, and offers high inference speed and good generalization ability.
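
The abstract outlines a three-stage pipeline: Yolopose-ex extracts a six-keypoint excavator pose per frame, the pose changes over a clip are summarized into the working state feature vector (MSV), and an MLP classifies the clip's working state. The sketch below illustrates one plausible shape of that pipeline; the keypoint extractor is stubbed out, and the MSV layout, hidden-layer sizes, and function names are assumptions for illustration, not the paper's actual definitions.

```python
# Hypothetical sketch of the pipeline described in the abstract: Yolopose-ex
# keypoints -> working state feature vector (MSV) -> MLP classifier.
# Feature layout and model sizes are illustrative guesses, not the paper's.
import numpy as np
from sklearn.neural_network import MLPClassifier

NUM_KEYPOINTS = 6  # the abstract specifies a 6-point excavator pose

def extract_keypoints(frame) -> np.ndarray:
    """Placeholder for Yolopose-ex inference: return a (6, 2) array of
    keypoint pixel coordinates for the excavator detected in this frame."""
    raise NotImplementedError("run the trained Yolopose-ex model here")

def build_msv(keypoint_seq: np.ndarray) -> np.ndarray:
    """Summarize a (T, 6, 2) keypoint sequence into a fixed-length MSV
    from frame-to-frame pose change (one possible construction)."""
    diffs = np.diff(keypoint_seq, axis=0)    # (T-1, 6, 2) per-frame motion
    motion = np.linalg.norm(diffs, axis=-1)  # (T-1, 6) displacement per keypoint
    # Mean, std, and max displacement per keypoint -> an 18-dim vector.
    return np.concatenate([motion.mean(axis=0),
                           motion.std(axis=0),
                           motion.max(axis=0)])

# Classify MSVs with a small MLP (the working/idle label set is illustrative).
clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500)
# clf.fit(msv_train, labels_train)
# state = clf.predict(build_msv(keypoint_seq)[None, :])
```

A design point worth noting: building the MSV from motion statistics rather than raw coordinates makes the classifier insensitive to where the excavator sits in the frame, which is consistent with the abstract's claim of robustness against complex backgrounds.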

Cite this article: 黄健, 赵小飞, 王虎, 胡其胜. 基于Yolopose的挖掘机检测与工作状态识别. 计算机系统应用, 2024, 33(2): 299-307.
History
  • Received: 2023-07-12
  • Revised: 2023-08-11
  • Published online: 2024-01-02
  • Published in issue: 2024-02-05