Safety fences play an important role at power construction sites. However, fence-crossing violations are widespread and create serious safety hazards on site. To support intelligent supervision, this study proposes a Faster R-CNN-based method for detecting fence crossing that combines object detection with the idea of the frame difference method. The method first obtains the fence location and human keypoints by object detection on frames captured from the video, and then recognizes violations at the construction site with the frame difference method. Experimental results show that the method can effectively detect fence-crossing violations at construction sites and meets real-time requirements.
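The abstract does not give the decision rule itself, so the following is only a minimal sketch of how detections from consecutive frames might be compared in the spirit of frame differencing. It assumes the detector has already produced a fence bounding box and a list of human keypoints per frame. The names `inside_fraction` and `detect_crossing`, the box format, and the `threshold` value are hypothetical choices for illustration, not the authors' implementation.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed format


def inside_fraction(keypoints: List[Point], fence: Box) -> float:
    """Fraction of a person's keypoints that fall inside the fence box."""
    if not keypoints:
        return 0.0
    x_min, y_min, x_max, y_max = fence
    hits = sum(
        1 for x, y in keypoints
        if x_min <= x <= x_max and y_min <= y <= y_max
    )
    return hits / len(keypoints)


def detect_crossing(
    frames: List[List[Point]], fence: Box, threshold: float = 0.5
) -> List[int]:
    """Flag frame indices where the keypoint/fence overlap changes sharply.

    Hypothetical rule: compare each frame with the previous one and flag it
    when the inside fraction differs by at least `threshold`.
    """
    flagged = []
    for i in range(1, len(frames)):
        previous = inside_fraction(frames[i - 1], fence)
        current = inside_fraction(frames[i], fence)
        if abs(current - previous) >= threshold:
            flagged.append(i)
    return flagged


# Illustrative data only: one person, two keypoints per frame.
fence_box = (4.0, 4.0, 10.0, 10.0)
keypoint_frames = [
    [(0.0, 0.0), (1.0, 1.0)],  # fully outside the fence box
    [(5.0, 5.0), (6.0, 6.0)],  # fully inside the fence box
    [(5.0, 5.0), (6.0, 6.0)],  # unchanged from the previous frame
]
flagged_frames = detect_crossing(keypoint_frames, fence_box)
```

In this toy input only the transition between the first and second frames changes the overlap, so only that frame is flagged. A practical system would also need to track identities across frames and tune the threshold on real footage.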