Abstract: To address the poor performance of small-target detection for safety clothing in oilfield video surveillance, this paper proposes a standardized clothing detection method based on Cascade-YOLOv5 (C-YOLOv5), an improvement on YOLOv5. First, a small-target detection network that cascades YOLO-people and YOLO-dress is built, and the pedestrian targets are located. Each pedestrian region is then cropped and rescaled so that the pedestrian's safety clothing can be detected. To fully integrate shallow and deep feature information, four convolutional feature layers at different scales are used to predict the undetected targets. Finally, bounding boxes of different colors are drawn on the original image to mark the types of pedestrians and their clothing parts, indicating whether each pedestrian is dressed properly. Experimental results show that, compared with the original YOLOv5 algorithm, C-YOLOv5 meets the real-time requirement and improves detection mAP by 2.3 percentage points. In addition, fusing deep and shallow information effectively enhances the representational ability of the features and improves the detection accuracy for small targets.
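The crop, rescale, and mark-on-original steps described in the abstract imply a coordinate mapping between the rescaled pedestrian crop and the original frame. The following is a minimal sketch of that mapping only, not the paper's implementation. It assumes a plain resize of the crop to a square input (no letterbox padding); the function name `map_to_original`, the `Box` alias, and the input size of 640 are hypothetical choices made for illustration.

```python
from typing import Tuple

# A box is (x1, y1, x2, y2) in pixels, top-left and bottom-right corners.
Box = Tuple[float, float, float, float]


def map_to_original(person_box: Box, clothing_box: Box, input_size: int) -> Box:
    """Map a clothing box from the rescaled crop back to the original frame.

    person_box:   pedestrian box in original-image coordinates (first stage).
    clothing_box: clothing box in the coordinates of the crop after it was
                  resized to input_size x input_size (second stage).
    Assumption: the crop was stretched to a square without padding.
    """
    px1, py1, px2, py2 = person_box
    crop_w, crop_h = px2 - px1, py2 - py1
    if crop_w <= 0 or crop_h <= 0:
        raise ValueError("person_box must have positive width and height")

    cx1, cy1, cx2, cy2 = clothing_box
    # Undo the resize (scale back to crop pixels), then undo the crop (shift
    # by the top-left corner of the pedestrian box).
    return (
        px1 + cx1 * crop_w / input_size,
        py1 + cy1 * crop_h / input_size,
        px1 + cx2 * crop_w / input_size,
        py1 + cy2 * crop_h / input_size,
    )


if __name__ == "__main__":
    person = (100.0, 50.0, 200.0, 250.0)    # 100 x 200 pedestrian crop
    clothing = (64.0, 32.0, 320.0, 320.0)   # box inside the 640 x 640 crop
    print(map_to_original(person, clothing, 640))
```

If the second stage pads the crop to preserve its aspect ratio, the padding offsets would also have to be subtracted before the scale is undone.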