Abstract: To improve the accuracy of surface defect detection for industrial hot-rolled strip steel, state-of-the-art deep learning techniques are applied to the task. A surface defect detection algorithm for hot-rolled strip steel is proposed that uses Swin Transformer as the backbone feature extraction network and a cascaded multi-threshold structure as the output layer. Compared with deep learning object detection algorithms based solely on convolutional networks, a detection algorithm built on the Transformer structure can achieve more accurate detection results. Specifically, first, Swin Transformer replaces the conventional residual network as the backbone feature extraction network, enhancing the ability of the feature network to capture the deep semantic information implicit in an image. Second, a multi-stage cascade detection structure is designed in which the IoU threshold is raised step by step, so that detection accuracy does not degrade as the threshold increases. Finally, strategies such as soft non-maximum suppression (Soft-NMS), FP16 mixed-precision training, and the SGD optimizer are employed to accelerate model convergence and improve model performance. The experimental results show that the proposed algorithm achieves better detection performance on the industrial hot-rolled strip steel dataset (NEU-DET) than deep learning algorithms such as YOLOv3, YOLOF, Deformable DETR, SSD512, and SSDLite. Additionally, training speed and detection accuracy are significantly improved for crazing (Cr), inclusion (In), patches (Pa), pitted surface (PS), rolled-in scale (RS), scratches (Sc), and other surface defects, and the missed detection rate is greatly reduced.
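The two detection-head ideas named in the abstract, step-by-step IoU thresholds and Soft-NMS, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names (`iou`, `stage_labels`, `soft_nms`), the stage thresholds (0.5, 0.6, 0.7), and the Gaussian decay parameter `sigma` are assumptions chosen for illustration, since the abstract does not state the values used.

```python
import math

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def stage_labels(proposal, gt_box, thresholds=(0.5, 0.6, 0.7)):
    """Per cascade stage: does the proposal count as a positive sample?

    The thresholds are hypothetical; each later stage demands a higher IoU.
    """
    overlap = iou(proposal, gt_box)
    return [overlap >= t for t in thresholds]

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: decay the scores of overlapping boxes instead of
    deleting them. Returns (index, score) pairs in selection order."""
    scores = list(scores)                 # work on a copy
    remaining = list(range(len(boxes)))
    kept = []
    while remaining:
        best = max(remaining, key=lambda i: scores[i])
        remaining.remove(best)
        kept.append((best, scores[best]))
        survivors = []
        for i in remaining:
            # Higher overlap with the selected box means stronger decay.
            scores[i] *= math.exp(-(iou(boxes[best], boxes[i]) ** 2) / sigma)
            if scores[i] >= score_thresh:
                survivors.append(i)
        remaining = survivors
    return kept

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
result = soft_nms(boxes, scores)
```

In this toy input the second box overlaps the first heavily, so its score is decayed and it is ranked after the non-overlapping third box rather than being discarded outright. This is the behavior that can help reduce missed detections when defects lie close together.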