Abstract: Steel bars are an indispensable structural material in the infrastructure industry, and accurate counting of steel bars is essential both in steel-bar production and on construction sites. Counting the bars in a bundle is difficult because the end faces are densely packed, vary in diameter, adhere to one another at their boundaries, blend into the background, and are sometimes occluded. To address these problems, this study proposes an improved YOLOv5 framework that reduces the missed detection rate and the false detection rate for dense small targets. Because steel-bar end-face data are scarce, no large public dataset exists in this field, and the end faces have weak features, we built a steel-bar end-face dataset, using a semi-automatic labeling method for annotation and a data augmentation algorithm for dataset expansion. Moreover, the backbone network of YOLOv5 was modified, and a spatial pyramid pooling (SPP) module and a small-target detection layer were added to obtain larger feature maps. A feature pyramid network (FPN) and a path aggregation network (PAN) were used to fuse multi-scale feature maps and improve the accuracy of dense small-target detection. Several groups of controlled experiments were designed on the Data Fountain steel-bar stocktaking competition dataset and the self-built steel-bar dataset. The experimental results show that the improved model, YOLOv5-P2, performs best on steel-bar end-face detection, with a mean average precision (mAP) of 99.9% for the steel-bar end face. Compared with the mainstream algorithms YOLOv3, YOLOv4, ScaledYOLOv4, and YOLOv5, the proposed model increases mAP by 9.6%, 7.9%, 7.0%, and 1.1%, respectively. When tested in a real factory environment, the model performed stably, and its detection accuracy on the test dataset was 2.1% higher than that of the original model.
Modifying the position of the SPP module in the YOLOv5 backbone network and adding a detection layer both significantly improve the detection accuracy for dense small targets, yielding better edge feature extraction for the steel-bar end face and an mAP of 99.9%.
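To illustrate why an extra small-target detection layer yields larger feature maps, the following minimal sketch compares the output grids of a three-head and a four-head detector. It rests on assumptions not stated in the abstract: a square 640-pixel input, the stock YOLOv5 head strides of 8, 16, and 32, and a stride of 4 for the added layer (the usual meaning of "P2"). The function names and the 20-pixel end-face diameter are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: grid sizes of YOLO-style detection heads.
# Assumptions (not from the abstract): 640x640 input, head strides 8/16/32
# for the unmodified model, and stride 4 for the added "P2" layer.

INPUT_SIZE = 640                     # assumed square input resolution
BASE_STRIDES = (8, 16, 32)           # P3, P4, P5 heads
P2_STRIDES = (4, 8, 16, 32)          # same heads plus an assumed stride-4 P2 head


def grid_sizes(input_size, strides):
    """Map each stride to (cells per side, total cells) of its feature map."""
    sizes = {}
    for stride in strides:
        if input_size % stride != 0:
            raise ValueError(f"input size {input_size} is not divisible by stride {stride}")
        side = input_size // stride
        sizes[stride] = (side, side * side)
    return sizes


def cells_spanned(diameter_px, stride):
    """Number of grid cells a target of the given pixel diameter spans along one axis."""
    return diameter_px / stride


for name, strides in (("base", BASE_STRIDES), ("with P2", P2_STRIDES)):
    print(name)
    for stride, (side, total) in grid_sizes(INPUT_SIZE, strides).items():
        # A hypothetical 20-pixel end face covers more cells on finer grids.
        span = cells_spanned(20, stride)
        print(f"  stride {stride:>2}: {side}x{side} grid, {total} cells, "
              f"20 px target spans {span:.2f} cells")
```

Under these assumptions, the stride-4 head has four times as many cells as the finest head of the unmodified model, so each small, densely packed end face is described by more grid cells before the FPN and PAN fuse the scales.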