Abstract: In autonomous driving, YOLOv5 is widely applied to target detection, and its performance is significantly better than that of previous versions. However, its detection accuracy is still low when the vehicle is running at high speed. This study proposes a vehicle-side target detection method based on an improved YOLOv5. To avoid manually designing the initial anchor box sizes when training on different datasets, an adaptive anchor box calculation is introduced. In addition, a squeeze-and-excitation (SE) module is added to the backbone network to screen feature information by channel and improve the feature expression ability. To improve the accuracy of detecting objects of different sizes, an attention mechanism is integrated with the detection network: the convolutional block attention module (CBAM) is added to the Neck part. As a result, the model can focus on important features when detecting objects of different sizes, and its feature extraction ability is improved. The spatial pyramid pooling (SPP) module is used in the backbone network so that the model can accept input images of any aspect ratio and size. For the activation function, Hardswish is applied after the convolution operations throughout the network model. For the loss function, CIoU is used as the detection box regression loss to address the low positioning accuracy and slow regression of the target detection box during training. Experimental results on the KITTI 2D dataset show that the improved detection model increases the precision, the recall rate, and the mean average precision (mAP) of target detection by 2.5%, 5.1%, and 2.3%, respectively.
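To make the CIoU regression loss mentioned above more concrete, the following is a minimal sketch of the commonly cited formulation, L = 1 - IoU + ρ²/c² + αv, for a single pair of axis-aligned boxes. It is an illustration under stated assumptions and not the implementation used in this study: the function name `ciou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` stabilizer are all hypothetical choices made for the example.

```python
import math


def ciou_loss(pred, target, eps=1e-9):
    """CIoU loss for two boxes given as (x1, y1, x2, y2).

    Hypothetical helper for illustration only; assumes x2 > x1 and y2 > y1.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Plain IoU: overlap area divided by union area.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = pw * ph + tw * th - inter + eps
    iou = inter / union

    # Squared diagonal of the smallest box enclosing both boxes (c^2).
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Squared distance between the two box centers (rho^2).
    rho2 = ((px1 + px2) / 2 - (tx1 + tx2) / 2) ** 2 \
         + ((py1 + py2) / 2 - (ty1 + ty2) / 2) ** 2

    # Aspect-ratio consistency term v and its trade-off weight alpha.
    v = (4 / math.pi ** 2) * (math.atan(tw / (th + eps))
                              - math.atan(pw / (ph + eps))) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v


if __name__ == "__main__":
    # Non-overlapping boxes still receive a distance-dependent penalty.
    print(ciou_loss((0, 0, 1, 1), (2, 2, 3, 3)))
```

Unlike a plain IoU loss, which is constant at 1 for any two non-overlapping boxes, the center-distance term gives a useful signal even when the boxes do not overlap, and the aspect-ratio term penalizes shape mismatch. This is the usual explanation for why CIoU can speed up box regression.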