Received: January 24, 2022; Revised: February 22, 2022
Abstract: In autonomous driving scenarios, YOLOv5 delivers clearly better target detection performance than earlier versions, but its detection accuracy is still insufficient when it runs at high speed. This study proposes a vehicle-side target detection method based on an improved YOLOv5. To avoid manually designing the initial anchor box sizes for each training dataset, adaptive anchor box calculation is introduced. A squeeze and excitation (SE) module is added to the backbone network to select channel-wise feature information and improve feature expression. To improve accuracy on objects of different sizes, the attention mechanism is fused with the detection network: the convolutional block attention module (CBAM) is integrated into the Neck, so that the model focuses on the important features when detecting objects of different sizes and its feature extraction ability improves. A spatial pyramid pooling (SPP) module is used in the backbone network so that the model accepts input images of arbitrary size and aspect ratio. For the activation function, Hardswish is applied after each convolution operation throughout the network. For the loss function, CIoU is used as the bounding box regression loss to address low localization accuracy and slow box regression during training. Experiments on the KITTI 2D dataset show that the improved model increases the precision of target detection by 2.5%, the recall by 5.1%, and the mean average precision (mAP) by 2.3%.
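For readers unfamiliar with the two named building blocks, the following is a minimal illustrative sketch of the Hardswish activation and the CIoU regression loss, using their standard published definitions. It is not the authors' implementation: the function names `hardswish` and `ciou_loss`, the `(x1, y1, x2, y2)` corner box format, and the `eps` stabilizer are assumptions made for this example.

```python
import math


def hardswish(x: float) -> float:
    """Hardswish: x * ReLU6(x + 3) / 6 (standard definition)."""
    return x * min(max(x + 3.0, 0.0), 6.0) / 6.0


def ciou_loss(box, gt, eps: float = 1e-9) -> float:
    """CIoU loss for two boxes given as (x1, y1, x2, y2) corners (assumed format)."""
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union.
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    union = w * h + gw * gh - inter + eps
    iou = inter / union

    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Squared distance between the two box centers.
    rho2 = ((x1 + x2) - (gx1 + gx2)) ** 2 / 4 + ((y1 + y2) - (gy1 + gy2)) ** 2 / 4

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(gw / (gh + eps)) - math.atan(w / (h + eps))) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - (iou - rho2 / c2 - alpha * v)


if __name__ == "__main__":
    print(hardswish(1.0))
    print(ciou_loss((0, 0, 2, 2), (1, 1, 3, 3)))
```

Unlike plain IoU loss, the center-distance and aspect-ratio terms still provide a useful signal when the predicted and ground-truth boxes do not overlap, which is the usual motivation for CIoU in box regression.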
Keywords: target detection; YOLOv5; squeeze and excitation (SE); attention mechanism; convolutional block attention module (CBAM); activation function; Hardswish
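Funding: Guizhou Provincial Science and Technology Major Project (Qiankehe, ZNWLQC[2019]3012-1); Guizhou Provincial Science and Technology Support Program (Qiankehe, [2021] General 297)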
Citation:
LI Guo-Pu, CHEN Sheng-Dong, WANG Liang, ZOU Kai, YUAN Feng. Vehicle-side Target Detection Based on Improved YOLOv5. Computer Systems Applications, 2022, 31(12): 127-134