Computer Systems & Applications  2020, Vol. 29 Issue (9): 149-155

SSD Object Detection Algorithm with Feature Enhancement of Receptive Field
TAN Long, GAO Ang
School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
Foundation item: General Program of National Natural Science Foundation of China (81373537); General Program of Natural Science Foundation of Heilongjiang Province, China (F201434)
Abstract: The SSD (Single Shot MultiBox Detector) algorithm detects multi-scale objects on feature maps from different layers and is both fast and accurate. However, the feature pyramid detection scheme of the traditional SSD algorithm makes it difficult to fuse features of different scales, and the shallow convolutional layers carry weak semantic information, which hinders the recognition of small objects. This paper therefore proposes RF_SSD, a novel object detection algorithm based on the network structure of the SSD algorithm. The algorithm fuses feature maps of different layers and scales in a lightweight way and generates new feature maps through down-sampling layers. A receptive field module is introduced to improve the feature extraction ability of the network and to enhance the representational power and robustness of the features. Compared with the traditional SSD algorithm, the proposed algorithm is significantly more accurate while still running in real time. Experimental results show an accuracy of 80.2% at a detection speed of 44.5 FPS on the PASCAL VOC test set.
Key words: SSD algorithm     object detection     convolutional neural network     receptive field     computer vision

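(1) A novel, lightweight feature fusion method is proposed. It merges feature maps from different layers and generates a feature pyramid, which lowers the probability of repeatedly detecting several parts of one object or of merging several objects into a single detection, and it also performs better on small objects.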

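(2) Drawing on hybrid dilated convolution and the Inception structure, a receptive field module is designed and added to strengthen the feature extraction ability of the network. It enlarges the convolutional receptive field without adding convolution parameters and strengthens the deep features learned by the lightweight convolutional neural network, which keeps the detector real-time.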

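(3) Qualitative and quantitative experiments are carried out on the PASCAL VOC dataset. The results show that, compared with the traditional SSD algorithm, the proposed algorithm significantly improves object detection performance and raises small-object accuracy at a relatively small cost in speed.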
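
Contribution (2) rests on a standard property of dilated (atrous) convolution: the spatial extent covered by a kernel grows with the dilation rate while the number of weights stays fixed. The sketch below only illustrates that arithmetic. The function names, the 3x3 kernel, the 256 channels, and the dilation rates are illustrative assumptions and are not taken from the paper's receptive field module.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Spatial extent covered by one dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def conv_weight_count(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weights of a square 2-D convolution (bias ignored).

    The dilation rate does not appear here: it spreads the taps apart
    but does not add any.
    """
    return kernel_size * kernel_size * in_channels * out_channels


if __name__ == "__main__":
    # Hypothetical settings chosen only for illustration.
    for dilation in (1, 2, 3, 5):
        extent = effective_kernel_size(3, dilation)
        weights = conv_weight_count(3, 256, 256)
        print(f"dilation={dilation}: covers {extent}x{extent}, weights={weights}")
```

A 3x3 kernel with dilation 3 spans the same 7x7 window as a dense 7x7 kernel while keeping 9 taps per channel pair instead of 49.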

1 Related Work

2 The RF_SSD Algorithm

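SSD detects objects on feature maps of different scales. It uses VGG16[21] as the backbone network and generates feature maps of different scales through cascaded convolutions. Combining the regression idea of YOLO with the anchor mechanism of Faster-RCNN, it regresses on multi-scale regional features at every position of the whole image, which preserves both detection speed and accuracy. When predicting on a feature map, it uses convolution kernels to predict the class scores and coordinate offsets of a set of default bounding boxes.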
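
To make the prediction step concrete, the sketch below counts how many values the convolutional predictors emit. It assumes the SSD300 configuration reported in the original SSD paper (six feature maps of sizes 38, 19, 10, 5, 3, 1 with 4 or 6 default boxes per location); this section does not state those numbers, and RF_SSD may use different ones. The function and constant names are hypothetical.

```python
from typing import Sequence


def outputs_per_location(num_classes: int, boxes_per_location: int) -> int:
    """Each default box receives num_classes scores plus 4 coordinate offsets."""
    return boxes_per_location * (num_classes + 4)


def total_default_boxes(map_sizes: Sequence[int], boxes_per_location: Sequence[int]) -> int:
    """Sum of default boxes over all square feature maps."""
    return sum(size * size * k for size, k in zip(map_sizes, boxes_per_location))


# Assumed values from the original SSD300 setup, not from this paper.
SSD300_MAP_SIZES = (38, 19, 10, 5, 3, 1)
SSD300_BOXES_PER_LOCATION = (4, 6, 6, 6, 4, 4)

if __name__ == "__main__":
    # 20 PASCAL VOC categories plus one background class.
    print(outputs_per_location(num_classes=21, boxes_per_location=6))
    print(total_default_boxes(SSD300_MAP_SIZES, SSD300_BOXES_PER_LOCATION))
```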

 Fig. 1 Structure of the SSD algorithm

2.1 Feature Fusion

 Fig. 2 Feature fusion module

2.2 Receptive Field Module

 Fig. 3 RFM module

2.3 Algorithm Structure

 Fig. 4 Structure of the proposed algorithm

 $L(x,c,l,g) = \frac{1}{N}\left( L_{\rm conf}(x,c) + \alpha L_{\rm loc}(x,l,g) \right)$ (1)

 $L_{\rm loc}(x,l,g) = \sum\limits_{i \in Pos}^{N} \sum\limits_{m \in \{cx,cy,w,h\}} x_{ij}^{k}\, {\rm smooth}_{l1}\left( l_i^m - \hat g_j^m \right)$ (2)
 $\hat g_j^{cx} = \left( g_j^{cx} - d_i^{cx} \right) / d_i^{w}$ (3)
 $\hat g_j^{cy} = \left( g_j^{cy} - d_i^{cy} \right) / d_i^{h}$ (4)
 $\hat g_j^{w} = \log\left( \frac{g_j^{w}}{d_i^{w}} \right)$ (5)
 $\hat g_j^{h} = \log\left( \frac{g_j^{h}}{d_i^{h}} \right)$ (6)
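
Equations (3) to (6) can be read as an encoding of a ground-truth box $g_j$ relative to a default box $d_i$. The following is a minimal sketch of that encoding and its inverse, assuming boxes are given as center coordinates plus width and height. The `Box` type and the function names are hypothetical, and the variance scaling used by some SSD implementations is deliberately left out because it does not appear in the equations.

```python
import math
from typing import NamedTuple


class Box(NamedTuple):
    cx: float  # center x
    cy: float  # center y
    w: float   # width
    h: float   # height


def encode(gt: Box, default: Box) -> Box:
    """Regression targets of Eqs. (3)-(6) for ground truth gt and default box."""
    return Box(
        cx=(gt.cx - default.cx) / default.w,
        cy=(gt.cy - default.cy) / default.h,
        w=math.log(gt.w / default.w),
        h=math.log(gt.h / default.h),
    )


def decode(offsets: Box, default: Box) -> Box:
    """Inverse of encode: recover a box from offsets and its default box."""
    return Box(
        cx=offsets.cx * default.w + default.cx,
        cy=offsets.cy * default.h + default.cy,
        w=default.w * math.exp(offsets.w),
        h=default.h * math.exp(offsets.h),
    )


if __name__ == "__main__":
    default_box = Box(0.5, 0.5, 0.2, 0.2)   # illustrative values
    ground_truth = Box(0.6, 0.5, 0.4, 0.2)
    print(encode(ground_truth, default_box))
```

Dividing the center shift by the default box size and taking the logarithm of the size ratio makes the targets independent of the absolute scale of the default box.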

 ${\rm smooth}_{l1}(x) = \begin{cases} 0.5x^2, & {\rm if}\ |x| < 1 \\ |x| - 0.5, & {\rm otherwise} \end{cases}$ (7)
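
As a small illustration of Eq. (7), the sketch below implements the piecewise function and sums it over the four box coordinates for one matched pair of predicted and encoded offsets, which is the inner sum of Eq. (2). The function names and the sample numbers are hypothetical.

```python
from typing import Sequence


def smooth_l1(x: float) -> float:
    """Eq. (7): quadratic near zero, linear for |x| >= 1."""
    if abs(x) < 1.0:
        return 0.5 * x * x
    return abs(x) - 0.5


def matched_pair_loc_loss(predicted: Sequence[float], target: Sequence[float]) -> float:
    """Inner sum of Eq. (2) over (cx, cy, w, h) for one matched pair."""
    return sum(smooth_l1(p - t) for p, t in zip(predicted, target))


if __name__ == "__main__":
    # Illustrative offsets only.
    print(matched_pair_loc_loss([0.5, 0.0, 2.0, 0.0], [0.0, 0.0, 0.0, 0.0]))
```

The two branches meet at $|x| = 1$ with value 0.5, so the loss is continuous while large errors contribute only linearly.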

 $L_{\rm conf}(x,c) = -\sum\limits_{i \in Pos}^{N} x_{ij}^{p} \log\left( \hat c_i^p \right) - \sum\limits_{i \in Neg} \log\left( \hat c_i^0 \right), \quad {\rm where}\ \hat c_i^p = \frac{\exp\left( c_i^p \right)}{\sum\nolimits_p \exp\left( c_i^p \right)}$ (8)
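
The confidence loss of Eq. (8) and its combination with the localization loss in Eq. (1) can be sketched as follows. This is an illustration under stated assumptions, not the authors' training code: class index 0 is taken to be background, each positive box is supplied together with the class index of its matched ground truth, the default `alpha=1.0` and the convention of returning 0 when no box is matched follow the original SSD paper rather than this section, and all function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def softmax(scores: Sequence[float]) -> list:
    """Softmax of Eq. (8); the maximum is subtracted for numerical stability."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_loss(
    positives: Sequence[Tuple[Sequence[float], int]],
    negatives: Sequence[Sequence[float]],
) -> float:
    """Eq. (8): cross-entropy on matched boxes plus background term on negatives."""
    loss = 0.0
    for scores, label in positives:
        loss -= math.log(softmax(scores)[label])
    for scores in negatives:
        loss -= math.log(softmax(scores)[0])  # index 0 assumed to be background
    return loss


def total_loss(conf: float, loc: float, num_matched: int, alpha: float = 1.0) -> float:
    """Eq. (1): weighted sum normalized by the number of matched default boxes N."""
    if num_matched == 0:
        return 0.0  # convention assumed from the original SSD paper
    return (conf + alpha * loc) / num_matched


if __name__ == "__main__":
    # One positive box labelled class 1 and one negative box, both with uniform scores.
    conf = confidence_loss([([0.0, 0.0], 1)], [[0.0, 0.0]])
    print(total_loss(conf, loc=0.25, num_matched=1))
```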

3 Experimental Analysis

3.1 Data Augmentation

3.2 Network Training Strategy

3.3 Analysis of Test Results on PASCAL VOC2007

PASCAL VOC is a standard dataset for object classification, recognition, and detection. It contains 20 categories; Table 1 lists the PASCAL VOC categories.

 Fig. 5 Distribution of detection speed and accuracy for different detection algorithms

4 Conclusion

 Fig. 6 Example detection results on COCO 2017

