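SSD Object Detection Algorithm with Feature Enhancement of Receptive Field
Authors: TAN Long, GAO Ang
Funding: General Program of the National Natural Science Foundation of China (81373537); General Program of the Natural Science Foundation of Heilongjiang Province (F201434)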
    Abstract:

    The SSD (Single Shot MultiBox Detector) algorithm detects objects at multiple scales on feature maps drawn from different layers, and it is both fast and accurate. However, the feature pyramid detection scheme of the traditional SSD algorithm has difficulty fusing features of different scales, and the shallow convolutional layers carry weak semantic information, which hinders the recognition of small objects. This paper therefore proposes RF_SSD, a novel object detection algorithm built on the SSD network structure. The algorithm fuses feature maps from different layers and scales in a lightweight way and generates new feature maps through downsampling layers. A receptive field module is introduced to improve the feature extraction ability of the network and to strengthen the representational power and robustness of the features. Compared with the traditional SSD algorithm, the proposed algorithm is markedly more accurate while retaining real-time detection speed. Experiments on the PASCAL VOC test set show an accuracy of 80.2% and a detection speed of 44.5 FPS.
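The abstract does not describe the internals of the receptive field module, so the following is only general background, not the authors' implementation. It is a minimal sketch of the standard arithmetic for how stacked convolutions, and dilated (atrous) convolutions in particular, enlarge the receptive field of a feature-map cell. The names `Conv` and `receptive_field` are hypothetical helpers introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conv:
    """One convolution layer, described along a single spatial axis (illustrative only)."""
    kernel: int
    stride: int = 1
    dilation: int = 1


def receptive_field(layers):
    """Return the receptive field size, in input pixels, of one output cell."""
    rf = 1    # before any layer, a cell sees exactly one input pixel
    jump = 1  # spacing, in input pixels, between neighbouring cells
    for layer in layers:
        # A dilated kernel spans dilation * (kernel - 1) + 1 positions.
        effective_kernel = layer.dilation * (layer.kernel - 1) + 1
        rf += (effective_kernel - 1) * jump
        jump *= layer.stride
    return rf


# Two plain 3x3 convolutions versus a 3x3 followed by a 3x3 with dilation 3.
print(receptive_field([Conv(3), Conv(3)]))              # 5
print(receptive_field([Conv(3), Conv(3, dilation=3)]))  # 9
```

With the same number of weights, the dilated stack covers a 9-pixel span instead of 5, which is the usual motivation for using dilation when a wider context is wanted at low cost.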

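Cite this article

TAN Long, GAO Ang. SSD Object Detection Algorithm with Feature Enhancement of Receptive Field. Computer Systems & Applications, 2020, 29(9): 149-155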

History
  • Received: 2019-11-11
  • Last revised: 2019-12-09
  • Published online: 2020-09-07
  • Publication date: 2020-09-15