Fine-grained Recognition of Vehicles Based on Inverse Projection Space in Monocular Vision
Authors:
Funding:

Shaanxi Provincial Social Development Project (2019SF-258); Open Fund of the Transportation Development Research Center of the Inner Mongolia Autonomous Region (2019KFJJ-003); Science and Technology Project of the Shaanxi Provincial Department of Transportation (20-25K)
Abstract:

Most current vehicle recognition methods rely on deep learning: image data are fed directly into training to obtain a deep network for vehicle classification. Because images suffer from perspective distortion and scale variation, large amounts of data of different types must be used for training, and no vehicle-related physical information can be recovered. To address these problems, we propose a fine-grained vehicle recognition method trained in an inverse projection space. First, a refined three-dimensional bounding box is constructed for each vehicle under monocular projection, using calibration information and geometric constraints. Second, the bounding box is unfolded to obtain normalized and standardized data in the inverse projection space. Finally, a deep convolutional network is trained on these unfolded, normalized data for classification and regression, yielding fine-grained recognition results for five common vehicle types together with the corresponding physical sizes. Experimental results show that, compared with traditional end-to-end deep learning vehicle classifiers, the proposed method effectively improves classification accuracy while using less training data, and it simultaneously recovers the three-dimensional physical sizes of vehicles.
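The unfolding step described above — projecting the faces of a vehicle's 3D bounding box into a normalized, perspective-free inverse projection space — can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the camera matrix `P`, the axis-aligned box parametrization, the example car dimensions, and the 128×64 patch size are all assumptions made for the example.

```python
import numpy as np

def project_points(P, pts3d):
    """Project Nx3 world points with a 3x4 camera matrix P into Nx2 pixels."""
    pts_h = np.hstack([pts3d, np.ones((len(pts3d), 1))])  # homogeneous coords
    uvw = pts_h @ P.T
    return uvw[:, :2] / uvw[:, 2:3]                       # perspective divide

def box_corners(base, l, w, h):
    """8 corners of an axis-aligned 3D box with base corner `base` (x, y, z)."""
    x, y, z = base
    return np.array([[x + dx * l, y + dy * w, z + dz * h]
                     for dz in (0, 1) for dy in (0, 1) for dx in (0, 1)])

def homography(src, dst):
    """DLT from 4 point pairs: 3x3 H such that dst ~ H @ src (up to scale)."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    _, _, Vt = np.linalg.svd(np.array(A))
    return Vt[-1].reshape(3, 3)                           # null-space vector

# Example: unfold one side face of an assumed 4.5 m x 1.8 m x 1.5 m vehicle
# into a fixed 128x64 patch (the normalized inverse-projection view).
P = np.array([[800., 0., 640., 0.],
              [0., 800., 360., 800.],
              [0., 0., 1., 10.]])                 # assumed calibrated camera
corners = box_corners((0.0, 0.0, 0.0), 4.5, 1.8, 1.5)
face = project_points(P, corners[[0, 1, 5, 4]])   # one face quad in the image
patch = np.array([[0, 0], [128, 0], [128, 64], [0, 64]], float)
H = homography(face, patch)   # warping the image with H rectifies the face
```

Warping each visible face with its homography removes the perspective distortion and scale variation mentioned above, so the classifier sees vehicles at a canonical size regardless of camera viewpoint.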

Cite this article:

Wang W, Tang XY, Tian SW, Mei ZT. Fine-grained recognition of vehicles based on inverse projection space in monocular vision. Computer Systems & Applications (计算机系统应用), 2022, 31(2): 22–30.

History
  • Received: 2020-12-04
  • Revised: 2021-01-14
  • Published online: 2022-01-28