Fine-grained Recognition of Vehicles Based on Inverse Projection Space in Monocular Vision
Authors:
Funding:

Shaanxi Provincial Social Development Project (2019SF-258); Open Fund of the Transportation Development Research Center of the Inner Mongolia Autonomous Region (2019KFJJ-003); Science and Technology Project of the Shaanxi Provincial Department of Transportation (20-25K)
Abstract:

Most current vehicle recognition methods rely on deep learning: image data are fed directly into training to obtain a deep network for vehicle classification. Because images suffer from perspective distortion and scale variation, large amounts of data of different types must be used for training, and no vehicle-related physical information can be recovered. To address these problems, we propose a fine-grained vehicle recognition method trained in an inverse projection space. First, a refined three-dimensional bounding box is constructed for each vehicle under monocular projection, using calibration information and geometric constraints. Second, the bounding box is unfolded to obtain normalized and standardized data in the inverse projection space. Finally, a deep convolutional network is trained on these unfolded, normalized data for classification and regression, yielding fine-grained recognition results for five common vehicle types together with the corresponding physical sizes. Experimental results show that, compared with traditional end-to-end deep learning vehicle classifiers, the proposed method effectively improves classification accuracy while using less training data, and it simultaneously recovers the three-dimensional physical sizes of vehicles.
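The unfolding step described above — projecting the faces of a vehicle's 3D bounding box into a normalized, perspective-free inverse projection space — can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the camera matrix `P`, the axis-aligned box parametrization, the example car dimensions, and the 128×64 patch size are all assumptions made for the example.

```python
import numpy as np

def project_points(P, pts3d):
    """Project Nx3 world points with a 3x4 camera matrix P into Nx2 pixels."""
    pts_h = np.hstack([pts3d, np.ones((len(pts3d), 1))])  # homogeneous coords
    uvw = pts_h @ P.T
    return uvw[:, :2] / uvw[:, 2:3]                       # perspective divide

def box_corners(base, l, w, h):
    """8 corners of an axis-aligned 3D box with base corner `base` (x, y, z)."""
    x, y, z = base
    return np.array([[x + dx * l, y + dy * w, z + dz * h]
                     for dz in (0, 1) for dy in (0, 1) for dx in (0, 1)])

def homography(src, dst):
    """DLT from 4 point pairs: 3x3 H such that dst ~ H @ src (up to scale)."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    _, _, Vt = np.linalg.svd(np.array(A))
    return Vt[-1].reshape(3, 3)                           # null-space vector

# Example: unfold one side face of an assumed 4.5 m x 1.8 m x 1.5 m vehicle
# into a fixed 128x64 patch (the normalized inverse-projection view).
P = np.array([[800., 0., 640., 0.],
              [0., 800., 360., 800.],
              [0., 0., 1., 10.]])                 # assumed calibrated camera
corners = box_corners((0.0, 0.0, 0.0), 4.5, 1.8, 1.5)
face = project_points(P, corners[[0, 1, 5, 4]])   # one face quad in the image
patch = np.array([[0, 0], [128, 0], [128, 64], [0, 64]], float)
H = homography(face, patch)   # warping the image with H rectifies the face
```

Warping each visible face with its homography removes the perspective distortion and scale variation mentioned above, so the classifier sees vehicles at a canonical size regardless of camera viewpoint.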

Cite this article:

Wang W, Tang XY, Tian SW, Mei ZT. Fine-grained recognition of vehicles based on inverse projection space in monocular vision. Computer Systems & Applications (计算机系统应用), 2022, 31(2): 22–30.

History
  • Received: 2020-12-04
  • Revised: 2021-01-14
  • Published online: 2022-01-28