Efficient View Transformation for Autonomous Driving
Funding: Collaborative Innovation Platform Project of the Fuzhou-Xiamen-Quanzhou National Independent Innovation Demonstration Zone (2023FX0002)

Authors:
  • LIU Jia-Hui

    College of Computer and Cyber Security, Fujian Normal University, Fuzhou 350117, China; Quanzhou Institute of Equipment Manufacturing, Fujian Institute of Research on the Structure of Matter, Chinese Academy of Sciences, Quanzhou 362216, China; Fujian College, University of Chinese Academy of Sciences, Fuzhou 350002, China
  • GUAN Jing-Chao

    School of Advanced Manufacturing, Fuzhou University, Quanzhou 362251, China; Quanzhou Institute of Equipment Manufacturing, Fujian Institute of Research on the Structure of Matter, Chinese Academy of Sciences, Quanzhou 362216, China; Fujian College, University of Chinese Academy of Sciences, Fuzhou 350002, China
  • FANG Hong-Qing

    College of Computer and Cyber Security, Fujian Normal University, Fuzhou 350117, China; Quanzhou Institute of Equipment Manufacturing, Fujian Institute of Research on the Structure of Matter, Chinese Academy of Sciences, Quanzhou 362216, China; Fujian College, University of Chinese Academy of Sciences, Fuzhou 350002, China
  • CHAO Jian-Shu

    Quanzhou Institute of Equipment Manufacturing, Fujian Institute of Research on the Structure of Matter, Chinese Academy of Sciences, Quanzhou 362216, China; Fujian College, University of Chinese Academy of Sciences, Fuzhou 350002, China

    Abstract:

    In autonomous driving, 3D object detection from the bird's eye view (BEV) has attracted significant attention. Existing camera-to-BEV transformation methods suffer from insufficient real-time performance and high deployment complexity. To address these issues, this study proposes a simple and efficient view transformation method that can be deployed without any special engineering operations. First, since complete image features contain considerable redundancy, a width feature extractor, supplemented by an auxiliary monocular 3D detection task, is introduced to distill the key features of the image while minimizing information loss. Second, a feature-guided polar positional encoding method is proposed to strengthen the mapping between the camera view and the BEV representation and to improve the model's spatial understanding. Finally, a single-layer cross-attention mechanism enables interaction between learnable BEV embeddings and the width image features, generating high-quality BEV features. Experimental results on the nuScenes validation set show that, compared with LSS (lift, splat, shoot), the proposed network improves mAP from 29.5% to 32.0% (a relative gain of 8.5%) and NDS from 37.1% to 38.0% (a relative gain of 2.4%), demonstrating its effectiveness for 3D object detection in autonomous driving scenarios. It also reduces latency by 41.12% compared with LSS.
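The three-stage pipeline outlined in the abstract (width feature extraction, polar positional encoding, single-layer cross-attention) could be sketched roughly as follows. This is a simplified NumPy illustration, not the authors' implementation: the mean-pooling height collapse, the sin/cos frequency scheme for the polar encoding, and the single-head attention with random projection matrices are all assumptions made here for clarity.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def width_feature_extractor(img_feat):
    # img_feat: (C, H, W) backbone features. Collapse the height axis so that
    # only one feature vector per image column remains -> (W, C).
    # (Mean pooling is a stand-in for the paper's learned extractor.)
    return img_feat.mean(axis=1).T

def polar_positional_encoding(grid_size, dim):
    # Map each BEV grid cell to polar coordinates (radius, angle) about the
    # ego vehicle, then encode both with sin/cos at several frequencies.
    ys, xs = np.meshgrid(np.arange(grid_size), np.arange(grid_size), indexing="ij")
    c = (grid_size - 1) / 2.0
    r = np.hypot(xs - c, ys - c) / (grid_size / 2.0)   # normalized radius
    theta = np.arctan2(ys - c, xs - c)                 # azimuth angle
    freqs = 2.0 ** np.arange(dim // 4)                 # dim must be divisible by 4
    enc = np.concatenate([
        np.sin(r[..., None] * freqs), np.cos(r[..., None] * freqs),
        np.sin(theta[..., None] * freqs), np.cos(theta[..., None] * freqs),
    ], axis=-1)
    return enc.reshape(-1, dim)                        # (grid_size**2, dim)

def single_layer_cross_attention(bev_embed, pe, width_feat, Wq, Wk, Wv):
    # Queries: learnable BEV embeddings plus polar positional encoding.
    # Keys/values: width image features. A single attention layer produces
    # the BEV feature map.
    q = (bev_embed + pe) @ Wq                          # (N, d)
    k = width_feat @ Wk                                # (W, d)
    v = width_feat @ Wv                                # (W, d)
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)
    return attn @ v                                    # (N, d)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img_feat = rng.normal(size=(16, 8, 32))            # (C, H, W) from backbone
    wf = width_feature_extractor(img_feat)             # (32, 16)
    pe = polar_positional_encoding(16, 16)             # (256, 16)
    Wq, Wk, Wv = (rng.normal(size=(16, 16)) for _ in range(3))
    bev = single_layer_cross_attention(np.zeros((256, 16)), pe, wf, Wq, Wk, Wv)
    print(bev.shape)                                   # one feature per BEV cell
```

Because every BEV query attends only over W image columns rather than H×W pixels, the attention cost drops by a factor of H, which is consistent with the latency reduction the abstract reports.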

Cite this article:

LIU Jia-Hui, GUAN Jing-Chao, FANG Hong-Qing, CHAO Jian-Shu. Efficient view transformation for autonomous driving. 计算机系统应用 (Computer Systems & Applications), 2025, 34(2): 246-253.
History
  • Received: 2024-07-12
  • Revised: 2024-08-01
  • Published online: 2024-11-15
Copyright: Institute of Software, Chinese Academy of Sciences