Multi-modal Fusion for 3D Object Detection in Dusty Wilderness

doi:10.15888/j.cnki.csa.009762

Volume 34, Issue 2, 2025, pp. 92-101

Author:
YANG Wen-Hao
School of Computer Science and Technology, North University of China, Taiyuan 030051, China; Shanxi Key Laboratory of Machine Vision and Virtual Reality, Taiyuan 030051, China
KUANG Li-Qun
School of Computer Science and Technology, North University of China, Taiyuan 030051, China; Shanxi Key Laboratory of Machine Vision and Virtual Reality, Taiyuan 030051, China
WANG Song
School of Computer Science and Technology, North University of China, Taiyuan 030051, China; Shanxi Key Laboratory of Machine Vision and Virtual Reality, Taiyuan 030051, China
ZHANG Jue
National Key Laboratory of Intelligent Mining Equipment and Technology, Taiyuan 030032, China; Shanxi Taizhong Intelligent Mining Equipment and Technology Co. Ltd., Taiyuan 030032, China; Shanxi Key Laboratory of Machine Vision and Virtual Reality, Taiyuan 030051, China

Abstract:

High-precision 3D object detection remains a significant challenge for multi-sensor autonomous vehicles operating in dusty wilderness. Variable wilderness terrain aggravates the regional feature differences among detected objects, and dust particles blur object features. To address these issues, this study proposes a 3D object detection method based on dynamic multi-modal feature fusion, built around two modules: a multi-level feature self-adaptive fusion module and a feature alignment augmentation module. The former dynamically adjusts the model's attention to global-level and regional-level features, leveraging multi-level receptive fields to reduce the impact of regional variance on recognition performance. The latter strengthens the feature representation of regions of interest before multi-modal feature alignment, suppressing interference such as dust. Experimental results show that, compared with the baseline, the method improves average precision by 2.79% on the self-built wilderness dataset and by 1.7% on the hard-level test of the KITTI dataset, indicating good robustness and precision.
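
The idea of dynamically weighting global-level against regional-level features can be illustrated with a toy gate. The abstract does not specify how the self-adaptive fusion module computes its weights, so the sketch below is only one plausible reading, not the authors' implementation: the function `adaptive_fuse`, the `gate_weights` parameter, and the use of mean activation as the gating signal are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a short list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_fuse(global_feat, regional_feat, gate_weights=(1.0, 1.0)):
    """Blend two same-length feature vectors with input-dependent weights.

    Hypothetical gate: each branch is scored by its mean activation times a
    (normally learned) scalar, and the two scores are normalised by softmax.
    """
    if len(global_feat) != len(regional_feat):
        raise ValueError("feature vectors must have the same length")

    # Assumed gating signal: mean activation of each branch.
    g_score = gate_weights[0] * sum(global_feat) / len(global_feat)
    r_score = gate_weights[1] * sum(regional_feat) / len(regional_feat)
    w_global, w_regional = softmax([g_score, r_score])

    # Convex combination of the two branches, element by element.
    fused = [w_global * g + w_regional * r
             for g, r in zip(global_feat, regional_feat)]
    return fused, (w_global, w_regional)


if __name__ == "__main__":
    fused, weights = adaptive_fuse([1.0, 3.0], [3.0, 1.0])
    print(fused, weights)
```

Because the weights come from the inputs rather than being fixed, the balance between the two branches shifts from sample to sample, which is the property the abstract attributes to the fusion module. A real detector would apply such a gate to feature maps and learn the gating parameters end to end.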

Keywords: 3D object detection; wilderness; dust; multi-modal fusion; point cloud

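Citation

YANG Wen-Hao, KUANG Li-Qun, WANG Song, ZHANG Jue. Multi-modal Fusion for 3D Object Detection in Dusty Wilderness. Computer Systems & Applications (计算机系统应用), 2025, 34(2): 92-101.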

History

Received: July 4, 2024
Revised: August 1, 2024
Online: December 13, 2024
