Abstract: To address the challenges that dynamic environments pose to simultaneous localization and mapping (SLAM), this study proposes a detection-first, tightly coupled LiDAR-visual-inertial SLAM system that integrates a LiDAR, a camera, and an inertial measurement unit (IMU). First, semantically labeled point-cloud clusters are obtained by fusing image and point-cloud information. Then, a tracking algorithm acquires the motion states of the detected targets. Next, the tracked dynamic targets are used to eliminate redundant feature points. Finally, a factor graph jointly optimizes IMU pre-integration factors together with LiDAR and visual odometry factors, achieving tight coupling between the two odometry pipelines. To validate the proposed framework, experiments are conducted on public datasets (KITTI and UrbanNav) and on real-world data. The results show that, in highly dynamic and normal scenarios from the public datasets, the proposed algorithm reduces the root mean square error (RMSE) by 44.56% (4.47 m) and 4.15% (4.62 m), respectively, compared with the LeGO-LOAM, LIO-SAM, and LVI-SAM algorithms. Real-world testing confirms that the algorithm effectively mitigates the direct impact of dynamic objects on map construction.