Multi-objective Path Planning Based on Reinforcement Learning

doi:10.15888/j.cnki.csa.009418


Volume 33, Issue 3, 2024, pp. 158-169


Authors:
ZHOU Yi, School of Information Science and Engineering, Wuhan University of Science and Technology, Wuhan 430081, China
LIU Jun, School of Information Science and Engineering, Wuhan University of Science and Technology, Wuhan 430081, China

                    

Abstract:

Path planning for mobile robots involves a large number of nodes and a wide search space, and it must also account for factors such as safety and real-time requirements. To address the multi-objective path planning problem for mobile robots, this study proposes a multi-objective intelligent optimization algorithm that incorporates reinforcement learning. First, the algorithm adopts NSGA-II as its base framework and uses reinforcement learning to give individuals the ability to learn; a SARSA operator is designed to improve global search efficiency. Second, to accelerate convergence while preserving population diversity, the study introduces an adaptive simulated binary crossover operator (tanh-SBX) as an auxiliary operator and divides the population into two sub-populations with different properties, an elite population and a non-elite population. Finally, the study designs four strategies and uses the Metropolis criterion from simulated annealing to calculate the probability of updating the strategy, so that the most suitable strategy guides the population's search direction and balances exploration and exploitation. Simulation experiments show that the proposed algorithm finds optimal paths in environments of varying complexity. Compared with traditional bio-inspired intelligent algorithms, it balances the optimization objectives effectively and finds safer, better paths in more complex environments.

Key words: multi-objective path planning; nature-inspired heuristic algorithm; reinforcement learning; NSGA-II; mobile robot
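
The abstract names two standard building blocks, simulated binary crossover and the Metropolis criterion, without giving their details. The Python sketch below shows the textbook forms of both, purely as an illustration. It is not the authors' implementation: the function names `metropolis_accept` and `sbx_pair`, the "lower score is better" convention, and the fixed distribution index `eta` are assumptions made here. The paper's tanh-based adaptation of SBX and its four strategies are not described in the abstract and are not reproduced.

```python
import math
import random


def metropolis_accept(current_score, candidate_score, temperature, rng=random):
    """Decide whether a candidate strategy replaces the current one.

    Assumption: scores are costs, so a lower score is better.
    """
    delta = candidate_score - current_score
    if delta <= 0:
        return True  # candidate is at least as good: always accept
    # Worse candidate: accept with probability exp(-delta / temperature)
    return rng.random() < math.exp(-delta / temperature)


def sbx_pair(p1, p2, eta, rng=random):
    """Standard simulated binary crossover on one pair of real-valued genes.

    eta is the distribution index; larger values keep children closer
    to their parents. It is fixed here, not adapted.
    """
    u = rng.random()
    if u <= 0.5:
        beta = (2.0 * u) ** (1.0 / (eta + 1.0))
    else:
        beta = (1.0 / (2.0 * (1.0 - u))) ** (1.0 / (eta + 1.0))
    c1 = 0.5 * ((1.0 + beta) * p1 + (1.0 - beta) * p2)
    c2 = 0.5 * ((1.0 - beta) * p1 + (1.0 + beta) * p2)
    return c1, c2
```

In the Metropolis rule, a high temperature makes worse candidates likely to be accepted (exploration), while a low temperature makes the rule nearly greedy (exploitation). In SBX, the two children always have the same mean as the two parents.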


Citation

ZHOU Yi, LIU Jun. Multi-objective Path Planning Based on Reinforcement Learning. Computer Systems & Applications (计算机系统应用), 2024, 33(3): 158-169.


History

Received: September 7, 2023
Revised: October 9, 2023
Online: December 26, 2023
