Abstract: When the basic Q-learning algorithm is applied to path planning, the randomness of action selection makes early search inefficient and planning time-consuming; in some cases, no complete feasible path can be found at all. To address this, a robot path-planning algorithm that fuses an improved ant colony optimization (ACO) with dynamic Q-learning is proposed. The pheromone-increment mechanisms of the elite ant model and the rank-based ant model are combined, and a new pheromone-increment update rule is designed to improve the robot's exploration efficiency. The pheromone matrix produced by the improved ACO is then used to initialize the Q-table, reducing the robot's ineffective exploration in the initial stage. In addition, a dynamic action-selection strategy is designed to improve the convergence speed and stability of the algorithm. Finally, simulation experiments are carried out on two-dimensional static grid maps with different obstacle densities. The results show that the proposed method effectively reduces both the number of iterations and the time consumed in the optimization process.
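The core fusion idea, initializing the Q-table from an ACO pheromone matrix and decaying exploration over time, can be sketched as follows. This is a minimal illustration under assumed details: the grid size, the pheromone values, the exponential ε-decay schedule, and all function names are hypothetical, since the abstract does not give the paper's exact formulas.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 5x5 grid; 4 actions: up, down, left, right (assumed setup).
GRID = 5
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]

# Stand-in pheromone matrix, as an improved ACO run might leave it:
# higher values along one path (fabricated data for illustration only).
tau = rng.random((GRID, GRID)) * 0.1
for i in range(GRID):
    tau[i, i] += 1.0  # pretend the diagonal is the discovered good path


def init_q_from_pheromone(tau):
    """Initialize Q(s, a) with the pheromone level of the cell each
    action leads to, so early greedy choices follow the ACO trail
    instead of starting from an all-zero table."""
    q = np.zeros((GRID, GRID, len(ACTIONS)))
    for r in range(GRID):
        for c in range(GRID):
            for a, (dr, dc) in enumerate(ACTIONS):
                nr, nc = r + dr, c + dc
                if 0 <= nr < GRID and 0 <= nc < GRID:
                    q[r, c, a] = tau[nr, nc]
    return q


def dynamic_epsilon(episode, eps_start=0.9, eps_end=0.05, decay=0.01):
    """One plausible 'dynamic selection' schedule: exponentially decay
    exploration as training progresses (an assumption; the paper's
    exact strategy is not stated in the abstract)."""
    return eps_end + (eps_start - eps_end) * np.exp(-decay * episode)


def select_action(q, state, episode):
    """ε-greedy choice with the episode-dependent ε above."""
    r, c = state
    if rng.random() < dynamic_epsilon(episode):
        return int(rng.integers(len(ACTIONS)))
    return int(np.argmax(q[r, c]))


q = init_q_from_pheromone(tau)
```

With this initialization, the agent's first greedy actions already point toward cells the ants found promising, which is the mechanism the abstract credits for reducing ineffective early exploration.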