Abstract:Deep reinforcement learning algorithms are more and more widely used in UAV trajectory planning tasks, but many studies do not consider complex scenarios of random changes. To address the above problems, this study proposes an improved PP-CMNTD3 algorithm based on TD3, which puts forward a simple and effective prior strategy and draws on the idea of artificial potential fields to design dense rewards. UAVs are better guided to effectively avoid obstacles and swiftly approach target points. Simulation results show that the algorithm improvement can effectively improve the training efficiency of the network and the trajectory planning performance in complex scenarios. At the same time, the strategy can be flexibly adjusted under different initial power levels, achieving an effective balance between energy consumption and rapid arrival at the destination.