Abstract: Unmanned aerial vehicles (UAVs) can act as aerial edge servers that provide services to ground mobile terminals in disaster areas where earthquakes, typhoons, floods, or mudslides have caused severe damage. However, the limited computation and storage capacity of a single UAV makes it difficult to complete complex, computation-intensive tasks in real time. In this study, a multi-UAV-assisted mobile edge computing system is first investigated and a mathematical model of it is built. The problem is then formulated as a partially observable Markov decision process, and an improved multi-agent deep deterministic policy gradient (MADDPG) algorithm based on a composite priority experience replay sampling method (CoP-MADDPG) is proposed to jointly optimize the time delay, energy consumption, and flight trajectories of the UAVs. Finally, simulation results show that the proposed algorithm outperforms the benchmark algorithms in both the convergence speed and the converged value of the total reward, and that it can serve about 90% of the ground mobile terminals, demonstrating its effectiveness and practicality.