3D Trajectory Planning for Unmanned Aerial Vehicle Formation Based on SPER-TD3

    Abstract:

    In complex terrain, trajectory planning for UAV formations based on deep reinforcement learning can optimize the formation's trajectory and outperforms traditional heuristic algorithms in both path length and environmental adaptability. However, it still suffers from insufficient training stability and poor real-time performance. For UAV swarms operating in leader-follower mode, this study proposes a real-time 3D trajectory planning method for UAV formations based on the SPER-TD3 algorithm. First, a SumTree-based prioritized experience replay mechanism is integrated into the TD3 algorithm to form SPER-TD3, which determines the trajectory of the UAV formation. Then, an angle-based formation control method is used to optimize the followers' flight trajectories, and a dynamic trajectory smoothing algorithm is applied to optimize the steering angle. To improve the convergence speed and stability of SPER-TD3 training and to address the long-term dependency problem, a network structure combining LSTM, a self-attention mechanism, and a multilayer perceptron is designed. Simulation experiments are conducted in environments with various obstacles. The results show that, taken together, the proposed method outperforms eight mainstream deep reinforcement learning algorithms in trajectory safety coverage rate, flight path smoothness, success rate, and reward, that its comprehensive importance evaluation value is 8.5% to 72.9% higher than that of existing methods, and that its training stability is the best.
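    The abstract names a SumTree-based prioritized experience replay buffer but gives no implementation details. The following is a minimal sketch of that general technique, not the authors' code. The class names `SumTree` and `PrioritizedReplay`, the hyperparameters `alpha`, `beta` and `eps`, and the choice to normalize importance weights by the batch maximum are all assumptions made for illustration.

```python
import random


class SumTree:
    """Binary tree: leaves hold priorities, internal nodes hold sums of children."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.tree = [0.0] * (2 * capacity - 1)  # internal nodes first, then leaves
        self.data = [None] * capacity           # stored transitions, one per leaf
        self.write = 0                          # next leaf to overwrite (ring buffer)
        self.size = 0

    def total(self):
        return self.tree[0]

    def update(self, leaf, priority):
        """Set a leaf's priority and propagate the change up to the root."""
        idx = leaf + self.capacity - 1
        change = priority - self.tree[idx]
        self.tree[idx] = priority
        while idx > 0:
            idx = (idx - 1) // 2
            self.tree[idx] += change

    def add(self, priority, transition):
        self.data[self.write] = transition
        self.update(self.write, priority)
        self.write = (self.write + 1) % self.capacity
        self.size = min(self.size + 1, self.capacity)

    def get(self, value):
        """Return (leaf, priority, transition) whose cumulative range contains value."""
        idx = 0
        while idx < self.capacity - 1:  # descend until a leaf is reached
            left = 2 * idx + 1
            if value < self.tree[left]:
                idx = left
            else:
                value -= self.tree[left]
                idx = left + 1
        leaf = idx - (self.capacity - 1)
        return leaf, self.tree[idx], self.data[leaf]


class PrioritizedReplay:
    """Proportional prioritized replay on top of SumTree (hypothetical parameters)."""

    def __init__(self, capacity, alpha=0.6, beta=0.4, eps=1e-5):
        self.tree = SumTree(capacity)
        self.alpha, self.beta, self.eps = alpha, beta, eps
        self.max_priority = 1.0

    def store(self, transition):
        # New transitions get the current maximum priority so they are replayed soon.
        self.tree.add(self.max_priority, transition)

    def sample(self, batch_size):
        segment = self.tree.total() / batch_size
        batch = []
        for i in range(batch_size):
            while True:  # retry in the rare case rounding lands on an empty leaf
                value = random.uniform(segment * i, segment * (i + 1))
                leaf, priority, transition = self.tree.get(value)
                if transition is not None and priority > 0:
                    break
            prob = priority / self.tree.total()
            weight = (self.tree.size * prob) ** (-self.beta)
            batch.append((leaf, transition, weight))
        max_w = max(w for _, _, w in batch)
        return [(leaf, t, w / max_w) for leaf, t, w in batch]

    def update(self, leaf, td_error):
        # Priority grows with the magnitude of the TD error from the critic update.
        priority = (abs(td_error) + self.eps) ** self.alpha
        self.max_priority = max(self.max_priority, priority)
        self.tree.update(leaf, priority)
```

    In a TD3 training loop, `sample` would supply the minibatch and importance weights for the critic loss, and `update` would be called with each transition's TD error afterwards. Which critic's TD error SPER-TD3 uses for the priority is not stated in the abstract.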

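Cite this article

彭博 (Peng Bo), 王晓波 (Wang Xiaobo), 魏祥麟 (Wei Xianglin), 成洁 (Cheng Jie), 秦华旺 (Qin Huawang), 范建华 (Fan Jianhua). 3D Trajectory Planning for Unmanned Aerial Vehicle Formation Based on SPER-TD3. 计算机系统应用 (Computer Systems & Applications),,():1-13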

History
  • Received: 2024-07-10
  • Last revised: 2024-08-01
  • Published online: 2024-12-19