Computer Systems Applications, 2022, 31(2): 88-95

Indoor Assisted Rescue by UAV Group Based on Multi-agent Reinforcement Learning

GUO Tian-Hao, ZHANG Gang, YUE Wen-Yuan, WANG Qian, GUO Da-Bo

(College of Physics and Electronic Engineering, Shanxi University, Taiyuan 030006, China)

Received: April 14, 2021; Revised: May 11, 2021

Abstract: This work studies cooperative search for victims by multiple unmanned aerial vehicles (UAVs) in indoor scenes, where victim location information obtained from the global positioning system may be unreliable. To this end, it proposes a solution based on multi-agent reinforcement learning (MARL) that focuses on path planning for a UAV team assisting a rescue. Compared with traditional solutions, the proposed solution is better suited to large-scale indoor rescue scenes, for example those that involve deploying multiple rescue UAVs and rescuing multiple victims. The solution also accounts for UAV charging so that the UAVs always have sufficient power. Specifically, because the depth parameters of the rescue scene in the model change continuously, the search path planning problem is modeled as a decentralized partially observable Markov decision process (Dec-POMDP). To optimize the UAV control policy, a double deep Q-network (Double DQN) is trained. Finally, the Monte Carlo method is used to verify that the solution enables multiple UAVs to cooperate effectively in a large-scale indoor environment and to maximize the collection of location information stored in the victims' mobile phones.

Keywords: unmanned aerial vehicle (UAV); indoor rescue; path planning; Markov decision; Monte Carlo
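
The abstract names Double DQN as the learning architecture but gives no details. As a rough illustration of the general technique only, and not the authors' implementation, the sketch below computes the standard Double DQN regression target, in which the online network selects the greedy next action and the target network evaluates it. The function name `double_dqn_targets`, the array shapes, and the discount factor `gamma=0.95` are assumptions made for this example.

```python
import numpy as np


def double_dqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.95):
    """Standard Double DQN targets for a batch of transitions (illustrative).

    rewards:       shape (batch,), immediate rewards
    dones:         shape (batch,), 1.0 where the episode ended, else 0.0
    q_online_next: shape (batch, n_actions), online-network Q-values at s'
    q_target_next: shape (batch, n_actions), target-network Q-values at s'
    gamma:         discount factor (hypothetical value, not from the paper)
    """
    rewards = np.asarray(rewards, dtype=float)
    dones = np.asarray(dones, dtype=float)
    q_online_next = np.asarray(q_online_next, dtype=float)
    q_target_next = np.asarray(q_target_next, dtype=float)

    # Action selection: greedy action according to the online network.
    best_actions = np.argmax(q_online_next, axis=1)
    # Action evaluation: value of that action according to the target network.
    next_values = q_target_next[np.arange(best_actions.shape[0]), best_actions]
    # Terminal transitions do not bootstrap from the next state.
    return rewards + gamma * (1.0 - dones) * next_values


if __name__ == "__main__":
    targets = double_dqn_targets(
        rewards=[1.0, 0.0],
        dones=[0.0, 1.0],
        q_online_next=[[1.0, 2.0], [3.0, 0.0]],
        q_target_next=[[10.0, 20.0], [30.0, 40.0]],
        gamma=0.5,
    )
    print(targets)
```

Separating action selection from action evaluation is what distinguishes Double DQN from plain DQN, which takes the maximum over the target network's own values and tends to overestimate them.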

Funding: Basic Research Project of Shanxi Province (201801D121118)

Cite as:
GUO Tian-Hao, ZHANG Gang, YUE Wen-Yuan, WANG Qian, GUO Da-Bo. Indoor Assisted Rescue by UAV Group Based on Multi-agent Reinforcement Learning. COMPUTER SYSTEMS APPLICATIONS, 2022, 31(2): 88-95

Author Name	Affiliation	E-mail
GUO Tian-Hao	College of Physics and Electronic Engineering, Shanxi University, Taiyuan 030006, China	tianhao_guo@sxu.edu.cn
ZHANG Gang	College of Physics and Electronic Engineering, Shanxi University, Taiyuan 030006, China
YUE Wen-Yuan	College of Physics and Electronic Engineering, Shanxi University, Taiyuan 030006, China
WANG Qian	College of Physics and Electronic Engineering, Shanxi University, Taiyuan 030006, China
GUO Da-Bo	College of Physics and Electronic Engineering, Shanxi University, Taiyuan 030006, China