BiTransformer Memory for Multi-agent Reinforcement Learning

doi:10.15888/j.cnki.csa.009705


Volume 33, Issue 12, 2024, pp. 115-122. DOI:10.15888/j.cnki.csa.009705

DOI: 10.15888/j.cnki.csa.009705
CSTR: 32024.14.csa.009705
Authors: MA Yu-Bo, ZHOU Chang-Dong, ZHANG Zhi-Wen, YANG Pei-Ze, ZHANG Bo
Affiliation: College of Artificial Intelligence, Dalian Maritime University, Dalian 116026, China

Abstract:

Multi-agent collaboration plays a crucial role in reinforcement learning, focusing on how agents cooperate to achieve common goals. Most collaborative multi-agent algorithms emphasize building collaboration but overlook strengthening individual decision-making. To address this issue, this study proposes an online reinforcement learning model, BiTransformer memory (BTM), which considers collaboration among multiple agents and also uses a memory module to assist individual decision-making. The BTM model consists of a BiTransformer encoder and a BiTransformer decoder, which improve individual decision-making and collaboration within the multi-agent system, respectively. Inspired by human reliance on historical decision-making experience, the BiTransformer encoder introduces a memory attention module that aids current decisions with a library of explicit historical decision-making experience rather than hidden units, unlike conventional RNN-based methods. In addition, an attention fusion module processes partial observations with the assistance of historical decision-making experience to extract the most valuable information for decision-making from the environment, thereby enhancing the decision-making capability of individual agents. The BiTransformer decoder contains two modules: a decision attention module and a collaborative attention module. They foster potential cooperation among agents by considering the collaborative benefits between the current agent and the other decision-making agents, as well as partial observations combined with historical decision-making experience. BTM is tested in multiple StarCraft scenarios, achieving an average win rate of 93%.
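
The idea of attending over an explicit library of past decisions, rather than carrying a recurrent hidden state, can be illustrated with a minimal sketch. This is not the BTM implementation: the `ExperienceMemory` class, the fixed capacity, the use of stored vectors as both keys and values, and the toy two-dimensional embeddings are all assumptions made for illustration, and the abstract does not specify how the memory attention module is parameterized.

```python
import math
from collections import deque


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def memory_attention(query, memory):
    """Scaled dot-product attention of one query vector over stored vectors.

    Assumption: the stored vectors serve as both keys and values. A trained
    model would normally apply learned projections first.
    Returns (attention weights, weighted summary of the memory).
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * m for q, m in zip(query, entry)) / scale
              for entry in memory]
    weights = softmax(scores)
    summary = [sum(w * entry[i] for w, entry in zip(weights, memory))
               for i in range(len(query))]
    return weights, summary


class ExperienceMemory:
    """Hypothetical fixed-size library of past decision embeddings."""

    def __init__(self, capacity):
        # deque with maxlen evicts the oldest entry once capacity is reached
        self.entries = deque(maxlen=capacity)

    def add(self, embedding):
        self.entries.append(list(embedding))

    def read(self, observation_embedding):
        """Attend from the current observation over all stored experiences."""
        return memory_attention(observation_embedding, list(self.entries))


# Toy usage: three stored experiences, one current observation embedding.
memory = ExperienceMemory(capacity=3)
for past in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0]):
    memory.add(past)

weights, summary = memory.read([2.0, 0.0])
```

In this sketch the weights form a probability distribution over the stored experiences, so entries more similar to the current observation contribute more to the summary. Each past experience stays individually addressable, which is the contrast with an RNN hidden state that the abstract draws.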

Key words: multi-agent collaboration; online reinforcement learning; partial observation; historical decision-making experience; collaborative benefit; individual policy enhancement

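Citation:

MA Yu-Bo, ZHOU Chang-Dong, ZHANG Zhi-Wen, YANG Pei-Ze, ZHANG Bo. BiTransformer Memory for Multi-agent Reinforcement Learning. Computer Systems & Applications, 2024, 33(12): 115-122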

History

Received: May 22, 2024
Revised: June 17, 2024
Online: October 31, 2024
