Received: February 27, 2019    Revised: March 22, 2019
Chinese abstract: How to model and reason over multi-turn dialogue history is one of the main challenges in building an intelligent chatbot. Memory networks based on recurrent or gated structures have proved effective for dialogue modeling. However, this approach has two drawbacks: first, it uses complex recurrent structures, which leads to low computational efficiency; second, it relies on costly strong supervision or prior information, which hinders extension and transfer to new domains. To address these problems, this paper proposes an end-to-end memory network with multi-head attention. First, the network represents text input by combining word embeddings with position encoding; second, it uses parallel multi-head attention to capture the key information of the dialogue interaction in different subspaces, thereby better modeling the dialogue history; finally, it stacks the multi-head attention layers via shortcut connections to manage the information flow and perform multiple rounds of reasoning over the modeling result. Experiments on the bAbI-dialog dataset show that the network can effectively model and reason over multi-turn dialogue and has good time performance.
Abstract: Modeling and reasoning over the multi-turn dialogue history is a main challenge in building an intelligent chatbot. Memory networks with recurrent or gated architectures have been shown to be promising for conversation modeling. However, they suffer from two drawbacks: one is relatively low computational efficiency due to their complex architectures; the other is reliance on costly strong supervision or fixed prior knowledge, which hinders extension and application to new domains. This paper proposes an end-to-end memory network with multi-head attention. First, the model represents text input with word embeddings combined with position encoding; second, it uses multi-head attention to capture important information in different subspaces of the conversational interaction; finally, multiple attention layers are stacked via shortcut connections to achieve repeated reasoning over the modeling result. Experiments on the bAbI-dialog datasets show that the network can effectively model and reason over multi-turn dialogue and offers better time performance.
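To make the architecture described in the abstract concrete, the following is a minimal sketch in Python (NumPy only) of one reasoning layer: sentences are summed into memory slots using the position-encoding weights of end-to-end memory networks, the query attends over the memory with multi-head scaled dot-product attention, and a shortcut connection adds the attention output back to the query. This is not the authors' implementation; the random projection matrices, the dimensions, and the three-layer stack are illustrative assumptions.

import numpy as np

def position_encoding(sentence_len, embed_dim):
    # Fixed position-encoding weights from end-to-end memory networks:
    # l[j, k] = (1 - j/J) - (k/d) * (1 - 2j/J) for word slot j and dimension k.
    J, d = sentence_len, embed_dim
    j = np.arange(1, J + 1)[:, None]
    k = np.arange(1, d + 1)[None, :]
    return (1 - j / J) - (k / d) * (1 - 2 * j / J)        # shape (J, d)

def encode_sentences(word_embeddings):
    # Sum position-weighted word vectors into one vector per memory slot.
    # word_embeddings: (num_sentences, sentence_len, embed_dim).
    _, J, d = word_embeddings.shape
    return (word_embeddings * position_encoding(J, d)).sum(axis=1)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(query, memory, num_heads, rng):
    # Scaled dot-product attention computed in several subspaces; the random
    # projections below stand in for learned parameters (hypothetical).
    d = memory.shape[-1]
    d_h = d // num_heads
    heads = []
    for _ in range(num_heads):
        Wq, Wk, Wv = (rng.standard_normal((d, d_h)) / np.sqrt(d) for _ in range(3))
        q, K, V = query @ Wq, memory @ Wk, memory @ Wv
        attn = softmax(q @ K.T / np.sqrt(d_h))             # weights over memory slots
        heads.append(attn @ V)
    return np.concatenate(heads, axis=-1)                  # back to dimension d

def memory_layer(query, memory, num_heads, rng):
    # One reasoning hop: attend over the memory, then add a shortcut connection.
    return query + multi_head_attention(query, memory, num_heads, rng)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    history = rng.standard_normal((5, 8, 16))              # 5 sentences, 8 words, dim 16
    memory = encode_sentences(history)                     # (5, 16) memory slots
    query = encode_sentences(rng.standard_normal((1, 8, 16)))  # current utterance
    for _ in range(3):                                     # stacked layers = repeated reasoning
        query = memory_layer(query, memory, num_heads=4, rng=rng)
    print(query.shape)                                     # (1, 16) reasoning state

Stacking the layer inside the loop corresponds to the repeated reasoning hops over the modeled dialogue history that the abstract describes.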
Foundation items: Strategic Priority Research Program of the Chinese Academy of Sciences, Class A (XDA20080200); National Key Research and Development Program of China (2018YFB1005002)
Citation:
REN Jian-Long,YANG Li,KONG Wei-Yi,ZUO Chun.Memory Network with Multi-Head Attention for Chatbot.COMPUTER SYSTEMS APPLICATIONS,2019,28(9):18-24