Received: December 16, 2021; Revised: January 29, 2022
Abstract: To effectively integrate complex features in text and extract different kinds of contextual information, this study proposes an inductive text classification method based on a gated graph attention network (TextIGAT). The method first builds a separate graph for each document in the corpus and takes every word in the document as a node, which preserves the complete text sequence. A one-way connected document node is added to each text graph so that word nodes can interact with global information, and word nodes are connected by merging different contextual relations, which introduces more textual information into a single text graph. The method then updates the word node representations with a graph attention network (GAT) and a gated recurrent unit (GRU), and applies a bidirectional gated recurrent unit (Bi-GRU) over the text sequence retained in the graph to enhance the sequential representation of the nodes. Because TextIGAT flexibly integrates information from the text itself, it can learn inductively on texts that contain new words and relations. Extensive experiments and detailed analysis on four benchmark datasets (MR, Ohsumed, R8, and R52) demonstrate the effectiveness of the proposed method for text classification.
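The per-document graph construction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes one node per token position (so the text order is recoverable from the node indices), a single sliding-window relation between word nodes as a stand-in for the merged contextual relations (the abstract does not list them), and a document node whose edges point only toward the word nodes. The function name `build_text_graph` and the `window` parameter are hypothetical.

```python
def build_text_graph(tokens, window=2):
    """Build one graph for one document (illustrative sketch only).

    Nodes 0..n-1 are token positions in reading order; node n is the
    document node. Edges are (source, target) pairs.
    """
    n = len(tokens)
    doc_node = n
    edges = set()
    for i in range(n):
        # Assumed contextual relation: symmetric sliding-window co-occurrence.
        for j in range(i + 1, min(i + window, n - 1) + 1):
            edges.add((i, j))
            edges.add((j, i))
        # Assumed one-way link: document node -> word node only.
        edges.add((doc_node, i))
    return n + 1, sorted(edges)


num_nodes, edges = build_text_graph(["graph", "attention", "network"], window=1)
```

Because each document gets its own graph built only from its own tokens, a document with unseen words can be processed the same way at test time, which is the property the abstract refers to as inductive learning.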
Keywords: text classification; graph neural network (GNN); contextual information; inductive learning; natural language processing (NLP)
Funding: National Natural Science Foundation of China (61772210)
Citation:
WANG Chen-Xi, ZHANG Ying-Qi. Inductive Text Classification Based on Gated Graph Attention Network. COMPUTER SYSTEMS APPLICATIONS, 2022, 31(9): 201-209