Unsupervised Document Representation Learning Based on Hierarchical Attention Model
Abstract:

Many natural language processing applications need to represent input text as a fixed-length vector. Existing techniques such as word embeddings and document embeddings provide generic representations for natural language tasks, but they consider neither the importance of each word within a sentence nor the significance of each sentence within a document. This study proposes a document representation model based on a hierarchical attention mechanism (HADR) that takes into account the important sentences in a document and the important words in each sentence. Experimental results show that representations that weight the importance of words and sentences perform better: on document sentiment classification (IMDB), the model achieves higher accuracy than the Doc2Vec and Word2Vec models.
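The hierarchical idea described above can be sketched as two stacked attention-pooling steps: word-level attention builds each sentence vector, and sentence-level attention combines sentence vectors into one fixed-length document vector. The sketch below is illustrative only, assuming random vectors and fixed context vectors; in the paper's model the context vectors and encoders would be learned.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score array.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_pool(vectors, context):
    """Score each vector against a context vector, normalize the
    scores with softmax, and return the weighted sum (attention pooling)."""
    scores = softmax(np.asarray([v @ context for v in vectors]))
    pooled = sum(w * v for w, v in zip(scores, vectors))
    return pooled, scores

def document_vector(doc, word_context, sent_context):
    """doc: list of sentences, each a list of word vectors.
    Word-level attention yields one vector per sentence; sentence-level
    attention yields a single fixed-length document vector."""
    sent_vecs = [attention_pool(sent, word_context)[0] for sent in doc]
    doc_vec, _ = attention_pool(sent_vecs, sent_context)
    return doc_vec

# Toy example: 3 sentences of 5 words each, 8-dimensional embeddings.
rng = np.random.default_rng(0)
dim = 8
doc = [[rng.normal(size=dim) for _ in range(5)] for _ in range(3)]
vec = document_vector(doc, rng.normal(size=dim), rng.normal(size=dim))
print(vec.shape)  # fixed-length output regardless of document size
```

Regardless of how many sentences or words the document contains, the output dimensionality is fixed, which is the property downstream classifiers rely on.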

Get Citation

Ouyang Wenjun, Xu Linli. Unsupervised document representation learning based on hierarchical attention model. Computer Systems & Applications, 2018, 27(9): 40-46.

History
  • Received: January 17, 2018
  • Revised: February 09, 2018
  • Online: July 26, 2018
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-3
Address: 4# South Fourth Street, Zhongguancun, Haidian, Beijing, Postal Code: 100190
Phone: 010-62661041 Email: csa (a) iscas.ac.cn
Technical Support:Beijing Qinyun Technology Development Co., Ltd.

Beijing Public Network Security No. 11040202500063