Short Text Classification Model of GM-FastText Multi-channel Word Vector

    Abstract:

    To address two difficulties in short text classification, namely sparse text features that are hard to extract and the out-of-vocabulary (OOV) problem caused by non-standard words, this study proposes GM-FastText, a short text classification model that combines FastText multi-channel word embeddings with GM, a hybrid network architecture built from a gated recurrent unit (GRU) and a multi-layer perceptron (MLP). The model uses FastText to generate different word embeddings in N-gram mode and feeds them into the GRU layer and the MLP layer to obtain short text features. The GRU extracts text features, the MLP layer performs hybrid extraction over the features of the different channels, and the result is finally mapped to the classification categories. The experimental results show that, compared with TextCNN and TextRNN, GM-FastText improves the F1 score by 0.021 and 0.023 and accuracy by 1.96 and 2.08 percentage points, respectively. Compared with FastText, FastText-CNN, and FastText-RNN, it improves the F1 score by 0.006, 0.014, and 0.016 and accuracy by 0.42, 1.06, and 1.41 percentage points, respectively. In short, the FastText multi-channel word vectors provide a better word representation for short text classification, and the GM network structure performs better at multi-parameter feature extraction.
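
The multi-channel input described in the abstract can be illustrated with a minimal sketch. It assumes that each channel corresponds to one n-gram order (unigram, bigram, trigram) and that n-grams are mapped to embedding rows by hashing into a fixed number of buckets, as FastText-style models commonly do. The function name `ngram_channels`, the bucket count, and the use of CRC32 as the hash are illustrative assumptions, not details taken from the paper.

```python
import zlib
from typing import Dict, List


def ngram_channels(tokens: List[str], max_n: int = 3,
                   buckets: int = 1000) -> Dict[int, List[int]]:
    """Return one list of hashed n-gram ids per channel (n = 1..max_n).

    Hypothetical helper: the bucket count and hash function are
    illustrative, not the settings used in the paper.
    """
    channels: Dict[int, List[int]] = {}
    for n in range(1, max_n + 1):
        # Slide a window of n tokens over the text.
        grams = [" ".join(tokens[i:i + n])
                 for i in range(len(tokens) - n + 1)]
        # CRC32 is deterministic across runs, unlike Python's built-in
        # hash() for strings. An unseen n-gram still lands in some
        # bucket, which is one way to keep OOV inputs representable.
        channels[n] = [zlib.crc32(g.encode("utf-8")) % buckets
                       for g in grams]
    return channels


tokens = "cheap flights to new york".split()
channels = ngram_channels(tokens)
for n, ids in channels.items():
    print(n, len(ids))
```

Each id list would index a separate embedding table, and the resulting channel embeddings would then be passed to the GRU and MLP layers. For the five-token example, the three channels contain 5, 4, and 3 ids.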

Get Citation

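Bai Zicheng (白子诚), Zhou Yanling (周艳玲), Zhang Yan (张龑). Short Text Classification Model of GM-FastText Multi-channel Word Vector. Computer Systems & Applications (计算机系统应用), 2022, 31(9): 403-408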

History
  • Received: November 23, 2021
  • Revised: December 20, 2021
  • Online: May 30, 2022