Short Text Classification Based on Multi-Factors Affecting Features Selection
    Abstract:

    Feature selection (FS) reduces dimensionality and removes noise. However, many factors affect feature selection, chiefly the dimensionality, importance, and semantics of terms. Because short-text feature representations are high-dimensional but sparse, and traditional feature extraction lacks semantics, a feature selection function FS that fuses multiple factors is constructed. Compared with the traditional feature selection function TF-IDF, FS not only integrates the semantics of features but also removes a large number of redundant features, thereby increasing the weight of features with class-discriminating ability. Short text classification experiments on the Chinese corpus of Sogou Lab verify the effectiveness of the method.
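
    The abstract does not give the formula of the proposed FS function, so it is not reproduced here. As a point of comparison only, the following is a minimal sketch of the TF-IDF baseline the abstract mentions, followed by a simple top-k selection. The function names `tf_idf` and `select_top_k`, the natural-log IDF variant, and the toy English documents are assumptions made for illustration; they are not taken from the paper or from the Sogou Lab corpus.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def select_top_k(doc_weights, k):
    """Keep the k terms whose best per-document weight is highest."""
    best = {}
    for w in doc_weights:
        for term, value in w.items():
            best[term] = max(best.get(term, 0.0), value)
    # Sort by descending weight; break ties alphabetically for determinism.
    ranked = sorted(best, key=lambda t: (-best[t], t))
    return ranked[:k]


# Toy short texts (hypothetical, already tokenized).
docs = [
    ["stock", "market", "rises"],
    ["stock", "prices", "fall"],
    ["team", "wins", "match"],
]
weights = tf_idf(docs)
selected = select_top_k(weights, 4)
```

    In this sketch, a term such as "stock" that appears in several documents receives a lower weight than terms confined to one document, which is the behavior TF-IDF relies on. The sketch uses term statistics only and carries no semantic information, which is the limitation the abstract attributes to traditional feature extraction.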

Get Citation

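Li Wenhui, Zhang Yingjun, Pan Lihu. Short Text Classification Based on Multi-Factors Affecting Features Selection. Computer Systems & Applications, 2018, 27(12): 216-221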

History
  • Received: May 04, 2018
  • Revised: May 24, 2018
  • Online: December 05, 2018