Research and Application of Text Classification Based on Improved Random Forest Algorithm
    Abstract:

    The traditional random forest classifier relies on majority voting, which cannot distinguish strong classifiers from weak ones, and its hyperparameters must be tuned. This work studies how the random forest algorithm is applied to text classification, examines its advantages and disadvantages, and proposes two improvements. First, the voting method is optimized: weighted voting combines each decision tree's classification performance with its prediction probability. Second, an algorithm combining random search and grid search is proposed to optimize the random forest's hyperparameters. Experimental results in a Python environment show that the proposed method performs well in text classification.
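The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the weighting formula or the search procedure, so the sketch assumes each tree's weight is a validation accuracy and that the grid stage scans integer values around the best random sample. All function and parameter names (`weighted_vote`, `random_then_grid`, `radius`, `toy_score`) are hypothetical.

```python
import random
from itertools import product


def majority_vote(tree_probs):
    """Plain majority voting: every tree casts one equal vote for its top class."""
    votes = [max(range(len(p)), key=p.__getitem__) for p in tree_probs]
    return max(set(votes), key=votes.count)


def weighted_vote(tree_probs, tree_weights):
    """Weighted voting: sum each tree's class probabilities scaled by its weight.

    tree_probs   -- one list of class probabilities per tree
    tree_weights -- one weight per tree (assumed here: validation accuracy)
    """
    n_classes = len(tree_probs[0])
    scores = [0.0] * n_classes
    for probs, weight in zip(tree_probs, tree_weights):
        for cls, prob in enumerate(probs):
            scores[cls] += weight * prob
    return max(range(n_classes), key=scores.__getitem__)


def random_then_grid(score, bounds, n_random=20, radius=2, seed=0):
    """Coarse random search, then an exhaustive grid around the best sample.

    score  -- callable taking the hyperparameters as keyword arguments
    bounds -- {name: (low, high)} inclusive integer ranges
    """
    rng = random.Random(seed)
    names = list(bounds)
    # Stage 1: sample n_random random points and keep the best one.
    samples = ({n: rng.randint(*bounds[n]) for n in names} for _ in range(n_random))
    best = max(samples, key=lambda p: score(**p))
    # Stage 2: try every integer combination within `radius` of that point,
    # clipped to the original bounds.
    axes = [
        range(max(bounds[n][0], best[n] - radius),
              min(bounds[n][1], best[n] + radius) + 1)
        for n in names
    ]
    grid = (dict(zip(names, combo)) for combo in product(*axes))
    return max(grid, key=lambda p: score(**p))


# One confident, accurate tree against two weak, uncertain trees.
probs = [[0.1, 0.9], [0.55, 0.45], [0.6, 0.4]]
weights = [0.9, 0.6, 0.55]
print(majority_vote(probs))           # 0: the two weak trees outvote the strong one
print(weighted_vote(probs, weights))  # 1: the strong tree's confidence dominates


def toy_score(n_estimators, max_depth):
    """Stand-in for cross-validated accuracy; peaks at (50, 8)."""
    return -((n_estimators - 50) ** 2) - (max_depth - 8) ** 2


bounds = {"n_estimators": (10, 100), "max_depth": (2, 20)}
print(random_then_grid(toy_score, bounds))
```

In practice `toy_score` would be replaced by a function that trains a forest with the given hyperparameters and returns a validation metric. The random stage keeps the number of expensive evaluations small, and the grid stage refines only the neighbourhood of the most promising sample.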

Get Citation

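Liu Yong, Xing Yanyun. Research and Application of Text Classification Based on Improved Random Forest Algorithm. Computer Systems & Applications, 2019, 28(5): 220-225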

History
  • Received: November 23, 2018
  • Revised: December 12, 2018
  • Online: May 05, 2019
  • Published: May 15, 2019
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-3