Computer Systems Applications (English edition): 2016, 25(11): 136-140
Application of Text Categorization Based on Improved CHI-Square Statistic Method
(School of Economics and Management, Fuzhou University, Fuzhou 350108, China)
Received: February 18, 2016    Revised: March 22, 2016
Abstract: As text classification technology has matured, a growing number of enterprises have applied it to customer complaint classification systems, with some success. The traditional chi-square (CHI) statistic tends to select low-frequency noise words that are negatively correlated with a class. To address this, an improved CHI statistic is applied to text feature selection: it lowers the weight of negatively correlated low-frequency words in the feature selection algorithm, which reduces their influence on the model. Experiments on business complaint texts from a provincial telecommunications company show that the model and method are effective and classify complaint work orders more accurately, providing data support for subsequent problem analysis.
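The abstract does not give the formula for the improved statistic, so the following is only a minimal Python sketch of the general idea. `chi_square` is the standard CHI statistic on a term/class 2x2 contingency table. `damped_chi_square` and its `penalty` parameter are hypothetical illustrations of down-weighting a term whose correlation with the class is negative (AD - BC < 0); they are not the authors' method.

```python
def chi_square(a, b, c, d):
    """Standard CHI statistic for one term and one class.

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class that do not contain the term
    d: documents outside the class that do not contain the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # Degenerate table (an empty row or column): no evidence either way.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def damped_chi_square(a, b, c, d, penalty=0.1):
    """Hypothetical variant: shrink the score of negatively correlated terms.

    The sign of (a*d - b*c) tells whether the term occurs more often inside
    the class (positive) or outside it (negative). The plain statistic squares
    this difference and so loses the sign. Here an assumed `penalty` factor
    is applied when the sign is negative.
    """
    score = chi_square(a, b, c, d)
    if a * d - b * c < 0:
        score *= penalty
    return score


# A term concentrated in the class keeps its full score.
positive = damped_chi_square(40, 10, 10, 40)
# A rare-in-class term scores high under plain CHI but is damped here.
negative_plain = chi_square(1, 30, 49, 20)
negative_damped = damped_chi_square(1, 30, 49, 20)
```

In this sketch the plain statistic ranks the negatively correlated term about as high as the positively correlated one, which is the behavior the abstract identifies as a problem; the penalty factor is one assumed way to push such terms down the ranking.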
Cite this article:
HUANG Zhang-Shu, YE Zhi-Long. Application of Text Categorization Based on Improved CHI-Square Statistic Method. COMPUTER SYSTEMS APPLICATIONS, 2016, 25(11): 136-140