Received: July 16, 2017 Revised: July 28, 2017
Abstract: To improve Chinese text classification, this study proposes a rule matching method based on rough set theory. During text feature extraction, the CHI statistical method is modified, and the feature weights are scaled and discretized. A discernibility matrix is used to carry out rough-set attribute reduction and rule extraction, and a rule pre-testing step optimizes the decision parameters of rule matching. Experimental results show that the improved rule matching method achieves higher classification accuracy and still performs reasonably well when little training data is available.
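As a rough illustration of the pipeline the abstract outlines, the sketch below computes the standard (unmodified) CHI statistic for a term/category pair, scales and discretizes feature weights, and builds a discernibility matrix from the resulting decision table. The paper's specific CHI improvement, its scaling scheme, and its rule pre-testing step are not given here, so the function names, the equal-width binning, and the number of levels are all assumptions chosen for illustration.

```python
from typing import Dict, List, Sequence, Set, Tuple


def chi_square(a: int, b: int, c: int, d: int) -> float:
    """Standard CHI statistic for one (term, category) pair.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that lack the term
    d: documents outside the category that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # degenerate contingency table, no evidence either way
    return n * (a * d - b * c) ** 2 / denom


def discretize(weights: Sequence[float], levels: int = 3) -> List[int]:
    """Min-max scale weights to [0, 1], then map them to equal-width bins.

    Equal-width binning is an assumption; the paper may use another scheme.
    """
    lo, hi = min(weights), max(weights)
    if hi == lo:
        return [0] * len(weights)
    return [min(int((w - lo) / (hi - lo) * levels), levels - 1) for w in weights]


def discernibility_matrix(
    rows: Sequence[Sequence[int]], decisions: Sequence[str]
) -> Dict[Tuple[int, int], Set[int]]:
    """For each pair of objects with different decisions, collect the
    indices of the condition attributes on which the two objects differ."""
    entries: Dict[Tuple[int, int], Set[int]] = {}
    for i in range(len(rows)):
        for j in range(i):
            if decisions[i] != decisions[j]:
                entries[(i, j)] = {
                    k for k, (x, y) in enumerate(zip(rows[i], rows[j])) if x != y
                }
    return entries
```

Attributes that appear alone in some matrix entry cannot be dropped without merging objects from different classes, which is why a discernibility matrix is a common starting point for attribute reduction in rough set methods.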
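Funding: National Natural Science Foundation of China (11401031); Beijing Information Science and Technology University "Shipei Program" (实培计划) project, 2016-2017 academic year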
Citation:
ZHU Min-Ling, WU Hai-Meng, SHI Lei. Rough Set Rule Matching Method and its Application in Text Categorization. COMPUTER SYSTEMS APPLICATIONS, 2018, 27(4): 131-137