Received: November 04, 2017  Revised: November 27, 2017
Abstract: C4.5 is a highly influential decision tree generation algorithm, but the trees it produces have limited classification accuracy, many branches, and large size. To address these problems, we propose an improved C4.5 algorithm based on rough set theory and the CAIM criterion. The algorithm discretizes continuous attributes with a CAIM-based method, which reduces the information lost during discretization and improves classification accuracy. It then applies rough-set-based attribute reduction to the discretized samples, which eliminates redundant attributes and reduces the size of the generated decision tree. Experiments show that the algorithm effectively improves the classification accuracy of the decision trees generated by C4.5 and reduces their size.
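The two building blocks named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the standard CAIM definition (the sum over intervals of max_r^2 / M_r, divided by the number of intervals) and the standard rough-set dependency degree (size of the positive region divided by the number of samples). The function names `caim_value` and `dependency`, the interval convention, and the toy data are illustrative assumptions.

```python
from bisect import bisect_left
from collections import Counter, defaultdict


def caim_value(values, labels, cut_points):
    """CAIM score of one continuous attribute for a given set of interior cut points.

    Assumed convention: a value equal to a cut point falls into the lower interval.
    """
    cuts = sorted(cut_points)
    # quanta[r][c] = number of samples of class c that fall into interval r
    quanta = defaultdict(Counter)
    for v, c in zip(values, labels):
        quanta[bisect_left(cuts, v)][c] += 1
    n_intervals = len(cuts) + 1
    total = 0.0
    for counts in quanta.values():
        max_r = max(counts.values())  # size of the dominant class in the interval
        m_r = sum(counts.values())    # all samples in the interval
        total += max_r ** 2 / m_r
    return total / n_intervals


def dependency(rows, cond_attrs, decision):
    """Rough-set dependency degree: |positive region| / |universe|."""
    decisions = defaultdict(set)
    sizes = Counter()
    for row in rows:
        key = tuple(row[a] for a in cond_attrs)  # equivalence class of the row
        decisions[key].add(row[decision])
        sizes[key] += 1
    # An equivalence class is in the positive region if all its rows agree on the decision.
    positive = sum(sizes[k] for k, ds in decisions.items() if len(ds) == 1)
    return positive / len(rows)


# Toy data (hypothetical): the cut at 2.5 separates the classes cleanly.
values = [1, 2, 3, 4]
labels = ["a", "a", "b", "b"]
good = caim_value(values, labels, [2.5])
poor = caim_value(values, labels, [1.5])

# Toy decision table (hypothetical): attribute "y" carries no information about "d".
rows = [
    {"x": 0, "y": 0, "d": "a"},
    {"x": 0, "y": 1, "d": "a"},
    {"x": 1, "y": 0, "d": "b"},
    {"x": 1, "y": 1, "d": "b"},
]
full = dependency(rows, ["x", "y"], "d")
without_y = dependency(rows, ["x"], "d")
```

In this sketch, a discretizer would prefer the cut with the higher CAIM score, and an attribute such as `y` counts as redundant because removing it leaves the dependency degree unchanged. How the paper combines these steps with C4.5 tree construction is not specified in the abstract.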
Citation:
YU Hong-Tao, JIA Yu-Bo. C4.5 Improved Algorithm Based on Rough Set Theory and CAIM Criterion. Computer Systems Applications (计算机系统应用), 2018, 27(7): 139-144