Received: March 20, 2019; Revised: April 17, 2019
Abstract: As living standards continue to rise, the number of cancer patients keeps increasing, and lung cancer is one of the major diseases seriously endangering human health in the 21st century. This paper proposes a decision tree method for lung cancer diagnosis based on electronic medical records. It first analyzes the characteristics of lung cancer electronic medical records and the structural instability and over-fitting to which decision trees are prone, and then builds an optimized decision tree model by combining principal component analysis with the C5.0 algorithm. Two feature dimension reduction schemes are established: retaining the principal components whose eigenvalues are greater than 1, and retaining principal components until their cumulative contribution rate exceeds 85%. The decision tree is then built and pruned with the C5.0 algorithm. Finally, the data preprocessing procedure, the execution flow of the model, and the test results are given. The experimental results show that the improved algorithm has better accuracy and good scalability, which indicates that it is of significance for assisting clinical trials of lung cancer.
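The two dimension reduction criteria named in the abstract (eigenvalue greater than 1, and cumulative contribution rate greater than 85%) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `select_components`, its parameters, and the choice to run PCA on the correlation matrix of standardized features are assumptions made for illustration, and the C5.0 tree-building and pruning step is not shown.

```python
import numpy as np


def select_components(X, eig_threshold=1.0, cum_threshold=0.85):
    """Count the principal components kept under each criterion.

    Hypothetical helper: X is an (n_samples, n_features) numeric array
    with no constant columns. Returns (k_eigen, k_cumulative, eigvals).
    """
    X = np.asarray(X, dtype=float)

    # Standardize each feature so that PCA acts on the correlation matrix
    # (assumption: this is what makes the "eigenvalue > 1" rule meaningful).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = np.cov(Z, rowvar=False)

    # eigvalsh returns eigenvalues of a symmetric matrix in ascending
    # order; reverse them so the largest component comes first.
    eigvals = np.linalg.eigvalsh(corr)[::-1]

    # Criterion 1: keep components whose eigenvalue exceeds the threshold.
    k_eigen = int(np.sum(eigvals > eig_threshold))

    # Criterion 2: keep the smallest number of leading components whose
    # cumulative contribution rate exceeds the threshold.
    ratio = np.cumsum(eigvals) / eigvals.sum()
    k_cumulative = int(np.searchsorted(ratio, cum_threshold, side="right")) + 1
    k_cumulative = min(k_cumulative, len(eigvals))

    return k_eigen, k_cumulative, eigvals


if __name__ == "__main__":
    # Synthetic stand-in for numeric features extracted from medical records.
    rng = np.random.default_rng(0)
    latent = rng.normal(size=(200, 2))
    X_demo = latent @ rng.normal(size=(2, 6)) + 0.3 * rng.normal(size=(200, 6))
    k1, k2, vals = select_components(X_demo)
    print("eigenvalue > 1 keeps:", k1)
    print("cumulative contribution > 85% keeps:", k2)
```

The two criteria can retain different numbers of components on the same data, which is why the abstract treats them as two separate reduction schemes to compare before the decision tree is trained.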
Citation:
FENG Yun-Xia, ZHANG Run. Decision Tree Algorithms for Lung Cancer Diagnosis Based on Electronic Medical Record. Computer Systems Applications, 2019, 28(10): 257-263