Multi-Layer Perceptron Diabetes Prediction Model Combined with Batch Normalization
Computer Systems & Applications, 2020, Vol. 29, Issue (5): 182-188

HU Qing-Li1, HU Jian-Qiang1, YU Xiao-Yan2
1. School of Computer and Information Engineering, Xiamen University of Technology, Xiamen 361024, China;
2. School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
Foundation item: Natural Science Foundation of Fujian Province (2019J01856); CERNET Innovation Program for Next Generation of Internet (NGII20160708)
Abstract: Early detection of diabetes is of great significance for controlling the disease, preventing complications, and reducing prevalence. Existing machine-learning-based diabetes diagnosis models suffer from low accuracy owing to insufficient generalization ability. This study therefore proposes a multi-layer perceptron model combined with batch normalization, which keeps the data distribution inside the model consistent. The proposed model is trained and evaluated on the PIMA dataset. The experimental results show that the model generalizes well in early recognition of diabetes, converges quickly, and achieves high accuracy.
Key words: diabetes     machine learning     batch normalization     generalization ability

1 Introduction

2 Multi-Layer Perceptron Combined with Batch Normalization

 Fig. 1 Deep neural network model combined with batch normalization

 ${\hat x^{\left( k \right)}} = \frac{{{x^{\left( k \right)}} - E\left[ {{x^{\left( k \right)}}} \right]}}{{\sqrt {Var\left[ {{x^{\left( k \right)}}} \right]} }}$ (1)

 ${y^{\left( k \right)}} = {\gamma ^{\left( k \right)}}{\hat x^{\left( k \right)}} + {\beta ^{\left( k \right)}}$ (2)

 Fig. 2 Network structure of the multi-layer perceptron combined with batch normalization

 $\mu _{Bi}^l = \frac{1}{m}\sum\limits_{k = 1}^m {y_{ik}^l}$ (3)
 ${\left( {\sigma _B^l} \right)^2} = \frac{1}{m}\sum\limits_{k = 1}^m {{{\left( {y_{ik}^l - \mu _{Bi}^l} \right)}^2}}$ (4)
 $\hat y_{ik}^l = \frac{{y_{ik}^l - \mu _{Bi}^l}}{{\sqrt {{{\left( {\sigma _B^l} \right)}^2} + \varepsilon } }}$ (5)

 $Y_i^l = \gamma _i^l\hat y_{ik}^l + \beta _i^l$ (6)
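
Equations (3) to (6) can be traced step by step for a single neuron. The following is a minimal sketch in plain Python, not the authors' implementation; the function name `batch_norm`, the default values of `gamma`, `beta`, and `eps`, and the toy mini-batch are all assumptions made for illustration.

```python
import math

def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one neuron's outputs over a mini-batch, then scale and shift.

    `values` holds the m outputs y_ik of neuron i for the samples in the batch.
    gamma, beta and eps are illustrative defaults, not values from the paper.
    """
    m = len(values)
    mu = sum(values) / m                                   # Eq. (3): mini-batch mean
    var = sum((v - mu) ** 2 for v in values) / m           # Eq. (4): mini-batch variance
    normed = [(v - mu) / math.sqrt(var + eps) for v in values]  # Eq. (5)
    return [gamma * n + beta for n in normed]              # Eq. (6): scale and shift

# Toy mini-batch of four activations (made-up numbers).
batch = [1.0, 2.0, 3.0, 4.0]
out = batch_norm(batch)
out_mean = sum(out) / len(out)
out_var = sum((o - out_mean) ** 2 for o in out) / len(out)
```

With `gamma = 1` and `beta = 0`, the output has a mean of approximately zero and a variance slightly below one; the small gap comes from the `eps` term under the square root.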

 $Y_i^{l + 1} = \gamma _i^{l + 1}\frac{{f\left( {{\textit{z}}_i^{l + 1}} \right) - \mu _{Bi}^{l + 1}}}{{\sqrt {{{\left( {\sigma _B^{l + 1}} \right)}^2} + \varepsilon } }} + \beta _i^{l + 1}$ (7)
 ${\textit{z}}_i^{l + 1} = \sum\limits_{j = 1}^N {W_{ji}^l} Y_j^l + b_i^l$ (8)

 $y = g\left( {\sum\limits_{k = 1}^N {W_k^3} Y_k^3 + b_y^3} \right)$ (9)
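
Equations (7) to (9) describe one hidden layer followed by the output neuron. The sketch below strings them together in plain Python. It assumes ReLU for the hidden activation $f$ and a sigmoid for the output function $g$, since this text does not state which functions are used; the weights, biases, and inputs are made-up numbers, and all function names are hypothetical.

```python
import math

def relu(v):
    return max(0.0, v)                      # assumed choice for f

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))       # assumed choice for g

def dense(batch, weights, biases):
    """Eq. (8): z_i = sum_j W[j][i] * Y_j + b_i, for every sample in the batch."""
    return [[sum(row[j] * weights[j][i] for j in range(len(row))) + biases[i]
             for i in range(len(biases))]
            for row in batch]

def bn_columns(acts, gamma, beta, eps=1e-5):
    """Eq. (7): normalize each neuron (column) over the mini-batch, then scale/shift."""
    m = len(acts)
    out = [[0.0] * len(acts[0]) for _ in acts]
    for i in range(len(acts[0])):
        col = [row[i] for row in acts]
        mu = sum(col) / m
        var = sum((c - mu) ** 2 for c in col) / m
        for k in range(m):
            out[k][i] = gamma[i] * (col[k] - mu) / math.sqrt(var + eps) + beta[i]
    return out

def hidden_layer(batch, weights, biases, gamma, beta):
    z = dense(batch, weights, biases)                    # Eq. (8)
    activated = [[relu(v) for v in row] for row in z]    # f(z)
    return bn_columns(activated, gamma, beta)            # Eq. (7)

def output_layer(batch, weights, bias):
    """Eq. (9): y = g(sum_k W_k * Y_k + b) for every sample."""
    return [sigmoid(sum(w * y for w, y in zip(weights, row)) + bias) for row in batch]

# Three samples with two features each (toy values).
x = [[0.5, 1.0], [1.5, -0.5], [-1.0, 2.0]]
hidden = hidden_layer(x,
                      weights=[[0.2, -0.4], [0.7, 0.1]],
                      biases=[0.0, 0.1],
                      gamma=[1.0, 1.0], beta=[0.0, 0.0])
probs = output_layer(hidden, weights=[0.6, -0.3], bias=0.05)
```

Each entry of `probs` lies strictly between 0 and 1 and can be read as a predicted probability for the positive class under the sigmoid assumption.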

3 Diabetes Diagnosis Model

 Fig. 3 Flowchart for building the diabetes diagnosis model

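(1) Define the number of hidden layers and the number of neurons. Starting from an initial number of hidden layers and neurons, add or remove layers and neurons according to the subsequent training and test results until the model performs well.

(2) Start training and repeat until the requirements are met. Feed the previously partitioned training set into the network, holding out 10% of the training set as a validation set for fine-tuning the model's hyperparameters. Training ends when the training error reaches its minimum or stops changing.

(3) Save the network and trained parameters that meet the requirements for use on the test set. The network trained in step (2) is saved, and its performance is evaluated on the test set.

(4) Repeat steps (1) to (3) to find the optimal parameter combination.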
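
The loop formed by steps (1) to (4) can be outlined as a small search over candidate architectures. This is only a sketch under stated assumptions: `split_validation`, `search`, the candidate layer sizes, and the stand-in `toy_score` function are hypothetical, and a real run would replace `toy_score` with code that trains the network and returns its validation accuracy.

```python
import random

def split_validation(samples, val_fraction=0.1, seed=0):
    """Hold out a fraction of the training set for validation (step 2)."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_val = max(1, round(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]

def search(train_set, candidates, train_and_score):
    """Try each hidden-layer configuration and keep the best one (steps 1 and 4).

    `train_and_score(config, train_part, val_part)` is a caller-supplied,
    hypothetical function that trains a model and returns a validation score.
    """
    train_part, val_part = split_validation(train_set)
    best_config, best_score = None, float("-inf")
    for config in candidates:
        score = train_and_score(config, train_part, val_part)
        if score > best_score:
            best_config, best_score = config, score   # step 3: keep the best so far
    return best_config, best_score

# Stand-in scorer so the sketch runs; it does NOT train anything.
def toy_score(config, train_part, val_part):
    return -abs(sum(config) - 24)

data = list(range(100))                       # placeholder for training samples
candidates = [(8,), (16, 8), (32, 16, 8)]     # neurons per hidden layer (assumed)
train_part, val_part = split_validation(data)
best_config, best_score = search(data, candidates, toy_score)
```

The sketch selects the configuration on the validation score and leaves the test set untouched until the final evaluation, which keeps the test accuracy an unbiased estimate.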

4 Experiments

4.1 Training Dataset

4.2 Data Preprocessing

 ${\textit{z}} = \frac{{x - u}}{s}$ (10)
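
Equation (10) has the form of z-score standardization. The sketch below assumes that $u$ is the mean of a feature and $s$ its (population) standard deviation, which this text does not spell out; the function name `standardize` and the sample values are illustrative and are not taken from the dataset.

```python
import statistics

def standardize(column):
    """Apply z = (x - u) / s to every value of one feature column."""
    u = statistics.fmean(column)      # assumed: u is the feature mean
    s = statistics.pstdev(column)     # assumed: s is the population standard deviation
    return [(x - u) / s for x in column]

# Made-up feature values with mean 5 and standard deviation 2.
feature = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z = standardize(feature)
```

After the transform the column has zero mean and unit variance, so features measured on different scales contribute comparably to the network's inputs.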

 Fig. 4 Box plot

 Fig. 5 Box plots of the first 8 attributes in PIDD

4.3 Model Training

4.4 Experimental Validation and Result Analysis

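The changes in training and validation accuracy and loss during training for the three groups of experiments are shown in Fig. 6. In Fig. 6(a), the training and validation losses stabilize at around epoch 175 and nearly coincide, indicating that the network fits well. In Fig. 6(b), the training and validation losses stabilize between epochs 300 and 350, slightly later than in group (a). In Fig. 6(c), training clearly converges far more slowly than in group (a), which shows that the diabetes diagnosis model built on the multi-layer perceptron combined with batch normalization converges significantly faster.

The results of the three groups of experiments are shown in Table 3, which records the training accuracy, test accuracy, and AUC for each group. The data in the table show that group (a) performs best.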
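
For readers unfamiliar with the two metrics, the sketch below computes accuracy and AUC from predicted probabilities in plain Python. It is an illustration only: the labels and scores are made-up numbers rather than results from the paper, the 0.5 decision threshold is an assumption, and AUC is computed here through its pairwise-ranking interpretation rather than by any particular library.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def auc(labels, scores):
    """Probability that a random positive is scored above a random negative.

    Ties count as half a win (the Mann-Whitney formulation of AUC).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy labels and predicted probabilities (made-up values).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
acc = accuracy(labels, scores)
area = auc(labels, scores)
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two are commonly reported together for a binary classifier.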

5 Conclusion

 Fig. 6 Changes in training and validation accuracy and loss during training for the three groups of experiments
