Multi-Layer Perceptron Diabetes Prediction Model Combined with Batch Normalization
HU Qing-Li1, HU Jian-Qiang1, YU Xiao-Yan2
1. School of Computer and Information Engineering, Xiamen University of Technology, Xiamen 361024, China;
2. School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
Abstract: Early detection of diabetes is of great significance for controlling the disease, preventing complications, and reducing prevalence. Existing machine-learning-based diabetes diagnosis models have limited precision because their generalization ability is insufficient. This study therefore proposes a multi-layer perceptron model combined with batch normalization, which keeps the data distribution consistent within the model. The proposed model is trained and evaluated on the PIMA dataset. Experimental results show that the model generalizes well in the early recognition of diabetes, converges quickly, and achieves high accuracy.
1 Introduction

2 Multi-Layer Perceptron Combined with Batch Normalization

 Figure 1 Deep neural network model combined with batch normalization

 ${\hat x^{\left( k \right)}} = \frac{{{x^{\left( k \right)}} - E\left[ {{x^{\left( k \right)}}} \right]}}{{\sqrt {Var\left[ {{x^{\left( k \right)}}} \right]} }}$ (1)

 ${y^{\left( k \right)}} = {\gamma ^{\left( k \right)}}{\hat x^{\left( k \right)}} + {\beta ^{\left( k \right)}}$ (2)

 Figure 2 Network structure of the multi-layer perceptron combined with batch normalization

 $\mu _{Bi}^l = \frac{1}{m}\sum\limits_{k = 1}^m {y_{ik}^l}$ (3)
 ${\left( {\sigma _B^l} \right)^2} = \frac{1}{m}\sum\limits_{k = 1}^m {{{\left( {y_{ik}^l - \mu _{Bi}^l} \right)}^2}}$ (4)
 $\hat y_{ik}^l = \frac{{y_{ik}^l - \mu _{Bi}^l}}{{\sqrt {{{\left( {\sigma _B^l} \right)}^2} + \varepsilon } }}$ (5)

 $Y_i^l = \gamma _i^l\hat y_{ik}^l + \beta _i^l$ (6)
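Equations (3) to (6) can be traced step by step in a few lines of code. The following is a minimal sketch, not the authors' implementation: it assumes a mini-batch stored as a list of m samples, each a list of unit outputs, and the function name `batch_norm`, the default value of `eps`, and the toy numbers are all illustrative choices.

```python
import math


def batch_norm(batch, gamma, beta, eps=1e-5):
    """Apply Eqs. (3)-(6) to a mini-batch given as a list of m samples,
    each a list of n unit outputs. gamma and beta hold one value per unit."""
    m, n = len(batch), len(batch[0])
    out = [[0.0] * n for _ in range(m)]
    for i in range(n):
        column = [sample[i] for sample in batch]
        mu = sum(column) / m                             # Eq. (3): batch mean
        var = sum((y - mu) ** 2 for y in column) / m     # Eq. (4): biased variance (1/m)
        for k in range(m):
            y_hat = (column[k] - mu) / math.sqrt(var + eps)  # Eq. (5): normalize
            out[k][i] = gamma[i] * y_hat + beta[i]           # Eq. (6): scale and shift
    return out


# Toy mini-batch (made-up numbers): 2 samples, 2 units.
batch = [[1.0, 2.0], [3.0, 6.0]]
normalized = batch_norm(batch, gamma=[1.0, 1.0], beta=[0.0, 0.0])
```

With `gamma` set to 1 and `beta` set to 0, each unit's outputs come out with approximately zero mean and unit variance over the batch; the learned values of `gamma` and `beta` then let the network restore whatever scale and offset suit the next layer.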

 $Y_i^{l + 1} = \gamma _i^{l + 1}\frac{{f\left( {{\textit{z}}_i^{l + 1}} \right) - \mu _{Bi}^{l + 1}}}{{\sqrt {{{\left( {\sigma _B^{l + 1}} \right)}^2} + \varepsilon } }} + \beta _i^{l + 1}$ (7)
 ${\textit{z}}_i^{l + 1} = \sum\limits_{j = 1}^N {W_{ji}^l} Y_j^l + b_i^l$ (8)

 $y = g\left( {\sum\limits_{k = 1}^N {W_k^3} Y_k^3 + b_y^3} \right)$ (9)
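The forward pass described by Eqs. (7) to (9) can be sketched as follows. Note that Eq. (7) applies batch normalization to the activated value f(z), that is, after the activation function. The sketch assumes f is ReLU and g is a sigmoid, since the equations leave both unspecified here; all function names, weights, and inputs are hypothetical.

```python
import math


def relu(v):
    return max(0.0, v)


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(batch, W, b):
    """Eq. (8): z_i = sum_j W[j][i] * Y_j + b[i] for every sample in the batch."""
    return [
        [sum(W[j][i] * sample[j] for j in range(len(sample))) + b[i]
         for i in range(len(b))]
        for sample in batch
    ]


def batch_norm(batch, gamma, beta, eps=1e-5):
    """Normalize each unit over the mini-batch, then scale and shift."""
    m, n = len(batch), len(batch[0])
    out = [[0.0] * n for _ in range(m)]
    for i in range(n):
        mu = sum(row[i] for row in batch) / m
        var = sum((row[i] - mu) ** 2 for row in batch) / m
        for k in range(m):
            out[k][i] = gamma[i] * (batch[k][i] - mu) / math.sqrt(var + eps) + beta[i]
    return out


def hidden_layer(batch, W, b, gamma, beta):
    """Eq. (7): batch-normalize the activated values f(z)."""
    z = affine(batch, W, b)
    activated = [[relu(v) for v in row] for row in z]   # f assumed to be ReLU
    return batch_norm(activated, gamma, beta)


def output_layer(batch, w, b_y):
    """Eq. (9): y = g(sum_k w[k] * Y_k + b_y), with g assumed to be a sigmoid."""
    return [sigmoid(sum(wk * yk for wk, yk in zip(w, row)) + b_y) for row in batch]


# Toy example (made-up numbers): 4 samples, 2 inputs, 3 hidden units.
X = [[0.5, -1.0], [1.5, 2.0], [-0.5, 0.0], [2.0, 1.0]]
W = [[0.2, -0.4, 0.1], [0.3, 0.5, -0.2]]
b = [0.0, 0.1, -0.1]
H = hidden_layer(X, W, b, gamma=[1.0, 1.0, 1.0], beta=[0.0, 0.0, 0.0])
probs = output_layer(H, w=[0.3, -0.2, 0.5], b_y=0.0)
```

Each entry of `probs` lies strictly between 0 and 1 and can be read as the predicted probability of the positive class for the corresponding sample.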

3 Diabetes Diagnosis Model

 Figure 3 Flowchart for building the diabetes diagnosis model

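(1) Define the number of hidden layers and the number of neurons. Starting from an initial number of hidden layers and neurons, increase or decrease both according to the subsequent training and test results until the model performs satisfactorily.

(2) Start training and repeat it until the requirements are met. Feed the previously partitioned training set into the network, holding out 10% of the training set as a validation set for fine-tuning the model's hyperparameters. Training ends when the training error reaches its minimum or no longer changes.

(3) Save the network and training parameters that meet the requirements for use on the test set. The network trained in step (2) is saved, and its performance is evaluated on the test set.

(4) Repeat steps (1) to (3) to find the optimal parameter combination.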
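The procedure in steps (1) to (4) can be sketched as three small helpers: a 10% hold-out split of the training set, a stopping rule for "the error no longer changes", and a loop over candidate layer and neuron counts. This is only an outline under stated assumptions: the helper names, the `tol` and `patience` values, and the candidate grid are hypothetical, and `placeholder_score` is an arbitrary stand-in for actually training and validating a network.

```python
import random


def hold_out_validation(train_rows, val_fraction=0.1, seed=0):
    """Step (2): shuffle the training set and hold out a fraction for validation."""
    rows = list(train_rows)
    random.Random(seed).shuffle(rows)
    n_val = max(1, round(len(rows) * val_fraction))
    return rows[n_val:], rows[:n_val]


def should_stop(loss_history, tol=1e-4, patience=5):
    """Step (2): stop once the best loss of the last `patience` epochs improves
    on the best earlier loss by less than `tol`."""
    if len(loss_history) <= patience:
        return False
    best_before = min(loss_history[:-patience])
    best_recent = min(loss_history[-patience:])
    return best_before - best_recent < tol


def search(candidates, fit_and_validate):
    """Steps (1) and (4): try each configuration and keep the best-scoring one."""
    best_cfg, best_score = None, None
    for cfg in candidates:
        score = fit_and_validate(cfg)
        if best_score is None or score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score


rows = list(range(100))                      # stand-in for 100 training samples
fit_rows, val_rows = hold_out_validation(rows)

# Candidate (hidden layers, neurons per layer) pairs; the grid is illustrative.
candidates = [(layers, units) for layers in (1, 2, 3) for units in (8, 16, 32)]


def placeholder_score(cfg):
    """Arbitrary made-up score so the loop runs; replace with real training."""
    layers, units = cfg
    return -(abs(layers - 2) + abs(units - 16) / 16)


best_cfg, best_score = search(candidates, placeholder_score)
```

In a real run, `fit_and_validate` would train the network on `fit_rows`, call `should_stop` on the recorded losses after each epoch, and return the validation accuracy; the test set stays untouched until step (3).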

4 Experiments

4.1 Training Dataset

4.2 Data Preprocessing

 ${\textit{z}} = \frac{{x - u}}{s}$ (10)
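Equation (10) is the usual z-score standardization, with u the mean and s the standard deviation of an attribute. The sketch below assumes the population standard deviation (dividing by the number of samples) and assumes that the statistics are computed on the training data and then reused for unseen data; neither detail is stated by the equation itself. The function names and numbers are illustrative and are not PIDD values.

```python
import statistics


def fit_standardizer(values):
    """Return (u, s): the mean and population standard deviation of one attribute."""
    u = statistics.fmean(values)
    s = statistics.pstdev(values)
    return u, s


def standardize(values, u, s):
    """Apply Eq. (10) to every value; a constant column (s == 0) maps to zeros."""
    if s == 0:
        return [0.0 for _ in values]
    return [(x - u) / s for x in values]


train = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # made-up attribute values
u, s = fit_standardizer(train)
z_train = standardize(train, u, s)
z_new = standardize([6.0], u, s)   # reuse the training statistics for unseen data
```

Reusing `u` and `s` from the training data keeps information about the test set from leaking into preprocessing.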

 Figure 4 Box plot

 Figure 5 Box plots of the first 8 attributes in PIDD
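The quantities a box plot displays can be computed directly. The sketch below uses the common convention that points beyond 1.5 times the interquartile range from the quartiles are drawn as outliers; whether the study used exactly this rule, or this quartile method, is an assumption. The function name and sample values are illustrative and are not taken from PIDD.

```python
import statistics


def box_plot_summary(values, whisker=1.5):
    """Quartiles, fences, and outliers as a box plot would show them."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower_fence = q1 - whisker * iqr
    upper_fence = q3 + whisker * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "outliers": outliers,
    }


values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]   # made-up data with one extreme value
summary = box_plot_summary(values)
```

Here `summary["outliers"]` contains only the extreme value 100, which is the kind of point a box plot draws beyond its whiskers.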

4.3 Model Training

4.4 Experimental Validation and Result Analysis

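Figure 6 shows how the training and validation accuracy and loss change during training for the three groups of experiments. In Figure 6(a), the training and validation losses level off at around epoch 175 and nearly coincide, indicating that the network fits well. In Figure 6(b), the training and validation losses have stabilized between epochs 300 and 350, so convergence is somewhat slower than in group (a). In Figure 6(c), training clearly converges far more slowly than in group (a). This shows that the diabetes diagnosis model built with the multi-layer perceptron combined with batch normalization converges significantly faster.

The results of the three groups of experiments are shown in Table 3, which records the training accuracy, test accuracy, and AUC of each group. The data in the table show that group (a) performs best.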
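For reference, the two kinds of metric reported here, accuracy and AUC, can be computed as in the sketch below. It assumes binary labels (1 for diabetic, 0 otherwise) and a 0.5 decision threshold for accuracy, and it computes AUC as the fraction of positive and negative pairs that the model ranks correctly. The function names and the toy labels and scores are illustrative and are not results from the paper.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def auc(labels, scores):
    """Probability that a random positive sample scores higher than a random
    negative one, counting ties as one half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


labels = [0, 0, 1, 1]              # toy ground truth
scores = [0.1, 0.4, 0.35, 0.8]     # toy predicted probabilities
acc = accuracy(labels, scores)
area = auc(labels, scores)
```

Unlike accuracy, AUC does not depend on the decision threshold, which is one reason to report both.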

5 Conclusion

 Figure 6 Changes in training and validation accuracy and loss during training for the three groups of experiments

