Improved Batch Normalization Algorithm for Deep Learning

Funding: Key Scientific Research Platform Project of the Department of Education of Guangdong Province (2017GWTSCX064)
Abstract:

Data collected in practice must adapt to real engineering requirements and to the diverse classification forms of fine-grained information, so it is difficult to keep sample data fully independent and identically distributed (i.i.d.). Non-i.i.d. data severely degrade the robustness of deep neural network training and the generalization performance on specific tasks. To reduce the adverse effects of non-i.i.d. data during model training and inference, this study proposes an improved batch normalization algorithm. Before training begins, the algorithm draws a small fixed batch from the dataset and normalizes it; the resulting mean and variance serve as reference values for updating the statistics of the other batches during training. Experimental results show that the improved algorithm accelerates the training convergence of the neural network model to some extent and, compared with the BN algorithm, reduces the classification error rate by 0.3%, improving the robustness of training. On object detection and instance segmentation tasks, pre-trained models built with the improved algorithm effectively improve the generalization performance of some detection frameworks.
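The procedure described in the abstract can be sketched as follows. This is a minimal, single-feature illustration, not the paper's implementation: the blending coefficient `alpha` and the exact rule for combining reference and current-batch statistics are assumptions, since only the abstract is available here.

```python
import random

def batch_stats(batch):
    """Mean and population variance of a 1-D batch of scalars."""
    m = sum(batch) / len(batch)
    v = sum((x - m) ** 2 for x in batch) / len(batch)
    return m, v

def make_reference_bn(reference_batch, alpha=0.9, eps=1e-5):
    """Improved-BN sketch: a fixed reference batch, drawn once before
    training, anchors the statistics used to normalize later batches."""
    ref_mean, ref_var = batch_stats(reference_batch)

    def normalize(batch):
        b_mean, b_var = batch_stats(batch)
        # Blend current-batch statistics toward the reference values.
        # (This blending rule is an assumption for illustration; the
        # paper specifies the actual update.)
        mean = alpha * ref_mean + (1 - alpha) * b_mean
        var = alpha * ref_var + (1 - alpha) * b_var
        return [(x - mean) / (var + eps) ** 0.5 for x in batch]

    return normalize

# Draw a reference batch before training, then normalize other batches.
random.seed(0)
ref = [random.gauss(0.0, 1.0) for _ in range(64)]
bn = make_reference_bn(ref)
out = bn([random.gauss(0.5, 2.0) for _ in range(16)])
```

Because every batch is pulled toward the same reference statistics, batches that are not identically distributed are normalized more consistently than with per-batch statistics alone, which is the intuition behind the reported robustness gain.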

Cite this article:

Luo Guoqiang, Li Jiahua, Zuo Wentao. Improved batch normalization algorithm for deep learning. Computer Systems & Applications, 2020, 29(4): 187-194.
History
  • Received: 2019-09-05
  • Revised: 2019-10-08
  • Published online: 2020-04-09
  • Published: 2020-04-15