Improved Batch Normalization Algorithm for Deep Learning

Funding: Key Scientific Research Platform Project of the Department of Education of Guangdong Province (2017GWTSCX064)
Abstract:

Data collected in practice must adapt to real engineering requirements and to the diverse classification forms of fine-grained information, so it is difficult to keep sample data fully independent and identically distributed (i.i.d.). Non-i.i.d. data severely degrade the robustness of deep neural network training and the generalization performance on specific tasks. To reduce the adverse effects of non-i.i.d. data during model training and inference, this study proposes an improved batch normalization algorithm. Before training begins, the algorithm draws a small fixed batch from the dataset and normalizes it; the resulting mean and variance serve as reference values for updating the statistics of the other batches during training. Experimental results show that the improved algorithm accelerates the training convergence of the neural network model to some extent and, compared with the BN algorithm, reduces the classification error rate by 0.3%, improving the robustness of training. On object detection and instance segmentation tasks, pre-trained models built with the improved algorithm effectively improve the generalization performance of some detection frameworks.
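The procedure described in the abstract can be sketched as follows. This is a minimal, single-feature illustration, not the paper's implementation: the blending coefficient `alpha` and the exact rule for combining reference and current-batch statistics are assumptions, since only the abstract is available here.

```python
import random

def batch_stats(batch):
    """Mean and population variance of a 1-D batch of scalars."""
    m = sum(batch) / len(batch)
    v = sum((x - m) ** 2 for x in batch) / len(batch)
    return m, v

def make_reference_bn(reference_batch, alpha=0.9, eps=1e-5):
    """Improved-BN sketch: a fixed reference batch, drawn once before
    training, anchors the statistics used to normalize later batches."""
    ref_mean, ref_var = batch_stats(reference_batch)

    def normalize(batch):
        b_mean, b_var = batch_stats(batch)
        # Blend current-batch statistics toward the reference values.
        # (This blending rule is an assumption for illustration; the
        # paper specifies the actual update.)
        mean = alpha * ref_mean + (1 - alpha) * b_mean
        var = alpha * ref_var + (1 - alpha) * b_var
        return [(x - mean) / (var + eps) ** 0.5 for x in batch]

    return normalize

# Draw a reference batch before training, then normalize other batches.
random.seed(0)
ref = [random.gauss(0.0, 1.0) for _ in range(64)]
bn = make_reference_bn(ref)
out = bn([random.gauss(0.5, 2.0) for _ in range(16)])
```

Because every batch is pulled toward the same reference statistics, batches that are not identically distributed are normalized more consistently than with per-batch statistics alone, which is the intuition behind the reported robustness gain.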

Cite this article:

Luo Guoqiang, Li Jiahua, Zuo Wentao. Improved batch normalization algorithm for deep learning. Computer Systems & Applications, 2020, 29(4): 187-194.
History
  • Received: 2019-09-05
  • Revised: 2019-10-08
  • Published online: 2020-04-09
  • Published: 2020-04-15