Optimization of CNN Computing Task Partition Based on Many-Core BWDSP
    Abstract:

    Convolutional Neural Networks (CNNs), one of the principal deep learning algorithms, have been applied in many fields. Because network models are large in scale, structurally complex, and process large volumes of data, it is necessary to reduce their demand for computational resources. A data-parallel strategy is generally used to partition and execute computing tasks over large data sets. However, applying a data-parallel strategy without considering the characteristics of the computing tasks results in a high volume of data transmission. It is therefore essential to design a reasonable data-partitioning strategy that reduces data transmission, based on an analysis of the network structure and the computing characteristics of the CNN. This paper first reviews the optimization of computing tasks in deep learning accelerators. It then introduces the architecture of a deep learning accelerator based on the many-core BWDSP, designs a computing-partition strategy, and compares and analyzes experimental results on VGGNet-16. The results show that the proposed optimization algorithm significantly improves data-transmission performance and reduces the amount of data transmitted.
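    To illustrate why the partitioning scheme matters for transmission volume, the following back-of-the-envelope sketch compares per-core input traffic for two ways of splitting one convolution layer across cores. This is not the paper's algorithm: the layer sizes, the full-replication baseline, and the row-band-with-halo scheme are all illustrative assumptions.

    ```python
    # Hypothetical sketch: input-data traffic for two partitionings of one
    # convolution layer across n_cores cores. Sizes are VGG-style guesses,
    # not figures from the paper.

    def traffic_full_replication(h, w, c_in, n_cores, bytes_per=4):
        """Naive data parallelism: every core receives the entire input map."""
        return n_cores * h * w * c_in * bytes_per

    def traffic_row_partition(h, w, c_in, n_cores, k=3, bytes_per=4):
        """Each core receives only its band of rows, plus a halo of k//2
        boundary rows on each side needed by a k x k convolution."""
        halo = k // 2
        rows_per_core = h / n_cores
        return int(n_cores * (rows_per_core + 2 * halo) * w * c_in * bytes_per)

    # A layer roughly the size of a mid-stage VGGNet-16 convolution.
    full = traffic_full_replication(56, 56, 256, n_cores=8)
    banded = traffic_row_partition(56, 56, 256, n_cores=8)
    print(full, banded, round(full / banded, 2))
    ```

    Under these assumptions the banded partition moves several times less input data than full replication, which is the kind of reduction a task-aware partitioning strategy targets.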

Citation:

Wang G, Zheng QL, Deng WQ, Yang JP, Lu MH. Optimization of CNN computing task partition based on many-core BWDSP. Computer Systems & Applications, 2019, 28(9): 88-94. (in Chinese)
History
  • Received: February 28, 2019
  • Revised: March 14, 2019
  • Online: September 09, 2019
  • Published: September 15, 2019
Copyright: Institute of Software, Chinese Academy of Sciences
