Received: June 22, 2019; Revised: July 16, 2019
Abstract: As one of the most influential network architectures in deep learning, the convolutional neural network (CNN) keeps growing deeper and more complex, placing ever higher demands on hardware computing capability and prompting the emergence of dedicated neural network processors. To compare such processors objectively and to guide software and hardware optimization, this study proposes macrobenchmarks and microbenchmarks for CNNs. The macrobenchmarks comprise mainstream CNN models and are used to evaluate and compare processor performance from multiple angles; the microbenchmarks comprise the core layers of those networks and are used to locate performance bottlenecks at fine granularity and guide optimization. To characterize how the benchmarks behave on real hardware platforms, the study selects three system-level metrics (I/O wait latency, cross-node communication latency, and CPU utilization) and a set of microarchitecture-level metrics (IPC, branch prediction, back-end resource contention, and memory access behavior). Based on the evaluation results, the study offers recommendations for processor hardware design and architectural improvement.
Keywords: convolutional neural network; network layer; benchmark; performance analysis; microarchitecture
Funding: National Key Research and Development Program of China (2016YFB1000403); Fundamental Research Funds for the Central Universities (YD2150002001)
Citation:
徐青青,安虹,武铮,金旭.主流卷积神经网络的硬件设计与性能分析.计算机系统应用,2020,29(2):49-57
XU Qing-Qing, AN Hong, WU Zheng, JIN Xu. Hardware Design and Performance Analysis of Mainstream Convolutional Neural Networks. COMPUTER SYSTEMS APPLICATIONS, 2020, 29(2): 49-57