Post-training Mixed-accuracy Quantization Algorithm Based on Pyramid-pooled Weight Imprinting

Fund Project:

Regional Innovation Capability Guidance Program of the Shaanxi Provincial Department of Science and Technology (2022QFY01-14)

Abstract:

Model quantization is widely used for fast inference and deployment of deep neural network models. Post-training quantization has attracted considerable attention because it needs little retraining time and incurs little performance loss. However, most existing post-training quantization methods either rely on theoretical assumptions or assign fixed bit-widths to the network layers, which causes significant performance loss in the quantized network, especially at low bit-widths. To improve the accuracy of post-training quantized models, this study proposes a post-training mixed-precision quantization method (MSQ). The method inserts a task predictor module, which combines a pyramid pooling module with weight imprinting, after each layer of the network model and uses it to estimate the accuracy attainable at that layer. These estimates are used to assess the importance of each layer, and the quantization bit-width of each layer is then set according to its importance. Experiments show that MSQ outperforms several existing mixed-precision quantization methods on multiple popular network architectures, and that the quantized models achieve better performance and lower latency when tested on edge hardware devices.
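The abstract does not give implementation details, so the following is only a minimal sketch of the general idea, not the authors' MSQ implementation. It assumes an average-pooling pyramid, a cosine-similarity classifier built by weight imprinting (class weights taken as normalized class means, with no gradient training), and a hypothetical importance rule in which a layer's importance is the accuracy gain of its probe over the previous layer's probe. All function names, the bit-width choices (8, 6, 4), and the toy data are illustrative assumptions.

```python
import math

def pyramid_pool(fmap, levels=(1, 2)):
    """Average-pool each channel of fmap (C x H x W nested lists) over an
    n x n grid for every n in levels and concatenate the bin means.
    Assumes H >= n and W >= n so that no bin is empty."""
    feats = []
    for channel in fmap:
        h, w = len(channel), len(channel[0])
        for n in levels:
            for i in range(n):
                r0, r1 = i * h // n, (i + 1) * h // n
                for j in range(n):
                    c0, c1 = j * w // n, (j + 1) * w // n
                    cells = [channel[r][c] for r in range(r0, r1)
                             for c in range(c0, c1)]
                    feats.append(sum(cells) / len(cells))
    return feats

def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def imprint_weights(embeddings, labels, num_classes):
    """Weight imprinting: each class weight is the normalized sum (equivalently,
    the normalized mean) of that class's normalized embeddings."""
    dim = len(embeddings[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    for emb, y in zip(embeddings, labels):
        for k, x in enumerate(l2_normalize(emb)):
            sums[y][k] += x
    return [l2_normalize(s) for s in sums]

def probe_accuracy(embeddings, labels, weights):
    """Classify by highest cosine similarity to the imprinted class weights."""
    correct = 0
    for emb, y in zip(embeddings, labels):
        e = l2_normalize(emb)
        scores = [sum(a * b for a, b in zip(e, w)) for w in weights]
        correct += scores.index(max(scores)) == y
    return correct / len(labels)

def allocate_bits(layer_acc, bit_choices=(8, 6, 4)):
    """Hypothetical rule (an assumption, not from the paper): rank layers by
    accuracy gain over the previous layer and spread the ranks evenly over
    bit_choices, giving the widest bit-width to the largest gains."""
    gains = [a - p for a, p in zip(layer_acc, [0.0] + layer_acc[:-1])]
    order = sorted(range(len(gains)), key=lambda i: -gains[i])
    bits = [0] * len(gains)
    for rank, i in enumerate(order):
        bits[i] = bit_choices[rank * len(bit_choices) // len(gains)]
    return bits

# Toy data: four single-channel 2x2 feature maps, two per class.
fmaps = [
    [[[1.0, 0.0], [0.0, 0.0]]],  # class 0: activation in the top-left cell
    [[[0.9, 0.1], [0.0, 0.0]]],
    [[[0.0, 0.0], [0.0, 1.0]]],  # class 1: activation in the bottom-right cell
    [[[0.0, 0.0], [0.1, 0.9]]],
]
labels = [0, 0, 1, 1]
embs = [pyramid_pool(f) for f in fmaps]
weights = imprint_weights(embs, labels, num_classes=2)
acc = probe_accuracy(embs, labels, weights)          # 1.0 on this toy data
bits = allocate_bits([0.30, 0.55, 0.60, 0.90])       # [8, 6, 4, 8]
print(acc, bits)
```

In a real pipeline the feature maps would come from the intermediate activations of the pretrained network on a small calibration set, and the per-layer probe accuracies passed to `allocate_bits` would be measured rather than hard-coded as they are here.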

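Cite this article

张瑞轩, 赵宇峰, 徐飞, 禹婷婷, 张乐怡. Post-training Mixed-accuracy Quantization Algorithm Based on Pyramid-pooled Weight Imprinting. Computer Systems & Applications (计算机系统应用), 2024, 33(12): 161-169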

History
  • Received: 2024-05-29
  • Last revised: 2024-06-26
  • Published online: 2024-10-31