Received: June 20, 2022 Revised: July 18, 2022
Abstract: The dot product is a Level 1 routine in the BLAS library and is widely used in scientific computing and related fields. Because floating-point arithmetic introduces rounding errors, the double-precision dot product in existing BLAS libraries cannot meet the accuracy requirements of some applications, so higher-precision algorithms are needed for more accurate and reliable results. Targeting the domestic SW1621 platform, this study adds a high-precision dot product interface to an existing BLAS library to meet those requirements. The high-precision dot product algorithm is also hand-optimized at the assembly level using loop unrolling, memory access optimization, and instruction reordering. Experimental results show that the high-precision algorithm achieves roughly twice the precision of the double-precision dot product, effectively improving on the accuracy of the original algorithm. With that accuracy gain preserved, the optimized high-precision dot product function achieves an average speedup of 1.61 over the unoptimized version.
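The abstract does not name the high-precision algorithm it uses. As an illustration only, the sketch below assumes a compensated dot product built from error-free transformations (often called Dot2), which is one well-known way to obtain a result about as accurate as if it had been computed in twice the working precision. The function names `two_sum`, `split`, `two_prod`, `dot2`, and `naive_dot` are hypothetical and are not taken from the paper or from any BLAS interface.

```python
# Illustrative sketch only: a compensated dot product (Dot2-style).
# Assumption: Python floats are IEEE 754 doubles with round-to-nearest.

def two_sum(a, b):
    """Return (s, err) with s = fl(a + b) and a + b = s + err exactly."""
    s = a + b
    bb = s - a
    err = (a - (s - bb)) + (b - bb)
    return s, err

def split(a):
    """Veltkamp split: a = hi + lo, each half fitting in about 26 bits."""
    c = 134217729.0 * a  # 2**27 + 1
    hi = c - (c - a)
    lo = a - hi
    return hi, lo

def two_prod(a, b):
    """Return (p, err) with p = fl(a * b) and a * b = p + err exactly
    (barring overflow and underflow)."""
    p = a * b
    ah, al = split(a)
    bh, bl = split(b)
    err = al * bl - (((p - ah * bh) - al * bh) - ah * bl)
    return p, err

def dot2(x, y):
    """Compensated dot product: accumulate the rounding errors of every
    multiply and add separately, then fold them in at the end."""
    s, c = 0.0, 0.0
    for xi, yi in zip(x, y):
        p, ep = two_prod(xi, yi)
        s, es = two_sum(s, p)
        c += ep + es
    return s + c

def naive_dot(x, y):
    """Plain double-precision dot product, for comparison."""
    s = 0.0
    for xi, yi in zip(x, y):
        s += xi * yi
    return s

x = [1e16, 1.0, -1e16]
y = [1.0, 1.0, 1.0]
print(naive_dot(x, y), dot2(x, y))
```

In this small example the plain loop loses the contribution of the middle term to cancellation, while the compensated version recovers it. A hand-tuned assembly kernel such as the one the abstract describes would typically unroll this loop and interleave independent operations, which this sketch does not attempt.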
Citation:
XU Fang-Jie, WANG Lei, WANG Yi-Zhuo, ZHANG Ya-Guang. Implementation and Optimization of High-precision Dot Product Algorithm Based on SW1621 Processor. Computer Systems Applications, 2023, 32(2): 400-405