Optimization of BLAS Level 1 Functions on SW1621 Processor

    Abstract:

    The Basic Linear Algebra Subprograms (BLAS) are a standard set of routines for basic linear algebra operations. The library is divided into three levels, which provide vector-vector operations (level 1), matrix-vector operations (level 2), and matrix-matrix operations (level 3). In this paper, we study how to optimize BLAS level 1 functions on the SW1621 processor. Taking the AXPY function as an example, we exploit the architectural characteristics of the platform to optimize its performance and design an automatic thread allocation scheme. Experimental results show that, compared with the reference implementation in GotoBLAS, the optimized AXPY function achieves a single-core speedup of up to 4.36 and a multi-core speedup of up to 9.50. Each of the optimization schemes contributes to the performance improvement.
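    For readers unfamiliar with AXPY, the operation updates a vector as y = alpha * x + y. The following is a minimal, generic C sketch of it, not the paper's SW1621 implementation: it shows a plain reference loop, a 4-way unrolled variant of the kind that commonly precedes vectorization, and a thread-count heuristic. The function names `daxpy_ref`, `daxpy_unrolled`, and `pick_threads`, as well as the `min_chunk` parameter, are assumptions made for illustration; the abstract does not describe the actual kernel or thread allocation rule.

```c
#include <assert.h>
#include <stddef.h>

/* Reference AXPY for unit-stride vectors: y[i] = alpha * x[i] + y[i]. */
void daxpy_ref(size_t n, double alpha, const double *x, double *y)
{
    for (size_t i = 0; i < n; i++)
        y[i] += alpha * x[i];
}

/* Same operation with the loop unrolled by 4 (illustrative only).
 * A platform-specific kernel would typically replace the body with
 * SIMD instructions; none are used here. */
void daxpy_unrolled(size_t n, double alpha, const double *x, double *y)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        y[i]     += alpha * x[i];
        y[i + 1] += alpha * x[i + 1];
        y[i + 2] += alpha * x[i + 2];
        y[i + 3] += alpha * x[i + 3];
    }
    /* Handle the remaining 0 to 3 elements. */
    for (; i < n; i++)
        y[i] += alpha * x[i];
}

/* Hypothetical thread-count heuristic (an assumption, not the paper's
 * scheme): use one thread per min_chunk elements, clamped to the range
 * [1, max_threads], so that short vectors are not split across cores. */
size_t pick_threads(size_t n, size_t min_chunk, size_t max_threads)
{
    assert(min_chunk > 0 && max_threads > 0);
    size_t t = n / min_chunk;
    if (t < 1)
        t = 1;
    if (t > max_threads)
        t = max_threads;
    return t;
}
```

    Because AXPY performs only two floating-point operations per pair of loaded elements, it is usually limited by memory bandwidth, which is one reason a thread count that grows with the vector length can be preferable to always using every core.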

Get Citation

Li Haoran, Wang Lei. Optimization of BLAS Level 1 Functions on SW1621 Processor. Computer Systems & Applications, 2021, 30(7): 246-252

History
  • Received: November 7, 2020
  • Revised: December 12, 2020
  • Online: July 2, 2021