Optimization of LAPACK Based on Loongson 3A

    Abstract:

    Based on the characteristics of the Loongson 3A architecture, this paper presents three ways to improve the performance of LAPACK: optimizing the underlying BLAS library, selecting the best block size for the blocked algorithms in LAPACK, and optimizing specific LAPACK functions. Experimental results obtained by running the LAPACK Timing Programs show that the performance of 240 LAPACK functions, which account for 81% of all the functions in the LAPACK Timing Programs, increased by more than 30%.
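
The block-size selection mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure or code: it assumes a plain blocked matrix multiply as a stand-in for a LAPACK blocked algorithm, and the names `blocked_matmul` and `pick_block_size` are hypothetical. In LAPACK itself, the block size is supplied to each routine by the `ILAENV` function. Pure-Python timings do not reflect the cache behaviour of a real processor, so the sketch shows only the shape of the search: time each candidate block size and keep the fastest.

```python
import time


def blocked_matmul(a, b, n, nb):
    """Multiply two n x n matrices (lists of lists) using nb x nb blocks."""
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, nb):
        i_end = min(ii + nb, n)
        for kk in range(0, n, nb):
            k_end = min(kk + nb, n)
            for jj in range(0, n, nb):
                j_end = min(jj + nb, n)
                # Update one block of C from one block of A and one block of B.
                for i in range(ii, i_end):
                    row_a, row_c = a[i], c[i]
                    for k in range(kk, k_end):
                        aik, row_b = row_a[k], b[k]
                        for j in range(jj, j_end):
                            row_c[j] += aik * row_b[j]
    return c


def pick_block_size(n, candidates, repeats=3):
    """Time each candidate block size; return (fastest_nb, timings dict)."""
    # Hypothetical test matrices; any dense n x n input would do.
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    timings = {}
    for nb in candidates:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            blocked_matmul(a, b, n, nb)
            best = min(best, time.perf_counter() - start)
        timings[nb] = best  # best-of-repeats reduces timing noise
    return min(timings, key=timings.get), timings


if __name__ == "__main__":
    best_nb, timings = pick_block_size(64, [8, 16, 32, 64])
    for nb, seconds in timings.items():
        print(f"nb={nb:3d}: {seconds:.4f} s")
    print("fastest block size in this run:", best_nb)
```

On real hardware the same search would be run against the compiled routine of interest, with candidate sizes chosen around the cache capacity of the target processor.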

Get Citation

Zhang Bin, Gu Naijie, He Songsong, Liu Binbin. Optimization of LAPACK Based on Loongson 3A. Computer Systems & Applications, 2012, 21(11): 63-67

History
  • Received: March 27, 2012
  • Revised: May 18, 2012