Optimization of BLAS Level 2 Based on Multi-Core Loongson 3A

    Abstract:

    Based on the characteristics of the Loongson 3A architecture and of the BLAS Level 2 routines, this article derives parallel optimization schemes at the instruction level, the storage level, and the thread level. We summarize the optimization methods that suit this platform and analyze them quantitatively. Experiments show that the single-threaded performance of BLAS Level 2 improves by 20% and that the multi-threaded speedup reaches 2.5. These results may help guide the optimization of system software on the multi-core Loongson 3A.
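
    As a rough illustration of two of the three levels named in the abstract, the sketch below applies them to a matrix-vector product (the GEMV operation from BLAS Level 2). It is not the authors' code: the names gemv_rows, partition_rows and gemv_blocked, the row-major layout, the unroll factor of 4 and the contiguous row partition are all assumptions made for this example, and the storage-level techniques are not shown. The driver runs the row blocks one after another; in a threaded version each block would presumably be handed to its own thread.

```c
#include <stddef.h>

/* Hypothetical kernel (not from the article): computes
 * y[i] = sum_j A[i*n + j] * x[j] for rows i in [row_begin, row_end).
 * A is assumed row-major with n columns.
 * Instruction level: the inner loop is unrolled by 4 with independent
 * accumulators, so the multiply-adds do not form one long dependency chain. */
void gemv_rows(const double *A, const double *x, double *y,
               size_t n, size_t row_begin, size_t row_end)
{
    for (size_t i = row_begin; i < row_end; ++i) {
        const double *row = A + i * n;
        double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
        size_t j = 0;
        for (; j + 4 <= n; j += 4) {
            s0 += row[j]     * x[j];
            s1 += row[j + 1] * x[j + 1];
            s2 += row[j + 2] * x[j + 2];
            s3 += row[j + 3] * x[j + 3];
        }
        double s = (s0 + s1) + (s2 + s3);
        for (; j < n; ++j)          /* leftover columns when n % 4 != 0 */
            s += row[j] * x[j];
        y[i] = s;
    }
}

/* Thread level (assumed scheme): split m rows into nthreads contiguous
 * blocks. The first (m % nthreads) blocks receive one extra row. */
void partition_rows(size_t m, size_t nthreads, size_t tid,
                    size_t *begin, size_t *end)
{
    size_t base = m / nthreads;
    size_t extra = m % nthreads;
    *begin = tid * base + (tid < extra ? tid : extra);
    *end = *begin + base + (tid < extra ? 1 : 0);
}

/* Sequential stand-in for a threaded driver: each iteration handles the
 * row block that one worker thread would own. Blocks write disjoint parts
 * of y, so they could run concurrently without synchronization. */
void gemv_blocked(const double *A, const double *x, double *y,
                  size_t m, size_t n, size_t nthreads)
{
    for (size_t tid = 0; tid < nthreads; ++tid) {
        size_t begin, end;
        partition_rows(m, nthreads, tid, &begin, &end);
        gemv_rows(A, x, y, n, begin, end);
    }
}
```

    Partitioning by rows keeps each worker's writes to y disjoint, which is why no locking appears in the sketch. Whether the article uses this partition or this unroll factor is not stated in the abstract.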

Citation:

Li Yi, He Songsong, Li Kai. Optimization of BLAS Level 2 Based on Multi-Core Loongson 3A. Computer Systems & Applications, 2011, 20(1): 163-167

History
  • Received: April 29, 2010
  • Revised: May 27, 2010