Software Pipelining Framework for BW104x
    Abstract:

    Digital signal processors (DSPs) are widely used in signal processing and digital communication. Most modern high-performance DSPs adopt a very long instruction word (VLIW) architecture, exploiting instruction-level parallelism to issue multiple instructions in the same clock cycle and so achieve higher computational performance. This article describes the characteristics of the target system, BWDSP104x, a processor designed for high-performance computing that uses a 16-issue, single-instruction-stream, multiple-data-stream (SIMD) architecture. To make full use of its multi-cluster hardware resources, this paper proposes a back-end software pipelining optimization built on the open-source compiler Open64. The framework covers early-stage loop selection and the computation of resource constraints and precedence constraints; the classic modulo scheduling algorithm performs the software pipelining (SWP) scheduling, and modulo variable expansion resolves register conflicts between different iterations. The experimental results show that programs perform better after software pipelining optimization.
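The constraint computations named in the abstract can be illustrated with the textbook lower bounds used in modulo scheduling. The sketch below is not the paper's Open64 implementation; it only shows the standard definitions, and the resource class names, unit counts, dependence cycles, and lifetimes are hypothetical values chosen for illustration, not BWDSP104x figures.

```python
from math import ceil


def res_mii(op_counts, unit_counts):
    """Resource-constrained bound: for each resource class,
    ceil(operations per iteration / available units); take the maximum."""
    return max(ceil(op_counts[r] / unit_counts[r]) for r in op_counts)


def rec_mii(cycles):
    """Precedence (recurrence) bound: for each dependence cycle given as
    (total latency, total iteration distance), ceil(latency / distance)."""
    return max((ceil(lat / dist) for lat, dist in cycles), default=1)


def min_ii(op_counts, unit_counts, cycles):
    """Minimum initiation interval: the scheduler starts its search here."""
    return max(res_mii(op_counts, unit_counts), rec_mii(cycles))


def mve_unroll_factor(lifetimes, ii):
    """Modulo variable expansion: a value that lives longer than II cycles
    overlaps its next-iteration copy, so the kernel is unrolled
    max(ceil(lifetime / II)) times and each copy gets its own register."""
    return max(ceil(life / ii) for life in lifetimes)


if __name__ == "__main__":
    # Hypothetical loop body and machine description (assumptions).
    ops = {"alu": 6, "mul": 3, "mem": 4}
    units = {"alu": 4, "mul": 2, "mem": 2}
    dep_cycles = [(3, 1), (4, 2)]  # (latency, distance) per recurrence

    ii = min_ii(ops, units, dep_cycles)
    print("MII:", ii)
    print("MVE unroll factor:", mve_unroll_factor([2, 5, 7], ii))
```

A modulo scheduler typically tries to place every operation at this initiation interval and increases it by one whenever scheduling fails, so a tight lower bound shortens the search.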

Citation

Hong Litao, Zheng Qilong. Software pipelining framework for BW104x. Computer Systems & Applications, 2016, 25(10): 114-119.

History
  • Received: January 17, 2016
  • Revised: March 22, 2016
  • Online: October 22, 2016