Received: February 18, 2012; Revised: April 3, 2012
Abstract: VLIW DSPs obtain temporal parallelism through software pipelining and spatial parallelism through instruction clustering. Instruction clustering is essentially a resource allocation problem. Traditional clustering algorithms assume that each instruction is assigned to a single cluster for execution, so they are not fully applicable to architectures that provide SIMD instructions. This paper proposes a clustering algorithm based on an evaluation model that assigns both SIMD instructions and ordinary instructions to clusters appropriately. After clustering, the inter-cluster transfer instructions are scheduled and, where suitable, combined into inter-cluster double-word transfer instructions. Owing to the use of SIMD instructions and inter-cluster double-word transfers, together with better clustering decisions, the overall schedule latency of the program is reduced. For many digital signal processing programs, we obtain a 2- to 3-fold performance improvement compared with no clustering, and a 7% to 10% improvement compared with a clustering algorithm based on register allocation.
Keywords: SIMD; instruction clustering; inter-cluster double-word transfer; instruction scheduling; delay; DFG
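The abstract does not spell out the evaluation model, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a two-cluster machine, a data-flow graph (DFG) given in topological order, a hypothetical cost that adds the number of required inter-cluster transfers to a load-balance penalty, and a hypothetical `simd` flag meaning the instruction executes on every cluster at once. All names (`Instr`, `assign_clusters`, `pair_transfers`, the weights) are illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Instr:
    name: str
    preds: list = field(default_factory=list)  # names of producer instructions
    simd: bool = False  # assumption: a SIMD op occupies all clusters


def placement_cost(ins, cluster, placement, load, transfer_cost, balance_weight):
    """Hypothetical evaluation: transfers needed plus a load-balance penalty."""
    transfers = sum(1 for p in ins.preds if cluster not in placement[p])
    return transfer_cost * transfers + balance_weight * load[cluster]


def assign_clusters(instrs, n_clusters=2, transfer_cost=1.0, balance_weight=0.5):
    """Greedy assignment; instrs must already be in topological order."""
    placement = {}          # instruction name -> set of cluster ids
    load = [0] * n_clusters
    for ins in instrs:
        if ins.simd:
            chosen = set(range(n_clusters))
        else:
            best = min(
                range(n_clusters),
                key=lambda c: placement_cost(
                    ins, c, placement, load, transfer_cost, balance_weight
                ),
            )
            chosen = {best}
        placement[ins.name] = chosen
        for c in chosen:
            load[c] += 1
    return placement


def inter_cluster_transfers(instrs, placement):
    """List (value, src_cluster, dst_cluster) moves implied by the placement."""
    moves, seen = [], set()
    for ins in instrs:
        for p in ins.preds:
            for dst in sorted(placement[ins.name] - placement[p]):
                if (p, dst) not in seen:
                    seen.add((p, dst))
                    moves.append((p, min(placement[p]), dst))
    return moves


def pair_transfers(moves):
    """Merge single-word moves with the same direction into double-word moves.

    Assumption: any two moves with the same (src, dst) may be paired; real
    hardware would likely add register-pair and scheduling constraints.
    """
    by_dir = {}
    for value, src, dst in moves:
        by_dir.setdefault((src, dst), []).append(value)
    paired = []
    for (src, dst), values in by_dir.items():
        for i in range(0, len(values), 2):
            paired.append((src, dst, tuple(values[i:i + 2])))
    return paired


# Small DFG: two independent chains feeding one SIMD instruction.
dfg = [
    Instr("a"), Instr("b"),
    Instr("c", ["a"]), Instr("d", ["b"]),
    Instr("v", ["c", "d"], simd=True),
]
placement = assign_clusters(dfg)
moves = inter_cluster_transfers(dfg, placement)
print(placement)
print(pair_transfers(moves))
```

In this sketch the greedy pass keeps each dependence chain on one cluster while balancing load, and the pairing pass halves the number of transfer instructions whenever two values travel in the same direction, which is the effect the abstract attributes to double-word transfers.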
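Funding: National Science and Technology Major Project on Core Electronic Devices, High-end Generic Chips and Basic Software (核高基) (2009ZX01034-001-001-002)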
Citation:
CHEN Si-Ling, ZHENG Qi-Long, FENG Yu-Qian, FU He-Ping. VLIW DSP Clustering Algorithm for Architecture Supporting SIMD and Inter-Cluster Double Word Transfer. Computer Systems Applications (计算机系统应用), 2012, 21(10): 100-104