Abstract: VLIW DSPs obtain temporal parallelism through software pipelining and spatial parallelism through instruction clustering. The essence of clustering is resource allocation. Traditional clustering assumes that each instruction is assigned to exactly one cluster, but this assumption does not hold for architectures that offer SIMD instructions. This article proposes an algorithm, based on an evaluation model, that handles clustering for both ordinary and SIMD instructions. By scheduling inter-cluster transfer instructions, we synthesize inter-cluster double-word transfer instructions. With the help of SIMD instructions, inter-cluster double-word transfer instructions, and good clustering decisions, we shorten the schedule latency. For many DSP programs, we obtain a 2 to 3 times increase in performance compared with no clustering, and a 7% to 10% increase in performance compared with a clustering algorithm based on register allocation.
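To make the idea of clustering as resource allocation concrete, the following is a minimal sketch of a greedy cluster assignment driven by a simple cost function. It is not the evaluation model proposed in this article. The names `Instr`, `assign_clusters`, `n_clusters`, and `transfer_cost` are hypothetical, the cost (cluster load plus a penalty per inter-cluster operand transfer) is an assumed stand-in, and treating a SIMD instruction as occupying every cluster is likewise an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Instr:
    """Hypothetical instruction record used only for this sketch."""
    name: str
    deps: list = field(default_factory=list)  # names of producer instructions
    simd: bool = False  # assumption: a SIMD instruction spans all clusters


def assign_clusters(instrs, n_clusters=2, transfer_cost=1):
    """Greedily place each instruction on the cheapest cluster.

    `instrs` must be in topological order (producers before consumers).
    Returns a dict mapping instruction name to a tuple of cluster indices.
    """
    load = [0] * n_clusters  # number of instructions placed on each cluster
    placement = {}
    for ins in instrs:
        if ins.simd:
            # Assumed behaviour: the SIMD instruction uses every cluster.
            placement[ins.name] = tuple(range(n_clusters))
            for c in range(n_clusters):
                load[c] += 1
            continue

        def cost(c, ins=ins):
            # One transfer per operand whose producer is not on cluster c.
            transfers = sum(1 for d in ins.deps if c not in placement[d])
            return load[c] + transfer_cost * transfers

        best = min(range(n_clusters), key=cost)  # ties go to the lowest index
        placement[ins.name] = (best,)
        load[best] += 1
    return placement
```

In this sketch an ordinary instruction is drawn toward the cluster that already holds its operands unless that cluster is heavily loaded, while a consumer of a SIMD result needs no transfer on any cluster.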