To meet the high throughput requirements of applications exhibiting high ILP, VLIW ASIPs may increasingly include large numbers of functional units (FUs). Unfortunately, `switching' data through register files shared by large numbers of FUs quickly becomes a dominant cost/performance factor, suggesting that clustering smaller numbers of FUs around local register files may be beneficial even if data transfers are required among clusters. With such machines in mind, we propose a compiler transformation, predicated switching, which enables aggressive speculation while leveraging the penalties associated with inter-cluster communication to achieve gains in performance. Based on representative benchmarks, we demonstrate that, compared to state-of-the-art approaches, this novel technique is particularly suitable for application-specific clustered machines aimed at supporting high ILP.
Margarida F. Jacome, Gustavo de Veciana, Satish Pi