This paper investigates the performance and power dissipation of Globally Asynchronous Locally Synchronous (GALS) multiprocessor systems. We show that communication loops are a significant source of throughput degradation in communication links, that one-way links suffer no degradation under certain conditions, and that it is possible to design GALS multiprocessors without this performance penalty. Independent clock domains and unbalanced computation in the GALS multiprocessor allow the clock frequency and supply voltage to be scaled to achieve high energy efficiency. The synchronization overhead between independent clock domains results in a performance reduction of less than 1% compared to a globally synchronous system over a number of DSP and numerical applications. Clock and voltage scaling can achieve power savings of approximately 40% with no reduction in performance. These results compare favorably with the 25% power savings and more than 10% performance red...
Zhiyi Yu, Bevan M. Baas