This paper investigates the performance and power dissipation of Globally Asynchronous Locally Synchronous (GALS) multi-processor systems. We show that communication loops are a s...
The distribution of resources among processors, memory and caches is a crucial question faced by designers of large-scale parallel machines. If a machine is to solve problems with...
Abstract. This paper gives an overview of locality enhancement techniques used by the Jasmine compiler, currently under development at the University of Toronto. These techniques e...
Tarek S. Abdelrahman, Naraig Manjikian, Gary Liu, ...
Two broad classes of memory models are available today: models with hardware cache coherence, used in conventional chip multiprocessors, and models that rely upon software to mana...
John H. Kelm, Daniel R. Johnson, William Tuohy, St...
In programming high performance applications, shared address-space platforms are preferable for fine-grained computation, while distributed address-space platforms are more suita...