We propose, and justify, an economic theory to guide memory system design, operation, and analysis. Our theory treats memory random-access latency, and its cost per installed mega...
This paper describes the synchronization and communication primitives of the Cray T3E multiprocessor, a shared memory system scalable to 2048 processors. We discuss what we have l...
Abstract—Execution of applications on upcoming highperformance computing (HPC) systems introduces a variety of new challenges and amplifies many existing ones. These systems will...
Avneesh Pant, Hassan Jafri, Volodymyr V. Kindraten...
In this paper we describe a compiler framework which can identify communication patterns for MPIbased parallel applications. This has the potential of providing significant perfo...
Massive data transfer encountered in emerging multimedia embedded applications requires architecture allowing both highly distributed memory structure and multiprocessor computati...
Sang-Il Han, Amer Baghdadi, Marius Bonaciu, Soo-Ik...