As the number of cores and threads in manycore compute accelerators such as Graphics Processing Units (GPU) increases, so does the importance of on-chip interconnection network des...
Abstract--Many shared-memory parallel systems use lockbased synchronization mechanisms to provide mutual exclusion or reader-writer access to memory locations. Software locks are i...
Emerging non-volatile memory technologies such as phase change memory (PCM) promise to increase storage system performance by a wide margin relative to both conventional disks and ...
Adrian M. Caulfield, Arup De, Joel Coburn, Todor I...
We propose an architectural design methodology for designing formally verifiable cache coherence protocols, called Fractal Coherence. Properly designed to be fractal in behavior, t...
Bias temperature instability, hot-carrier injection, and gate-oxide wearout will cause severe lifetime degradation in the performance and the reliability of future CMOS devices. Th...
Erika Gunadi, Abhishek A. Sinkar, Nam Sung Kim, Mi...
Soft error reliability is increasingly becoming a first-order design concern for microprocessors, as a result of higher transistor counts, shrinking device geometries and lowering ...
Efficient management of last level caches (LLCs) plays an important role in bridging the performance gap between processor cores and main memory. This paper is motivated by two key...
This paper presents ReMAP, a reconfigurable architecture geared towards accelerating and parallelizing applications within a heterogeneous CMP. In ReMAP, threads share a common rec...
To extend the exponential performance scaling of future chip multiprocessors, improving energy efficiency has become a first-class priority. Single-chip heterogeneous computing ha...
Eric S. Chung, Peter A. Milder, James C. Hoe, Ken ...
Abstract--Parallel programming is hard, because it is impractical to test all possible thread interleavings. One promising approach to improve a multi-threaded program's relia...