This paper explores an application-specific customization technique for the data cache, one of the foremost area/power consuming and performance determining microarchitectural fea...
We propose an organization for the on-chip memory system of a chip multiprocessor, in which 16 processors share a 16MB pool of 256 L2 cache banks. The L2 cache is organized as a n...
Jaehyuk Huh, Changkyu Kim, Hazim Shafi, Lixin Zhan...
Future embedded systems are expected to use chip-multiprocessors to provide the execution power for increasingly demanding applications. Multiprocessors increase the pressure on th...
Martin Schoeberl, Wolfgang Puffitsch, Benedikt Hub...
Cache memories were invented to decouple fast processors from slow memories. However, this decoupling is only partial, and many researchers have attempted to improve cache use by p...
High-performance processors employ aggressive speculation and prefetching techniques to increase performance. Speculative memory references caused by these techniques sometimes br...
Onur Mutlu, Hyesoon Kim, David N. Armstrong, Yale ...