Abstract. For many applications, cache misses are the primary performance bottleneck. Even though much research has been performed on automatically optimizing cache behavior at the...
Next generation tiled microarchitectures are going to be limited by off-chip misses and by on-chip network usage. Furthermore, these platforms will run an heterogeneous mix of ap...
As dual-core and quad-core processors arrive in the marketplace, the momentum behind CMP architectures continues to grow strong. As more and more cores/threads are placed on-die, ...
Li Zhao, Ravi R. Iyer, Ramesh Illikkal, Donald New...
We describe cache architecture, intended for prototype-oriented IC platforms, that automatically finds the best cache configuration for a particular application. The cache itself ...
This paper presents a two-part study on managing distributed NUCA (Non-Uniform Cache Architecture) L2 caches in a future manycore processor to obtain high singlethread program per...