Future CMPs will combine many simple cores with deep cache hierarchies. With more cores, cache resources per core are fewer, and must be shared carefully to avoid poor utilization...
Junli Gu, Steven S. Lumetta, Rakesh Kumar, Yihe Su...
Recent research shows that the high occupancy of Coherence Controllers (CCs) is a major performance bottleneck in scalable shared-memory multiprocessors. In this paper, we propose...
Virtual caches are employed as L1 caches of both high performance and embedded processors to meet their short latency requirements. However, they also introduce the synonym proble...
Recent progress in worst case timing analysis of programs has made it possible to perform accurate timing analysis of pipelined execution and instruction caching, which is necessa...
Programming specialized network processors (NPU) is inherently difficult. Unlike mainstream processors where architectural features such as out-of-order execution and caches hide ...