Main memory accesses continue to be a significant bottleneck for applications whose working sets do not fit in second-level caches. With the trend of greater associativity in seco...
Abstract—Sharing patterns in shared-memory multiprocessors are the key to performance: uniprocessor latencytolerating techniques such as out-of-order execution and non-blocking c...
Efficiency of synchronization mechanisms can limit the parallel performance of many shared-memory applications. In addition, the ever increasing performance gap between processor...
Memorylatency isbecominganincreasingly importantperformance bottleneck, especially in multiprocessors. One technique for tolerating memory latency is multithreading, whereby we sw...
Clustered microarchitectures are an effective approach to reducing the penalties caused by wire delays inside a chip. Current superscalar processors have in fact a two-cluster mic...
Ramon Canal, Joan-Manuel Parcerisa, Antonio Gonz&a...