Shrinking process technologies and growing chip sizes have profound effects on process variation. This leads to Chip Multiprocessors (CMPs) where not all cores operate at maximum f...
Major Bhadauria, Vincent M. Weaver, Sally A. McKee
Abstract. On multi-core architectures with software-managed memories, effectively orchestrating data movement is essential to performance, but is tedious and error-prone. In this p...
Lee W. Howes, Anton Lokhmotov, Alastair F. Donalds...
Abstract. Exposing more instruction-level parallelism in out-of-order superscalar processors requires increasing the number of dynamic in-flight instructions. However, large instru...
Abstract. This paper advocates the placement of Architecturally Visible Communication (AVC) buffers between adjacent cores in MPSoCs to provide highthroughput communication for str...
Theo Kluter, Philip Brisk, Edoardo Charbon, Paolo ...
This paper proposes and studies a hardware-based adaptive controlled migration strategy for managing distributed L2 caches in chip multiprocessors. Building on an area-efficient sh...
Abstract. In transactional memory, aborted transactions reduce performance, and waste computing resources. Ideally, concurrent execution of transactions should be optimally ordered...
Abstract. Technological advances and increasingly complex and dynamic application behavior argue for revisiting mechanisms that adapt logical cache block size to application charac...
Matthew A. Watkins, Sally A. McKee, Lambert Schael...
In previous work the 3D-Wave parallelization strategy was proposed to increase the parallel scalability of H.264 video decoding. This strategy is based on the observation that inte...
Arnaldo Azevedo, Cor Meenderinck, Ben H. H. Juurli...
Abstract. Threads experiencing long-latency loads on a simultaneous multithreading (SMT) processor may clog shared processor resources without making forward progress, thereby star...
Kenzo Van Craeynest, Stijn Eyerman, Lieven Eeckhou...
Abstract. Iterative compilation is an efficient approach to optimize programs on rapidly evolving hardware, but it is still only scarcely used in practice due to a necessity to gat...