Tiling exploits temporal reuse carried by an outer loop of a loop nest to enhance cache locality. Loop skewing is typically required to make tiling legal. This restricts parallelis...
A multiprocessor virtual machine benefits its guest operating system in supporting scalable job throughput and request latency--useful properties in server consolidation where ser...
Massive data transfer encountered in emerging multimedia embedded applications requires architecture allowing both highly distributed memory structure and multiprocessor computati...
Sang-Il Han, Amer Baghdadi, Marius Bonaciu, Soo-Ik...
This paper presents a new methodology for implementing fast synchronization on scalable cache-coherent multiprocessors, through the use of hybrid primitives. Hybrid primitives lev...
Dimitrios S. Nikolopoulos, Theodore S. Papatheodor...
Abstract—This paper describes an algorithm for deriving data and computation partitions on scalable shared memory multiprocessors. The algorithm establishes affinity relationshi...