The development of compiler-based mechanisms to reduce the percentage of hotspots and optimize the thermal profile of large register files has become an important issue. Thermal...
SIMD (Single Instruction, Multiple Data) engines are an essential part of the processors in various computing markets, from servers to the embedded domain. Although SIMD-enabled a...
Amir Hormati, Yoonseo Choi, Mark Woh, Manjunath Ku...
The ability to automatically parallelize standard programming languages results in program portability across a wide range of machine architectures. It is the goal of the Polaris ...
William Blume, Rudolf Eigenmann, Keith Faigin, Joh...
The most popular architecture for parallel search is work stealing: threads that have run out of work (nodes to be searched) steal from threads that still have work. Work stealing ...
Complex shaders must be partitioned into multiple passes to execute on GPUs with limited hardware resources. Automatic partitioning gives rise to an NP-hard scheduling problem tha...