Multicore designs have emerged as the mainstream design paradigm for the microprocessor industry. Unfortunately, providing multiple cores does not directly translate into performa...
Mojtaba Mehrara, Jeff Hao, Po-Chun Hsu, Scott A. M...
The increasing gap in processor and memory speeds has forced microprocessors to rely on deep cache hierarchies to keep the processors from starving for data. For many applications...
We present a unified approach to locality optimization that employs both data and control transformations. Data transformations include changing the array layout in memory. Contr...
Future parallel computers must efficiently execute not only hand-coded applications but also programs written in high-level, parallel programming languages. Today's machines ...
Steven K. Reinhardt, James R. Larus, David A. Wood
Address re-mapping techniques in so-called active memory systems have been shown to dramatically increase the performance of applications with poor cache and/or communication beha...
Dhiraj D. Kalamkar, Mainak Chaudhuri, Mark Heinric...