Compiler optimizations are often driven by specific assumptions about the underlying architecture and implementation of the target machine. For example, when targeting shared-mem...
Jack L. Lo, Susan J. Eggers, Henry M. Levy, Sujay ...
Heterogeneous multicores, such as Cell BE processors and GPGPUs, typically do not have caches for their accelerator cores because coherence traffic, cache misses, and latencies fr...
3D-integration is a promising technology to help combat the “Memory Wall” in future multi-core processors. Past work has considered using 3D-stacked DRAM as a large last-level...
Chip multi-processors (CMP) are rapidly emerging as an important design paradigm for both high performance and embedded processors. These machines provide an important performance...
Alex Settle, Dan Connors, Enric Gibert, Antonio Go...
Cache memories have been extensively used to bridge the gap between high speed processors and relatively slower main memories. However, they are sources of predictability problems...