With processor speeds continuing to outpace the memory subsystem, cache missing memory operations continue to become increasingly important to application performance. In response...
Sorin Iacobovici, Lawrence Spracklen, Sudarshan Ka...
The Cray X1 was recently introduced as the first in a new line of parallel systems to combine high-bandwidth vector processing with an MPP system architecture. Alongside capabili...
Christian Bell, Wei-Yu Chen, Dan Bonachea, Katheri...
Storage mapping optimization is a flexible approach to folding array dimensions in numerical codes. It is designed to reduce the memory footprint after a wide spectrum of loop tr...
The growing dominance of wire delays at future technology points renders a microprocessor communication-bound. Clustered microarchitectures allow most dependence chains to execute...
Given the scale of massively parallel systems, occurrence of faults is no longer an exception but a regular event. Periodic checkpointing is becoming increasingly important in the...