Memory system bottlenecks limit performance for many applications, and computations with strided access patterns are among the hardest hit. The streams used in such applications h...
We present an approach for synthesizing transformations to enhance locality in imperfectly-nested loops. The key idea is to embed the iteration space of every statement in a loop ...
The rapid advancements of networking technology have boosted potential bandwidth to the point that the cabling is no longer the bottleneck. Rather, the bottlenecks lie at the cros...
Patrick Crowley, Marc E. Fiuczynski, Jean-Loup Bae...
The Midimew network is an excellent contender for implementing the communication subsystem of a high performance computer. This network is an optimal 2D topology in the sense ther...
This paper performs a comprehensive investigation of dynamic selection for long atomic traces. It introduces a classification of trace selection methods and discusses existing and...