We describe two novel constructs for programming parallel machines with multi-level memory hierarchies: call-up, which allows a child task to invoke computation on its parent, and...
Michael Bauer, John Clark, Eric Schkufza, Alex Aik...
Recent studies of irregular applications such as finite-element mesh generators and data-clustering codes have shown that these applications have a generalized data parallelism ar...
Traditional parallel compilers do not effectively parallelize irregular applications because they contain little looplevel parallelism due to ambiguous memory references. We explo...
Because irregular applications have unpredictable memory access patterns, their performance is dominated by memory behavior. The Impulse con gurable memory controller will enable s...
John B. Carter, Wilson C. Hsieh, Mark R. Swanson, ...
The performance of irregular applications on modern computer systems is hurt by the wide gap between CPU and memory speeds because these applications typically underutilize multi-...
John M. Mellor-Crummey, David B. Whalley, Ken Kenn...
In prior work, we have proposed techniques to extend the ease of shared-memory parallel programming to distributed-memory platforms by automatic translation of OpenMP programs to ...
Until recently, parallel programming has largely focused on the exploitation of data-parallelism in dense matrix programs. However, many important application domains, including m...
Milind Kulkarni, Martin Burtscher, Calin Cascaval,...