Abstract. As heterogeneous computing platforms become more prevalent, the programmer must account for complex memory hierarchies in addition to the difficulties of parallel program...
Kyle Spafford, Jeremy S. Meredith, Jeffrey S. Vett...
The advent of Cloud computing platforms, and the growing pervasiveness of Multicore processor architectures have revealed the inadequateness of traditional programming models base...
In this work we investigate how the compiler technique of message strip mining performs in practice on contemporary high performance networks. Message strip mining attempts to redu...
Given the large communication overheads characteristic of modern parallel machines, optimizations that eliminate, hide or parallelize communication may improve the performance of ...
Abstract— We developed an automated environment to measure the memory access behavior of applications on high performance clusters. Code optimization for processor caches is cruc...