We present Dynamic Out-of-Order Java (DOJ), a dynamic parallelization approach. In DOJ, a developer annotates code blocks as tasks to decouple these blocks from the parent executi...
Yong Hun Eom, Stephen Yang, James Christopher Jeni...
Event traces are helpful in understanding the performance behavior of message-passing applications since they allow in-depth analyses of communication and synchronization patterns...
Daniel Becker, John C. Linford, Rolf Rabenseifner,...
Many recent research efforts have been devoted to the use of commodity hardware for solving computationallyintensive scientific problems. Among such problems, hyperspectral imagi...
Javier Setoain, Christian Tenllado, Manuel Prieto,...
Matrix multiplication is an important kernel in linear algebra algorithms, and the performance of both serial and parallel implementations is highly dependent on the memory system...
Siddhartha Chatterjee, Alvin R. Lebeck, Praveen K....
Modern memory systems rely on spatial locality to provide high bandwidth while minimizing memory device power and cost. The trend of increasing the number of cores that share memo...
Min Kyu Jeong, Doe Hyun Yoon, Dam Sunwoo, Mike Sul...