HPC programmers utilize tracefiles, which record program behavior in great detail, as the basis for many performance analysis activities. The lack of generally accessible tracefil...
Ken Ferschweiler, Scott Harrah, Dylan Keon, Mariac...
We present an auto-tuning approach to optimize application performance on emerging multicore architectures. The methodology extends the idea of search-based performance optimizatio...
Samuel Williams, Jonathan Carter, Leonid Oliker, J...
Abstract. One of the most important collective communication patterns for scientific applications is many-to-many, also called complete exchange. Although efficient All-to-All...
Abstract. Multicore microprocessors have been largely motivated by the diminishing returns in performance and the increased power consumption of single-threaded ILP microprocessor...
Matthew Curtis-Maury, Karan Singh, Sally A. McKee,...
Abstract. Emerging 64-bit OSs supply a huge memory address space that is essential for new applications using very large data. It is expected that the memory in connect...