Tuning parallel code can be a time-consuming and difficult task. We present our approach to automate the performance analysis of OpenMP applications that is based on the notion of ...
—Large-scale GPU clusters are gaining popularity in the scientific computing community. However, their deployment and production use are associated with a number of new challenge...
Volodymyr V. Kindratenko, Jeremy Enos, Guochun Shi...
Data access in HPC infrastructures is realized via user-level networking and OS-bypass techniques through which nodes can communicate with high bandwidth and low-latency. Virtualiz...
We present an auto-tuning approach to optimize application performance on emerging multicore architectures. The methodology extends the idea of searchbased performance optimizatio...
Samuel Williams, Jonathan Carter, Leonid Oliker, J...
In this paper the Northrhine-Westphalian metacomputing initiative is described. We start by discussing various general aspects of metacomputing and explain the reasons for founding...