As supercomputers are being built from an ever increasing number of processing elements, the effort required to achieve a substantial fraction of the system peak performance is con...
Abstract. Contemporary high-end Terascale and Petascale systems are composed of hundreds of thousands of commodity multi-core processors interconnected with high-speed custom netwo...
Heike Jagode, Jack Dongarra, Sadaf R. Alam, Jeffre...
—One-sided communication is important to enable asynchronous communication and data movement for Global Address Space (GAS) programming models. Such communication is typically re...
Xinyu Que, Weikuan Yu, Vinod Tipparaju, Jeffrey S....
With the ever-increasing numbers of cores per node on HPC systems, applications are increasingly using threads to exploit the shared memory within a node, combined with MPI across ...
This work presents a general methodology for estimating the performance of an HPC workload when running on a future hardware architecture. Further, it demonstrates the methodology...
Ilya Sharapov, Robert Kroeger, Guy Delamarter, Raz...