The massive data volumes acquired, simulated, processed and analyzed by globally distributed scientific collaborations continue to grow exponentially. One leading example is the L...
Harvey B. Newman, Azher Mughal, Dorian Kcira, Iosi...
Considering the unique characteristics of storage class memory (SCM), such as non-volatility, fast access speed, byte-addressability, low energy consumption, and in-place modifica...
Lingfang Zeng, Binbing Hou, Dan Feng, Kenneth B. K...
Task-based scheduling has emerged as one solution to the complexity of parallel computing. When using these tools, developers must frame their computation as a series of tasks wit...
Blake Haugen, Stephen Richmond, Jakub Kurzak, Chad...
In recent years, systems researchers have devoted considerable effort to the study of large-scale graph processing. Existing distributed graph processing systems such as Pregel, ...
This paper discusses a success story of introducing High Performance Computing (HPC) concepts into an engineering curriculum over the last 6 academic years at various...
Over the last decade, GPUs have been widely adopted for general-purpose applications. To capture on-chip locality for these applications, modern GPUs have integrated multil...
Accurate analysis of HPC storage system designs is contingent on the use of I/O workloads that are truly representative of expected use. However, I/O analyses are generally bound ...
Shane Snyder, Philip H. Carns, Robert Latham, Misb...
The ability to record and replay program execution helps significantly in debugging non-deterministic MPI applications by reproducing message-receive orders. However, the large a...