Today’s rapid development of supercomputers has made I/O a major performance bottleneck for many scientific applications. Trace analysis tools have thus...
Xiaoqing Luo, Frank Mueller, Philip H. Carns, John...
Migrating resources is a useful tool for balancing load in a distributed system, but it is difficult to determine when to move resources, where to move resources, and how much of ...
Michael A. Sevilla, Noah Watkins, Carlos Maltzahn,...
Solving the AllPairs similarity search problem entails finding all pairs of vectors in a high dimensional sparse dataset that have a similarity value higher than a given threshol...
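The problem statement above can be illustrated with a brute-force baseline. This is only a minimal sketch, not the method of the paper: it assumes cosine similarity as the similarity measure (the abstract does not say which one is used), represents each sparse vector as a hypothetical `dict` mapping dimension index to weight, and uses illustrative function names `cosine` and `all_pairs`.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as {dimension: weight}."""
    # Iterate over the shorter vector so the dot product touches fewer entries.
    if len(u) > len(v):
        u, v = v, u
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def all_pairs(vectors, threshold):
    """Return (i, j, similarity) for every pair with similarity above threshold.

    Brute force: compares all n*(n-1)/2 pairs, so it is only a reference
    baseline for small inputs.
    """
    result = []
    for (i, u), (j, v) in combinations(enumerate(vectors), 2):
        s = cosine(u, v)
        if s > threshold:  # strictly higher than the threshold, as in the text
            result.append((i, j, s))
    return result


if __name__ == "__main__":
    docs = [{0: 1.0, 1: 1.0}, {0: 1.0, 1: 1.0}, {2: 1.0}]
    for i, j, s in all_pairs(docs, 0.9):
        print(i, j, round(s, 3))
```

The quadratic pair enumeration is what makes the problem expensive on large datasets; the sketch is useful mainly as a correctness reference for small inputs.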
Building MapReduce applications using the Message-Passing Interface (MPI) enables us to exploit the performance of large HPC clusters for big data analytics. However, due to the l...
Over the past few years, the increasing amounts of data produced by large-scale simulations have motivated a shift from traditional offline data analysis to in situ analysis and v...
Matthieu Dorier, Matthieu Dreher, Tom Peterka, Jus...
The programming language Python is widely used to rapidly create compact software. However, compared to low-level programming languages like C or Fortran, low performance is preven...
The Oak Ridge Leadership Computing Facility (OLCF) is a leader in large-scale parallel file system development, design, deployment and continuous operation. For the last decade, ...
Raghul Gunasekaran, Sarp Oral, Jason Hill, Ross Mi...
The last decade has seen power consumption move from an afterthought to the foremost design constraint of new supercomputers. Measuring the power of a supercomputer can be a daunt...
Relative debugging traces software errors by comparing two executions of a program concurrently: one a reference version and the other faulty. Relative debugging is pa...