We present a scalable temporal order analysis technique that supports debugging of large-scale applications by classifying MPI tasks based on their logical program execution order...
Dong H. Ahn, Bronis R. de Supinski, Ignacio Laguna...
Massively parallel scientific applications, running on extreme-scale supercomputers, produce hundreds of terabytes of data per run, driving the need for storage solutions to im...
Ramya Prabhakar, Sudharshan S. Vazhkudai, Youngjae...
Large-scale compute clusters continue to grow to ever-increasing proportions. However, as clusters and applications continue to grow, the Mean Time Between Failures (MTBF) has redu...
While the IP unicast service has proven successful, extending end-to-end adaptation to multicast has been a difficult problem. Unlike the unicast case, multicast protocols must su...
In most large-scale peer-to-peer (P2P) applications, it is necessary to collect vital statistics data — sometimes referred to as logs — from up to millions of peers. Tradition...