—In this paper, we analyze restrictions of traditional models affecting the accuracy of analytical prediction of the execution time of collective communication operations. In par...
Alexey L. Lastovetsky, Vladimir Rychkov, Maureen O...
This paper discusses a program synthesis system to facilitate the generation of high-performance parallel programs for a class of computations encountered in quantum chemistry and...
Gerald Baumgartner, David E. Bernholdt, Daniel Coc...
Large-scale distributed computing systems such as grids are serving a growing number of scientists. These environments bring about not only the advantages of an economy of scale, ...
Omer Ozan Sonmez, Nezih Yigitbasi, Alexandru Iosup...
Transient faults are emerging as a critical concern in the reliability of general-purpose microprocessors. As architectural trends point towards multi-threaded multi-core designs,...
Alex Shye, Tipp Moseley, Vijay Janapa Reddi, Josep...
Three protocols for gossip-based failure detection services in large-scale heterogeneous clusters are analyzed and compared. The basic gossip protocol provides a means by which fai...