Large scale compute clusters continue to grow to ever-increasing proportions. However, as clusters and applications continue to grow, the Mean Time Between Failures (MTBF) has redu...
Parallel computing on volatile distributed resources requires schedulers that consider job and resource characteristics. We study unconventional computing environments containing ...
Brent Rood, Nathan Gnanasambandam, Michael J. Lewi...
Device and interconnect fabrics at the nanoscale will have a density of defects and susceptibility to transient faults far exceeding those of current silicon technologies. In this...
Andrey V. Zykov, Elias Mizan, Margarida F. Jacome,...
Current petascale systems have tens of thousands of hardware components and complex system software stacks, which increase the probability of faults occurring during the lifetime ...
: Weprove tight bounds on the time needed to solve k-set agreement, a natural generalization of consensus. We analyze this problem in a synchronous, message-passing model where pro...
Soma Chaudhuri, Maurice Herlihy, Nancy A. Lynch, M...