Abstract. This work presents new algorithms for the "Do-All" problem, which consists of performing t tasks reliably in a message-passing synchronous system of p fault-prone processors. The algorithms are based on an aggressive coordination paradigm in which multiple coordinators may be active as the result of failures. The first algorithm tolerates f < p stop-failures and does not allow restarts. It has available processor steps complexity S = O((t + p log p / log log p) · log f) and message complexity M = O(t + p log p / log log p + f · p). Unlike prior solutions, our algorithm uses redundant broadcasts when encountering failures and, for large f, it has better S complexity. This algorithm is used as the basis for another algorithm, which tolerates any pattern of stop-failures and restarts. This new algorithm is the first solution for the Do-All problem that efficiently deals with processor restarts. Its available processor steps complexity is S = O((t + p log p + f). rai...
Bogdan S. Chlebus, Roberto De Prisco, Alexander A.