Traditional problems in distributed systems include the Reliable Broadcast, Distributed Consensus, and Distributed Firing Squad problems. These problems require coordination only among the processors that do not fail. In systems with benign processor failures, however, it is reasonable to require that a faulty processor's actions be consistent with those of the nonfaulty processors, provided that it performs any action at all. We consider problems that require consistent, simultaneous coordination and analyze them in terms of common knowledge. (Others have performed similar analyses of traditional coordination problems [1,9].) In several failure models, we use our analysis to give round-optimal solutions. In one benign failure model, however, we show that such problems cannot be solved, even in failure-free executions.
Gil Neiger, Mark R. Tuttle