Multi-party communication complexity involves distributed computation of a function over inputs held by multiple distributed players. A key focus of distributed computing research, since the very beginning, has been to tolerate crash failures. It is thus natural to ask “If we want to compute a certain function in a fault-tolerant way, what will the communication complexity be?” This natural question, interestingly, has not been formally posed and thoroughly studied prior to this work. Whether fault-tolerant communication complexity is interesting to study largely depends on how big a difference failures make. This paper proves that the impact of failures is significant, at least for the SUM aggregation function in general networks: As our central contribution, we prove that there exists (at least) an exponential gap between the non-fault-tolerant and fault-tolerant communication complexity of SUM. Our results also imply the optimality (within polylog factors) of some recent fault...
Binbin Chen, Haifeng Yu, Yuda Zhao, Phillip B. Gib