In this paper, we analyze limitations of traditional communication performance models that affect the accuracy of analytical predictions of the execution time of collective communication operations on homogeneous and heterogeneous clusters. In particular, we show that the constant and variable contributions of processors and network are not fully separated in these models. Full separation of contributions that differ in nature and arise from different sources would lead to more intuitive and accurate models, but the parameters of such models cannot be estimated from point-to-point experiments alone, which are what traditional models usually rely on. The paper presents such an intuitive and accurate point-to-point model and describes a set of communication experiments sufficient to estimate its parameters. It also presents an implementation of the new model in the form of a software tool that automates the estimation of both this model and heterogeneous extensions of tradit...
Alexey L. Lastovetsky, Vladimir Rychkov, Maureen O