Clusters of commodity machines have become a popular way of building cheap high-performance parallel computers. Many of these designs rely on standard Ethernet networks as the system interconnect. We have profiled the performance of some standard message-passing communication on commodity clusters using MPIBench, a tool for benchmarking the performance of MPI routines that uses a highly accurate, globally synchronised clock. The results suggest that existing methodologies of performance characterisation are inadequate. Tests were performed on two clusters: one with a conventional network architecture of switches connected via a high-bandwidth backbone, the other with a tetrahedral network topology that potentially provides lower contention and higher bandwidth. Where packet loss does not occur, performance in either system is good and degrades smoothly with load. However, packet loss is found to occur at any load and the consequent invocation of the TCP/IP timeout and congestion co...
Francis Vaughan, Duncan A. Grove, Paul D. Coddingt