This paper describes a portable benchmark suite that assesses the ability of cluster networking hardware and software to overlap MPI communication and computation. The Communication Offload MPI-based Benchmark, or COMB, uses two methods to characterize the ability of messages to make progress concurrently with computational processing on the host processor(s). COMB measures the relationship between MPI communication bandwidth and host CPU availability.
William Lawry, Christopher Wilson, Arthur B. Macca