In this work we investigate how the compiler technique of message strip mining performs in practice on contemporary high-performance networks. Message strip mining attempts to reduce the overall cost of communication in parallel programs by breaking up large message transfers into smaller ones that can be overlapped with computation. In practice, however, network resource constraints may negate the expected performance gains. By developing a performance model and synthetic benchmarks, we determine how network and application characteristics influence the applicability of this optimization. We use these findings to derive heuristics for applying this optimization to parallel programs. We propose strip mining with variable block size as an alternative strategy that performs almost as well as a highly tuned fixed-block strategy and has the advantage of being performance portable across systems and application input sets. We evaluate both techniques using synthetic benchmar...
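
As a rough illustration of the trade-off described above, the following Python sketch compares a single blocking transfer followed by computation against a strip-mined version in which the transfer of one strip overlaps with computation on the previous one. It is not the performance model derived in this work. The function names and the cost parameters (`overhead`, `byte_time`, `compute_per_byte`) are hypothetical, and the model assumes that transfers are serialized on the network and that each message pays a fixed overhead.

```python
def blocking_time(n_bytes, overhead, byte_time, compute_per_byte):
    """Transfer the whole message, then compute on it (no overlap)."""
    return overhead + n_bytes * byte_time + n_bytes * compute_per_byte


def strip_mined_time(n_bytes, strip_bytes, overhead, byte_time, compute_per_byte):
    """Toy pipeline: strip i+1 is in flight while strip i is computed on.

    Assumptions (illustrative only): every strip pays a fixed per-message
    overhead, strips travel one after another, and computation on a strip
    can start only once that strip has fully arrived.
    """
    if strip_bytes <= 0:
        raise ValueError("strip_bytes must be positive")

    arrive = 0.0  # time at which the current strip has fully arrived
    done = 0.0    # time at which computation on all earlier strips finished
    remaining = n_bytes
    while remaining > 0:
        size = min(strip_bytes, remaining)
        remaining -= size
        arrive += overhead + size * byte_time
        # Computation waits for both the data and the previous strip's work.
        done = max(done, arrive) + size * compute_per_byte
    return done


if __name__ == "__main__":
    # Assumed example costs: 10 us per message, 1 ns per byte to transfer,
    # 1 ns per byte to compute.
    costs = dict(overhead=10e-6, byte_time=1e-9, compute_per_byte=1e-9)
    n = 1_000_000
    print("blocking:", blocking_time(n, **costs))
    for strip in (100, 10_000, 100_000, n):
        print("strip", strip, ":", strip_mined_time(n, strip, **costs))
```

Under these assumed costs, moderately sized strips finish sooner than the single blocking transfer, while very small strips finish later because the fixed overhead is paid once per strip. This is one simple way in which the expected gains can be lost.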