Abstract—Multi-core systems are now extremely common in modern clusters. In the past, commodity systems may have had up to two or four CPUs per compute node. Nodes in modern clusters still have the same number of CPUs; however, these CPUs have moved from single-core to quad-core, and further advances are imminent. To obtain the best performance, compute nodes in a cluster are connected with high-performance interconnects. On nearly all clusters, the number of network interfaces per node is the same on current multi-core systems as it was when there were fewer cores per node. Although network bandwidth has increased alongside the shift to multi-core, severe network contention still exists for some application patterns. In this work, we propose mixed-workload (non-exclusive) scheduling of jobs to increase network efficiency and reduce contention. As a case study, we use Message Passing Interface (MPI) programs on the InfiniBand interconnect. We show through detailed profilin...
Matthew J. Koop, Miao Luo, Dhabaleswar K. Panda