Symmetric Multiprocessors (SMPs), combined with modern interconnection technologies, are commonly used to build cost-effective compute clusters. However, contention among processors for access to shared resources, such as the main memory bus and the NIC, can significantly limit their efficiency. In this paper, we first demonstrate experimentally the effect of resource contention on the total execution time of applications. We then present the design and implementation of an informed gang-like scheduling algorithm aimed at improving the throughput of multiprogrammed workloads on clusters of SMPs. Our algorithm selects the processes to be coscheduled so as neither to saturate nor to underutilize the memory bus or network link bandwidth. Its input data are acquired dynamically using hardware monitoring counters and a modified Myrinet NIC firmware, without any modifications to existing application binaries. Experimental evaluation shows that throughput can improve by up to 40-48% compar...