Parallel machines are typically space shared, or time shared such that only one application executes on a group of nodes at any given time. It is generally assumed that running multiple parallel applications simultaneously on a group of independently scheduled nodes is inefficient because of their synchronization requirements. The central contribution of this paper is to demonstrate that, on small compute clusters, the performance of parallel applications under sharing is typically competitive whether scheduling is independent or coordinated (gang). Uncoordinated scheduling introduces a modest overhead, but it is often compensated for by better sharing of resources. The impact of sharing was studied for different numbers of nodes and threads and for different memory and CPU requirements of the competing applications. The significance of the CPU time slice, a key parameter in CPU scheduling, was also studied. Application characteristics and operating system scheduling policies are identified as the main f...
M. Ghanesh, S. Kumar, Jaspal Subhlok
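The intuition behind the gang-scheduling assumption can be illustrated with a toy simulation (not the paper's methodology; the model, function name, and parameters below are illustrative assumptions). Each node alternates its CPU between the parallel application and one competing application in fixed time slices; a barrier completes only when the slowest node has accumulated the required CPU time. Gang scheduling aligns the application's slices across nodes, while independent scheduling gives each node a random phase offset, so the barrier waits on the worst-aligned node:

```python
import random

def barrier_time(num_nodes, work, quantum, gang, trials=2000, seed=0):
    """Average wall time for every node to accumulate `work` units of CPU
    when each node grants the application every other `quantum` (one
    competing application). gang=True aligns slices across nodes;
    otherwise each node starts its slice at a random phase offset."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        worst = 0.0
        for _node in range(num_nodes):
            offset = 0.0 if gang else rng.uniform(0, 2 * quantum)
            # The app runs in windows [offset + 2kQ, offset + (2k+1)Q).
            full, rem = divmod(work, quantum)
            if rem > 0:
                finish = offset + full * 2 * quantum + rem
            else:
                finish = offset + (full - 1) * 2 * quantum + quantum
            worst = max(worst, finish)  # barrier waits for the slowest node
        total += worst
    return total / trials

gang_time = barrier_time(num_nodes=8, work=10, quantum=5, gang=True)
indep_time = barrier_time(num_nodes=8, work=10, quantum=5, gang=False)
```

In this sketch, independent scheduling adds at most one extra scheduling period of waiting per barrier, which is the "modest overhead" regime the abstract describes; shrinking the quantum relative to the work between barriers shrinks the relative penalty, hinting at why the time-slice length matters.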