Scalable busy-wait synchronization algorithms are essential for achieving good parallel program performance on large scale multiprocessors. Such algorithms include mutual exclusion locks, reader-writer locks, and barrier synchronization. Unfortunately, scalable synchronization algorithms are particularly sensitive to the effects of multiprogramming: their performance degrades sharply when processors are shared among different applications, or even among processes of the same application. In this paper we describe the design and evaluation of scalable scheduler-conscious mutual exclusion locks, reader-writer locks, and barriers, and show that by sharing information across the kernel/application interface we can improve the performance of scheduler-oblivious implementations by more than an order of magnitude.
Robert W. Wisniewski, Leonidas I. Kontothanassis,