A model for shared-memory systems commonly (and often implicitly) assumed by programmers is that of sequential consistency. For implementing sequential consistency in a cache-based system, it is widely believed that (1) implementing strong ordering is sufficient and (2) restricting a processor to one shared-memory reference at a time is practically necessary. In this paper we show that both beliefs are false. First, we prove that (1) is false with a counter-example. Second, we argue that (2) is false by giving sufficient conditions and an implementation that allow a processor to have simultaneous incomplete shared-memory references. While we do not demonstrate that this implementation is superior, we do believe it is practical and worthy of consideration.
Sarita V. Adve, Mark D. Hill
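
To make the notion of sequential consistency referred to in the abstract concrete, the following minimal sketch enumerates every interleaving of two short programs that preserves each processor's program order and collects the results each interleaving can produce. The two-processor test program, the variable names X, Y, r1, r2, and the helper functions are illustrative assumptions and do not come from the paper; the sketch models only the programmer-visible definition, not the cache-based implementation or the counter-example the paper presents.

```python
from itertools import combinations

# Hypothetical litmus test (an assumption for illustration, not from the paper):
#   P1: X = 1; r1 = Y        P2: Y = 1; r2 = X
# Both X and Y start at 0.
P1 = [("write", "X", 1), ("read", "Y", "r1")]
P2 = [("write", "Y", 1), ("read", "X", "r2")]


def interleavings(a, b):
    """Yield every merge of a and b that preserves each list's own order."""
    n = len(a) + len(b)
    for positions in combinations(range(n), len(a)):
        slots = set(positions)          # indices taken by operations of a
        ia, ib = iter(a), iter(b)
        yield [next(ia) if i in slots else next(ib) for i in range(n)]


def run(schedule):
    """Execute one total order of operations against a single shared memory."""
    memory = {"X": 0, "Y": 0}
    regs = {}
    for op, loc, arg in schedule:
        if op == "write":
            memory[loc] = arg           # arg is the value to store
        else:
            regs[arg] = memory[loc]     # arg is the destination register
    return regs["r1"], regs["r2"]


def sc_outcomes():
    """Set of (r1, r2) results reachable under sequential consistency."""
    return {run(s) for s in interleavings(P1, P2)}


if __name__ == "__main__":
    print(sorted(sc_outcomes()))
```

Under these assumptions the result (r1, r2) = (0, 0) never appears, because any single total order places at least one write before the other processor's read. A system that lets a processor's read complete before its own earlier write is visible to others could produce that result and would therefore not be sequentially consistent for this program.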