To reduce the overhead of synchronizing operations on shared-memory multiprocessors, this paper proposes a mechanism, named specMEM, to execute memory accesses following ...
Scalability of applications on distributed shared-memory (DSM) multiprocessors is limited by communication overheads. At some point, using more processors to increase parallelism y...
Khaled Z. Ibrahim, Gregory T. Byrd, Eric Rotenberg
This paper describes a new approach to finding performance bottlenecks in shared-memory parallel programs and its embodiment in the Paradyn Parallel Performance Tools running with...
Future CMPs will combine many simple cores with deep cache hierarchies. With more cores, cache resources per core are fewer, and must be shared carefully to avoid poor utilization...
Junli Gu, Steven S. Lumetta, Rakesh Kumar, Yihe Su...
Detecting data races in parallel programs is important for both software development and production-run diagnosis. Recently, there have been several proposals for hardware-assiste...