Reducing Ownership Overhead for Load-Store Sequences in Cache-Coherent Multiprocessors

Parallel programs that modify shared data in a cache-coherent multiprocessor with a write-invalidate coherence protocol create ownership overhead in the form of ownership acquisitions at writes to shared data. This can have a significant impact on performance in a cache-coherent non-uniform memory architecture (NUMA) multiprocessor. By combining a read request and an ownership acquisition, the write latency and network traffic can potentially be reduced. In this paper we propose a new hardware-based approach for performing this optimization by targeting load-store sequences, which we show are a superset of migratory sharing. A load-store sequence consists of a global read request followed by a global write action to the same memory location from the same processor, without any intervening access to the same block from any other processor. We use detailed simulation with four benchmark programs including one on-line transaction processing (OLTP) workload and operating system execution...
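The definition of a load-store sequence above can be illustrated with a small trace-scanning sketch. This is not the hardware mechanism the paper proposes; it only counts sequences that match the definition in a software trace. The `Access` record, the `"R"`/`"W"` encoding, and the function name are hypothetical, every access in the trace is assumed to be a global action, and "same memory location" is simplified to "same block".

```python
from typing import Dict, Iterable, NamedTuple


class Access(NamedTuple):
    """One global memory action in a trace (hypothetical record format)."""
    cpu: int    # processor issuing the access
    op: str     # "R" = global read request, "W" = global write action
    block: int  # block address (assumption: location == block)


def count_load_store_sequences(trace: Iterable[Access]) -> int:
    """Count reads followed by a write to the same block from the same
    processor with no intervening access to that block by another one."""
    # block -> processor whose most recent global read is still unbroken
    pending_reader: Dict[int, int] = {}
    count = 0
    for access in trace:
        if access.op == "W":
            # The write completes a sequence only if the last access to
            # this block was a read by the same processor.
            if pending_reader.get(access.block) == access.cpu:
                count += 1
            # Any write ends the candidate sequence for this block.
            pending_reader.pop(access.block, None)
        elif access.op == "R":
            # A read by another processor breaks the earlier candidate
            # and starts a new one for the reading processor.
            pending_reader[access.block] = access.cpu
        else:
            raise ValueError(f"unknown operation: {access.op!r}")
    return count
```

Under these assumptions, a migratory pattern in which each processor in turn reads and then writes the same block counts one load-store sequence per processor, which is consistent with the claim that load-store sequences include migratory sharing.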

Jim Nilsson, Fredrik Dahlgren

Cache-coherent Non-uniform Memory | Distributed And Parallel Computing | IPPS 2000 | Ownership Acquisition | Protocol Create Ownership

Post Info

Added	31 Jul 2010
Updated	31 Jul 2010
Type	Conference
Year	2000
Where	IPPS
Authors	Jim Nilsson, Fredrik Dahlgren
