Modern out-of-order processors with non-blocking caches exploit Memory-Level Parallelism (MLP) by overlapping cache misses within a wide instruction window. The exploitation of MLP, however, can be limited by long-latency operations that produce the base address of a cache-miss load. When the parent instruction is itself a cache-miss load, the two loads must be serialized to satisfy the load-load data dependence. In this paper, we propose a mechanism that captures such load-load data dependences at runtime. A special Preload is issued in place of the dependent load without waiting for the parent load, effectively overlapping the two loads. The Preload carries the information the memory controller needs to compute the dependent load's address as soon as the parent's data becomes available, eliminating any interconnect delay between the two loads. Performance evaluations based on SPEC2000 and Olden applications show significant speedups of up to 40%.
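To make the load-load dependence concrete, the sketch below (not from the paper; the `node` type and `sum_list` function are illustrative assumptions) shows a pointer-chasing loop in which the address of each load is produced by the data returned by the preceding load. On a conventional out-of-order core, two consecutive misses in this chain cannot overlap; the proposed Preload would instead hand the dependence to the memory controller so the second access can begin as soon as the parent's data arrives.

```c
#include <stddef.h>

/* Illustrative linked-list node: the pointer loaded from one node
 * supplies the address of the next node's load. */
struct node {
    struct node *next;
    int payload;
};

/* Pointer-chasing traversal typical of Olden-style workloads.
 * The load of p->next is the "parent" load; the load of
 * p->payload (and the next iteration's p->next) is the
 * "dependent" load whose address is the parent's data, so
 * cache misses along the chain serialize. */
int sum_list(const struct node *head)
{
    int sum = 0;
    for (const struct node *p = head; p != NULL; p = p->next) {
        sum += p->payload;   /* address produced by the previous load */
    }
    return sum;
}
```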