Load Miss Prediction - Exploiting Power Performance Trade-offs



Modern CPUs operate at GHz frequencies, but the latencies of memory accesses are still relatively large, on the order of hundreds of cycles. Deeper cache hierarchies with larger cache sizes can mask these latencies for codes with good data locality and reuse, such as structured dense matrix computations. However, cache hierarchies do not necessarily benefit sparse scientific computing codes, which tend to have limited data locality and reuse. We therefore propose a new memory architecture with a Load Miss Predictor (LMP), which includes a data bypass cache and a predictor table, to reduce access latencies by determining whether a load should bypass the main cache hierarchy and issue an early load to main memory. Our architecture uses the L2 (and lower caches) as a victim cache for data removed from our bypass cache. We use cycle-accurate simulations, with SimpleScalar and Wattch, to show that our LMP improves the performance of sparse codes, our application domain of interest, on a...
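The abstract does not say how the predictor table is organized, so the following is only an illustrative sketch of the general idea: a table consulted per load to decide whether to bypass the cache hierarchy. Everything in it is an assumption rather than the authors' design: the class name `LoadMissPredictor`, indexing by the load's program counter, the 2-bit saturating counters, the table size, and the threshold are all hypothetical choices made for the example.

```python
class LoadMissPredictor:
    """Toy miss predictor. The PC-indexed table of 2-bit saturating
    counters is an assumed organization, not taken from the paper."""

    def __init__(self, table_size=1024, threshold=2):
        self.table_size = table_size      # hypothetical number of entries
        self.threshold = threshold        # counter value at which we predict "miss"
        self.counters = [0] * table_size  # each entry ranges over 0..3

    def _index(self, pc):
        # Drop the low two bits (assuming 4-byte instructions), then wrap.
        return (pc >> 2) % self.table_size

    def predict_miss(self, pc):
        # True means: predict that this load will miss in the cache hierarchy.
        return self.counters[self._index(pc)] >= self.threshold

    def update(self, pc, missed):
        # Train the entry with the observed outcome, saturating at 0 and 3.
        i = self._index(pc)
        if missed:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def route_load(predictor, pc):
    # Predicted misses go to the bypass path (early load to main memory);
    # all other loads use the normal cache hierarchy.
    return "bypass" if predictor.predict_miss(pc) else "hierarchy"
```

Under these assumptions, a load that has missed twice in a row at the same program counter would be routed to the bypass path, and a later hit would move it back toward the normal hierarchy. A real design would also have to model the bypass cache itself and the use of L2 as a victim cache, which this sketch leaves out.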

Konrad Malkowski, Greg M. Link, Padma Raghavan, Mary Jane Irwin

Bypass Cache | Cache | Cache Hierarchies | Distributed And Parallel Computing | IPPS 2007 |

Post Info

Added	03 Jun 2010
Updated	03 Jun 2010
Type	Conference
Year	2007
Where	IPPS
Authors	Konrad Malkowski, Greg M. Link, Padma Raghavan, Mary Jane Irwin
