Improving cache locality for thread-level speculation

With the advent of chip-multiprocessors (CMPs), Thread-Level Speculation (TLS) remains a promising technique for exploiting this highly multithreaded hardware to improve the performance of an individual program. However, with such speculatively-parallel execution the cache locality once enjoyed by the original uniprocessor execution is significantly disrupted: for TLS execution on a four-processor CMP, we find that the data-cache miss rates are nearly four times those of the uniprocessor case, even though TLS execution utilizes four private data caches (i.e., four-fold greater cache capacity). We break down the TLS cache locality problem into instruction and data cache, execution stages, and parallel access patterns, and propose methods to improve cache locality in each of these areas. We find that for parallel regions across 13 SPECint applications our simple and low-cost techniques reduce data-cache misses by 38%, improve performance by 12.8%, and significantly improve scalabili...
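The locality loss described in the abstract can be illustrated with a toy model. The sketch below is not the paper's simulator or methodology: it assumes a direct-mapped cache, a loop whose iteration i reads array element i, and round-robin assignment of iterations to processors. The names `DirectMappedCache` and `miss_rate` and all size constants are hypothetical choices made for illustration. Under these assumptions, neighbouring elements that share one cache line on a uniprocessor are scattered across the private caches, so each cache fetches every line itself.

```python
from typing import List

# All sizes below are illustrative assumptions, not values from the paper.
LINE_BYTES = 32   # assumed cache-line size in bytes
NUM_LINES = 256   # assumed number of lines in each private cache
ELEM_BYTES = 4    # assumed size of one array element in bytes


class DirectMappedCache:
    """Toy direct-mapped data cache that only counts accesses and misses."""

    def __init__(self, num_lines: int = NUM_LINES, line_bytes: int = LINE_BYTES) -> None:
        self.num_lines = num_lines
        self.line_bytes = line_bytes
        self.tags: List[int] = [-1] * num_lines  # -1 marks an empty line
        self.accesses = 0
        self.misses = 0

    def access(self, addr: int) -> None:
        line = addr // self.line_bytes   # which memory line holds this address
        index = line % self.num_lines    # which cache slot that line maps to
        self.accesses += 1
        if self.tags[index] != line:
            self.misses += 1
            self.tags[index] = line


def miss_rate(num_cpus: int, num_iters: int) -> float:
    """Run a loop where iteration i reads element i of an array.

    Iteration i is assigned to CPU i % num_cpus (an assumed round-robin
    schedule), and each CPU has its own private cache.
    """
    caches = [DirectMappedCache() for _ in range(num_cpus)]
    for i in range(num_iters):
        caches[i % num_cpus].access(i * ELEM_BYTES)
    total_misses = sum(c.misses for c in caches)
    total_accesses = sum(c.accesses for c in caches)
    return total_misses / total_accesses


if __name__ == "__main__":
    print("1 CPU :", miss_rate(1, 4096))
    print("4 CPUs:", miss_rate(4, 4096))
```

In this toy setting the single cache misses once per 8-element line, while each of the four private caches misses once for every two elements it touches. The exact ratio is an artifact of the assumed line size and schedule; it should not be read as a reproduction of the measurement reported in the abstract.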

Stanley L. C. Fung, J. Gregory Steffan

Cache Locality | Distributed And Parallel Computing | IPPS 2006 | TLS Cache Locality | TLS Execution |

» Speculative prefetching of optional locks in distributed systems

» An Adaptive Cache Coherence Protocol Optimized for Producer-Consumer Sharing

» An Advanced Optimizer for the IA64 Architecture

Post Info

Added	12 Jun 2010
Updated	12 Jun 2010
Type	Conference
Year	2006
Where	IPPS
Authors	Stanley L. C. Fung, J. Gregory Steffan

Sciweavers
