Shared last-level TLBs for chip multiprocessors



Translation Lookaside Buffers (TLBs) are critical to processor performance. Much past research has addressed uniprocessor TLBs, lowering access times and miss rates. However, as chip multiprocessors (CMPs) become ubiquitous, TLB design must be re-evaluated. This paper is the first to propose and evaluate shared last-level (SLL) TLBs as an alternative to the commercial norm of private, per-core L2 TLBs. SLL TLBs eliminate 7-79% of system-wide misses for parallel workloads. This is an average of 27% better than conventional private, per-core L2 TLBs, translating to notable runtime gains. SLL TLBs also provide benefits comparable to recently-proposed Inter-Core Cooperative (ICC) TLB prefetchers, but with considerably simpler hardware. Furthermore, unlike these prefetchers, SLL TLBs can aid sequential applications, eliminating 35-95% of the TLB misses for various multiprogrammed combinations of sequential applications. This corresponds to a 21% average increase in TLB miss eliminations ...
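To make the contrast between private per-core L2 TLBs and a shared last-level TLB concrete, the following is a minimal toy model, not the authors' simulator or configuration. It assumes fully associative structures with LRU replacement, a single shared address space (so a virtual page number alone identifies a translation), and equal total capacity in both organizations. The names `LRUTLB`, `private_misses`, `shared_misses`, and the synthetic `trace` are hypothetical and chosen only for illustration.

```python
from collections import OrderedDict


class LRUTLB:
    """Toy fully associative TLB with LRU replacement (an assumption)."""

    def __init__(self, entries):
        self.entries = entries
        self.pages = OrderedDict()  # virtual page -> placeholder translation
        self.misses = 0

    def access(self, page):
        if page in self.pages:
            self.pages.move_to_end(page)  # mark as most recently used
            return True
        self.misses += 1
        self.pages[page] = True
        if len(self.pages) > self.entries:
            self.pages.popitem(last=False)  # evict least recently used
        return False


def private_misses(trace, num_cores, entries_per_core):
    """Total misses when every core has its own TLB."""
    tlbs = [LRUTLB(entries_per_core) for _ in range(num_cores)]
    for core, page in trace:
        tlbs[core].access(page)
    return sum(t.misses for t in tlbs)


def shared_misses(trace, num_cores, entries_per_core):
    """Total misses when all cores share one TLB of the same total size."""
    tlb = LRUTLB(num_cores * entries_per_core)
    for _core, page in trace:
        tlb.access(page)
    return tlb.misses


# Synthetic parallel workload: 4 threads touch the same 8 pages, twice.
trace = [(core, page)
         for _ in range(2)
         for page in range(8)
         for core in range(4)]

print(private_misses(trace, num_cores=4, entries_per_core=8))  # 32
print(shared_misses(trace, num_cores=4, entries_per_core=8))   # 8
```

In this synthetic trace each private TLB takes its own cold miss on every shared page, while the shared structure takes each miss once for all cores. Real workloads share far less uniformly, and the model ignores L1 TLBs, set associativity, and the longer access latency of a shared structure, so the numbers illustrate the mechanism only.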

Abhishek Bhattacharjee, Daniel Lustig, Margaret Martonosi


Distributed And Parallel Computing | HPCA 2011 | Per-core L2 TLBs | Sequential Applications | SLL TLBs


Post Info

Added	20 Aug 2011
Updated	20 Aug 2011
Type	Conference
Year	2011
Where	HPCA
Authors	Abhishek Bhattacharjee, Daniel Lustig, Margaret Martonosi
