Tolerating Branch Predictor Latency on SMT



Abstract. Simultaneous Multithreading (SMT) tolerates latency by executing instructions from multiple threads: if one thread stalls, other threads can use the resources. However, fetch stalls caused by multi-cycle branch predictors prevent SMT from achieving its full potential performance, since the flow of fetched instructions is halted. This paper proposes and evaluates solutions for dealing with branch predictor delay on SMT. Our contribution is twofold: we describe a decoupled implementation of the SMT fetch unit, and we propose an inter-thread pipelined branch predictor implementation. These techniques prove effective at tolerating the branch predictor access latency.
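The intuition behind interleaving threads through a multi-cycle predictor can be illustrated with a toy model. The sketch below is not the paper's design or simulator; the function name `predictions_completed` and all of its assumptions (a fully pipelined predictor that accepts one request per cycle, at most one prediction in flight per thread, round-robin thread selection) are hypothetical and chosen only for illustration.

```python
def predictions_completed(num_threads: int, latency: int, cycles: int) -> int:
    """Toy model: count predictions that complete within `cycles` cycles.

    Assumptions (illustrative, not taken from the paper):
    - the predictor is fully pipelined and accepts one request per cycle;
    - each thread has at most one prediction in flight, because its next
      fetch address depends on the outcome of the previous prediction;
    - ready threads are served in round-robin order.
    """
    ready_at = [0] * num_threads  # cycle at which each thread may issue again
    completed = 0
    next_thread = 0
    for cycle in range(cycles):
        # Find the next ready thread in round-robin order, if any.
        for offset in range(num_threads):
            t = (next_thread + offset) % num_threads
            if ready_at[t] <= cycle:
                done = cycle + latency
                ready_at[t] = done
                if done <= cycles:
                    completed += 1
                next_thread = (t + 1) % num_threads
                break
    return completed


if __name__ == "__main__":
    # Compare a single thread against several threads sharing the predictor.
    for threads in (1, 2, 3):
        print(threads, predictions_completed(threads, latency=3, cycles=12))
```

In this model a single thread leaves the predictor idle while it waits for each result, whereas a number of threads at least equal to the predictor latency keeps a new prediction entering the pipeline every cycle.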

Ayose Falcón, Oliverio J. Santana, Alex Ramírez, Mateo Valero

Branch Predictor | Branch Predictor Delay | Distributed And Parallel Computing | ISHPC 2003 | Predictors Prevent Smt |

Post Info

Added	07 Jul 2010
Updated	07 Jul 2010
Type	Conference
Year	2003
Where	ISHPC
Authors	Ayose Falcón, Oliverio J. Santana, Alex Ramírez, Mateo Valero
