Sciweavers

ISCA
1995
IEEE
121views Hardware» more  ISCA 1995»
14 years 3 months ago
A Comparative Analysis of Schemes for Correlated Branch Prediction
Modern high-performance architectures require extremely accurate branch prediction to overcome the performance limitations of conditional branches. We present a framework that cat...
Cliff Young, Nicholas C. Gloy, Michael D. Smith
ISCA
1995
IEEE
98views Hardware» more  ISCA 1995»
14 years 3 months ago
Instruction Fetching: Coping with Code Bloat
Previous research has shown that the SPEC benchmarks achieve low miss ratios in relatively small instruction caches. This paper presents evidence that current software-development...
Richard Uhlig, David Nagle, Trevor N. Mudge, Stuar...
ISCA
1995
IEEE
93views Hardware» more  ISCA 1995»
14 years 3 months ago
Optimizing Memory System Performance for Communication in Parallel Computers
Communicationin aparallel systemfrequently involvesmoving data from the memory of one node to the memory of another; this is the standard communication model employedin message pa...
Thomas Stricker, Thomas R. Gross
ISCA
1995
IEEE
110views Hardware» more  ISCA 1995»
14 years 3 months ago
Instruction Cache Fetch Policies for Speculative Execution
Current trends in processor design are pointing to deeper and wider pipelines and superscalar architectures. The efficient use of these resources requires speculative execution, ...
Dennis Lee, Jean-Loup Baer, Brad Calder, Dirk Grun...
ISCA
1995
IEEE
147views Hardware» more  ISCA 1995»
14 years 3 months ago
Dynamic Self-Invalidation: Reducing Coherence Overhead in Shared-Memory Multiprocessors
This paper introduces dynamic self-invalidation (DSI), a new technique for reducing cache coherence overhead in shared-memory multiprocessors. DSI eliminates invalidation messages...
Alvin R. Lebeck, David A. Wood
ISCA
1995
IEEE
92views Hardware» more  ISCA 1995»
14 years 3 months ago
A Comparison of Full and Partial Predicated Execution Support for ILP Processors
One can e ectively utilize predicated execution to improve branch handling in instruction-level parallel processors. Although the potential bene ts of predicated execution are hig...
Scott A. Mahlke, Richard E. Hank, James E. McCormi...
ISCA
1995
IEEE
118views Hardware» more  ISCA 1995»
14 years 3 months ago
The EM-X Parallel Computer: Architecture and Basic Performance
Latency tolerance is essential in achieving high performance on parallel computers for remote function calls and fine-grained remote memory accesses. EM-X supports interprocessor ...
Yuetsu Kodama, Hirohumi Sakane, Mitsuhisa Sato, Ha...
ISCA
1995
IEEE
133views Hardware» more  ISCA 1995»
14 years 3 months ago
Performance Evaluation of the PowerPC 620 Microarchitecture
The PowerPC 620TM microprocessor1 is the most recent and performance leading member of the PowerPCTM family. The 64-bit PowerPC 620 microprocessor employs a two-phase branch predi...
Trung A. Diep, Christopher Nelson, John Paul Shen
ISCA
1995
IEEE
109views Hardware» more  ISCA 1995»
14 years 3 months ago
Next Cache Line and Set Prediction
Accurate instruction fetch and branch prediction is increasingly important on today’s wide-issue architectures. Fetch prediction is the process of determining the next instructi...
Brad Calder, Dirk Grunwald
ISCA
1995
IEEE
110views Hardware» more  ISCA 1995»
14 years 3 months ago
Optimization of Instruction Fetch Mechanisms for High Issue Rates
Recent superscalar processors issue four instructions per cycle. These processors are also powered by highly-parallel superscalar cores. The potential performance can only be expl...
Thomas M. Conte, Kishore N. Menezes, Patrick M. Mi...