Sciweavers

MICRO
2005
IEEE
ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing
As more data value speculation mechanisms are being proposed to speed up processors, there is growing pressure on the critical processor structures that must buffer the state of t...
Smruti R. Sarangi, Wei Liu, Yuanyuan Zhou
MICRO
2005
IEEE
A Criticality Analysis of Clustering in Superscalar Processors
Clustered machines partition hardware resources to circumvent the cycle time penalties incurred by large, monolithic structures. This partitioning introduces a long inter-cluster ...
Pierre Salverda, Craig B. Zilles
MICRO
2005
IEEE
A Quantum Logic Array Microarchitecture: Scalable Quantum Data Movement and Computation
Recent experimental advances have demonstrated technologies capable of supporting scalable quantum computation. A critical next step is how to put those technologies together into...
Tzvetan S. Metodi, Darshan D. Thaker, Andrew W. Cr...
MICRO
2005
IEEE
Dynamic Helper Threaded Prefetching on the Sun UltraSPARC CMP Processor
Data prefetching via helper threading has been extensively investigated on Simultaneous MultiThreading (SMT) or Virtual Multi-Threading (VMT) architectures. Although reportedly la...
Jiwei Lu, Abhinav Das, Wei-Chung Hsu, Khoa Nguyen,...
MICRO
2005
IEEE
Automatic Thread Extraction with Decoupled Software Pipelining
Until recently, a steadily rising clock rate and other uniprocessor microarchitectural improvements could be relied upon to consistently deliver increasing performance for a wide ...
Guilherme Ottoni, Ram Rangan, Adam Stoler, David I...
MICRO
2005
IEEE
How to Fake 1000 Registers
Large numbers of logical registers can improve performance by allowing fast access to multiple subroutine contexts (register windows) and multiple thread contexts (multithreading)...
David W. Oehmke, Nathan L. Binkert, Trevor N. Mudg...
MICRO
2005
IEEE
Exploiting Vector Parallelism in Software Pipelined Loops
An emerging trend in processor design is the addition of short vector instructions to general-purpose and embedded ISAs. Frequently, these extensions are employed using traditiona...
Samuel Larsen, Rodric M. Rabbah, Saman P. Amarasin...
MICRO
2005
IEEE
Shader Performance Analysis on a Modern GPU Architecture
This paper presents an analysis of the performance of the shader processing units in a modern Graphics Processor Unit (GPU) architecture using real graphic applications. The archi...
Victor Moya Del Barrio, Carlos González, Jo...
MICRO
2005
IEEE
Thermal Management of On-Chip Caches Through Power Density Minimization
Various architectural power reduction techniques have been proposed for on-chip caches in the last decade. In this paper, we first show that these power reduction techniques can b...
Ja Chun Ku, Serkan Ozdemir, Gokhan Memik, Yehea I....