Sciweavers

41 search results - page 7 / 9
» The Latency Hiding Effectiveness of Decoupled Access Execute...
ARCS
2009
Springer
14 years 1 month ago
Improving Memory Subsystem Performance Using ViVA: Virtual Vector Architecture
The disparity between microprocessor clock frequencies and memory latency is a primary reason why many demanding applications run well below peak achievable performance. Software c...
Joseph Gebis, Leonid Oliker, John Shalf, Samuel Wi...
COMPUTING
2004
13 years 7 months ago
Image Registration by a Regularized Gradient Flow. A Streaming Implementation in DX9 Graphics Hardware
The presented image registration method uses a regularized gradient flow to correlate the intensities in two images. Thereby, an energy functional is successively minimized by des...
Robert Strzodka, Marc Droske, Martin Rumpf
ISCA
2002
IEEE
14 years 8 days ago
Slack: Maximizing Performance Under Technological Constraints
Many emerging processor microarchitectures seek to manage technological constraints (e.g., wire delay, power, and circuit complexity) by resorting to nonuniform designs that provi...
Brian A. Fields, Rastislav Bodík, Mark D. H...
ICCD
2007
IEEE
14 years 4 months ago
A position-insensitive finished store buffer
This paper presents the Finished Store Buffer (or FSB), an alternative and position-insensitive approach for building a scalable store buffer for an out-of-order processor. Exploi...
Erika Gunadi, Mikko H. Lipasti
EUROPAR
2000
Springer
13 years 11 months ago
On the Performance of Fetch Engines Running DSS Workloads
This paper examines the behavior of current and next generation microprocessors' fetch engines while running Decision Support Systems (DSS) workloads. We analyze the ...
Carlos Navarro, Alex Ramírez, Josep-Lluis L...