The disparity between microprocessor clock frequencies and memory latency is a primary reason why many demanding applications run well below peak achievable performance. Software c...
Joseph Gebis, Leonid Oliker, John Shalf, Samuel Wi...
The presented image registration method uses a regularized gradient flow to correlate the intensities in two images. Thereby, an energy functional is successively minimized by des...
Many emerging processor microarchitectures seek to manage technological constraints (e.g., wire delay, power, and circuit complexity) by resorting to nonuniform designs that provi...
This paper presents the Finished Store Buffer (or FSB), an alternative and position-insensitive approach for building a scalable store buffer for an out-of-order processor. Exploi...
Abstract This paper examines the behavior of current and next generation microprocessors' fetch engines while running Decision Support Systems (DSS) workloads. We analyze the ...