For many programs, especially integer codes, untolerated load instruction latencies account for a significant portion of total execution time. In this paper, we present the desig...
Todd M. Austin, Dionisios N. Pnevmatikatos, Gurind...
Whole page relevance defines how well the surface-level representation of all elements on a search result page and the corresponding holistic attributes of the presentation respon...
Peter Bailey, Nick Craswell, Ryen W. White, Liwei ...
Instruction set extensions (ISEs) can accelerate embedded processor performance. Many algorithms for ISE generation have shown good potential; some of them have recently been expa...
Theo Kluter, Philip Brisk, Paolo Ienne, Edoardo Ch...
Due to the advent of multi-core processors and the consequent need for better concurrent programming abstractions, new synchronization paradigms have emerged. A promising one, kno...
Felipe Goldstein, Alexandro Baldassin, Paulo Cento...
Communication misses--those serviced by dirty data in remote caches--are a pressing performance limiter in shared-memory multiprocessors. Recent research has indicated that tempor...