Concurrent multithreaded architectures exploit both instruction-level and thread-level parallelism through a combination of branch prediction and thread-level control speculation. ...
The backends of today’s Internet services rely heavily on caching at various layers both to provide faster service to common requests and to reduce load on back-end components. ...
Alexander Rasmussen, Emre Kiciman, V. Benjamin Liv...
We describe the design and implementation of Dynamo, a software dynamic optimization system that is capable of transparently improving the performance of a native instruction stre...
Vasanth Bala, Evelyn Duesterwald, Sanjeev Banerjia
Integer division, modulo, and remainder operations are expressive and useful operations. They are logical candidates to express complex data accesses such as the wrap-around behav...
Jeffrey Sheldon, Walter Lee, Ben Greenwald, Saman ...
Atomic operations are important building blocks in supporting general-purpose computing on graphics processing units (GPUs). For instance, they can be used to coordinate executi...