The load/store queue (LSQ) is one of the most complex parts of contemporary processors. Its latency is critical for the processor performance and it is usually one of the processo...
Moore’s Law and the drive towards performance efficiency have led to the on-chip integration of general-purpose cores with special-purpose accelerators. Pangaea is a heterogeneo...
Henry Wong, Anne Bracy, Ethan Schuchman, Tor M. Aa...
Atomic blocks allow programmers to delimit sections of code as ‘atomic’, leaving the language’s implementation to enforce atomicity. Existing work has shown how to implement...
Timothy L. Harris, Mark Plesko, Avraham Shinnar, D...
As Chip Multiprocessor (CMP) has become the mainstream in processor architectures, Intel and AMD have introduced their dual-core processors to the PC market. In this paper, perfor...
Lu Peng, Jih-Kwon Peir, Tribuvan K. Prakash, Yen-K...
Modern computer systems are called on to deal with billions of events every second, whether they are instructions executed, memory locations accessed, or packets forwarded. This p...