As the number of devices available per chip continues to increase, the computational potential of future computer architectures grows likewise. While this is a clear benefit for f...
Over the last twenty years, the open source community has provided more and more software on which the world’s High Performance Computing (HPC) systems depend for performance ...
Jack Dongarra, Peter H. Beckman, Terry Moore, Patr...
Transactional memories are typically speculative and rely on contention managers to cure conflicts. This paper explores a complementary approach that prevents conflicts by schedul...
Aleksandar Dragojevic, Rachid Guerraoui, Anmol V. ...
Allowing loads to issue out-of-order with respect to earlier unresolved store addresses is very important for extracting parallelism in large-window superscalar processors. Blindl...
Inherent within complex instruction set architectures such as x86 are inefficiencies that do not exist in simpler ISAs. Modern x86 implementations decode instructions into one o...
Brian Slechta, David Crowe, Brian Fahs, Michael Fe...