Due to cost, time, and flexibility constraints, simulators are often used to explore the design space when developing a new processor architecture, as well as when evaluating the ...
Inherent within complex instruction set architectures such as x86 are inefficiencies that do not exist in a simpler ISAs. Modern x86 implementations decode instructions into one o...
Brian Slechta, David Crowe, Brian Fahs, Michael Fe...
Predicated Execution can be used to alleviate the costs associated with frequently mispredicted branches. This is accomplished by trading the cost of a mispredicted branch for exe...
Originally developed to connect processors and memories in multicomputers, prior research and design of interconnection networks have focused largely on performance. As these netw...
Several manufacturers have recently announced the first simultaneous-multithreaded processors, both as single CPUs and as components of multi-CPU chips. All are small scale, compr...
With the scaling of technology and the need for higher performance and more functionality, power dissipation is becoming a major bottleneck for microprocessor designs. Pipeline ba...
Hai Li, Swarup Bhunia, Yiran Chen, T. N. Vijaykuma...
This paper identifies node affinity as an important property for scalable general-purpose locks. Nonuniform communication architectures (NUCAs), for example CCNUMAs built from a f...
We consider the impact of different communication architectures on the performability (performance + availability) of cluster-based servers. In particular, we use a combination of ...