The probability of failures in software distributed shared memory (SDSM) increases as the system size grows. This paper introduces a new, efficient message logging technique, call...
We develop an availability solution, called SafetyNet, that uses a unified, lightweight checkpoint/recovery mechanism to support multiple long-latency fault detection schemes. At...
Daniel J. Sorin, Milo M. K. Martin, Mark D. Hill, ...
Advances in circuit and integration technology are continuously boosting the speed of microprocessors. One of the main challenges presented by such developments is the effective u...
Today, mobile smartphones are expected to be able to run the same complex, memory-intensive applications that were originally designed and coded for general-purpose processors. Ho...
— Heterogeneous Chip Multiprocessors (HMPs), such as the Cell Broadband Engine, offer a new design optimization opportunity by allowing designers to provide accelerators for appl...