This paper shows how a state-of-the-art software distributed shared-memory (DSM) protocol can be efficiently extended to tolerate single-node failures. In particular, we extend a ...
In recent years the application space of reconfigurable devices has grown to include many platforms with a strong need for fault tolerance. While these systems frequently contain ...
Many existing clusters use inexpensive Gigabit Ethernet and often have multiple interfaces cards to improve bandwidth and enhance fault tolerance. We investigate the use of Concurr...
Brad Penoff, Mike Tsai, Janardhan R. Iyengar, Alan...