Irregular parallel algorithms pose a significant challenge for achieving high performance because of the difficulty predicting memory access patterns or execution paths. Within an...
Replication is one of the prominent approaches for obtaining fault tolerance. Implementing replication on commodity hardware and in a transparent fashion, i.e., without changing t...
-- A hardware fault tolerance scheme for large multicomputers executing time-consuming non-interactive applications is described. Error detection and recovery are done mostly by so...