Fault tolerance is an important issue for large machines with tens or hundreds of thousands of processors. Checkpoint-based methods, currently used on most machines, rollback all ...
Distributed systems are becoming a popular way of implementing many embedded computing applications, automotive control being a common and important example. Such embedded systems...
We present an adaptive fault-tolerant wormhole routing algorithm for 2D meshes. The main feature is that with the algorithm, a normal routing message, when blocked by some faulty ...
A new fault tolerant architecture that provides tolerance to a broad scope of hardware, software, and communications faults is being developed. This architecture relies on widely ...