Decreasing hardware reliability is expected to impede the exploitation of increasing integration projected by Moore's Law. There is much ongoing research on efficient fault t...
Man-Lap Li, Pradeep Ramachandran, Ulya R. Karpuzcu...
This paper presents a new conceptual model, the XBWModel. Distributed computing is becoming a cost effective way to implement safety critical control systems. To support the devel...
Today’s largest High Performance Computing (HPC) systems exceed one Petaflops (1015 floating point operations per second) and exascale systems are projected within seven years...
James Elliott, Kishor Kharbas, David Fiala, Frank ...
We define reactive simulatability for general asynchronous systems. Roughly, simulatability means that a real system implements an ideal system (specification) in a way that pre...
Large-scale parallel computing is relying increasingly on clusters with thousands of processors. At such large counts of compute nodes, faults are becoming common place. Current t...
Arun Babu Nagarajan, Frank Mueller, Christian Enge...