In today’s high-performance computing practice, fail-stop failures are often tolerated by checkpointing. While checkpointing is a very general technique and can often be applied...
Exception handling is one of the popular means used for improving dependability and supporting recovery in the Service-Oriented Architecture (SOA). This practical experience paper ...
Anatoliy Gorbenko, Alexander Romanovsky, Vyachesla...
An Operational Information System (OIS) supports a real-time view of an organization’s information critical to its logistical business operations. A central component of an OIS ...
To address the limitations of centralized shared storage for cloud computing, we are building Lithium, a distributed storage system designed specifically for virtualization workl...
Log-based recovery and replay systems are important for system reliability, debugging and postmortem analysis/recovery of malware attacks. These systems must incur low space and p...
Daniela A. S. de Oliveira, Jedidiah R. Crandall, G...