Large-scale parallel computing is relying increasingly on clusters with thousands of processors. At such large counts of compute nodes, faults are becoming commonplace. Current t...
Arun Babu Nagarajan, Frank Mueller, Christian Enge...
The primary-backup replication model is one of the commonly adopted approaches to providing fault-tolerant data services. Its extension to the real-time environment, however, impo...
This paper presents the design and implementation of the Distributed Autonomous Replication Management (DARM) framework built on top of the Spread group communication system. The ...
Generally, applications employing Database Management Systems (DBMS) require that the integrity of the data stored in the database be preserved during normal operation as well as ...
Portable systems such as cell phones and portable media players commonly use non-volatile RAM (NVRAM) to hold all of their data and metadata, and larger systems can store metadata...