Sciweavers

45 search results - page 4 / 9
» Supporting Reduced Location Management Overhead and Fault To...
ICS
2007
Tsinghua U.
14 years 1 month ago
Proactive fault tolerance for HPC with Xen virtualization
Large-scale parallel computing increasingly relies on clusters with thousands of processors. At such large counts of compute nodes, faults are becoming commonplace. Current t...
Arun Babu Nagarajan, Frank Mueller, Christian Enge...
SRDS
1998
IEEE
13 years 11 months ago
Optimization of a Real-Time Primary-Backup Replication Service
The primary-backup replication model is one of the commonly adopted approaches to providing fault-tolerant data services. Its extension to the real-time environment, however, impo...
Hengming Zou, Farnam Jahanian
EDCC
2008
Springer
13 years 9 months ago
A Distributed Approach to Autonomous Fault Treatment in Spread
This paper presents the design and implementation of the Distributed Autonomous Replication Management (DARM) framework built on top of the Spread group communication system. The ...
Hein Meling, Joakim L. Gilje
PRDC
1999
IEEE
13 years 11 months ago
Cost of Ensuring Safety in Distributed Database Management Systems
Generally, applications employing Database Management Systems (DBMS) require that the integrity of the data stored in the database be preserved during normal operation as well as ...
Maitrayi Sabaratnam, Svein-Olaf Hvasshovd, Ø...
EMSOFT
2006
Springer
13 years 11 months ago
Reliability mechanisms for file systems using non-volatile memory as a metadata store
Portable systems such as cell phones and portable media players commonly use non-volatile RAM (NVRAM) to hold all of their data and metadata, and larger systems can store metadata...
Kevin M. Greenan, Ethan L. Miller