SecondSite: disaster tolerance as a service



This paper describes the design and implementation of SecondSite, a cloud-based service for disaster tolerance. SecondSite extends the Remus virtualization-based high availability system by allowing groups of virtual machines to be replicated across data centers over wide-area Internet links. The goal of the system is to commodify the property of availability, exposing it as a simple tick box when configuring a new virtual machine. To achieve this in the wide area, we have had to tackle the related issues of replication traffic bandwidth, reliable failure detection across geographic regions and traffic redirection over a wide-area network without compromising on transparency and consistency.

Categories and Subject Descriptors: D.4.5 [Operating Systems]: Reliability—Backup procedures, Checkpoint/restart, Fault-tolerance

Keywords: Wide Area Replication, Disaster Recovery

Shriram Rajagopalan, Brendan Cully, Ryan O'Connor, Andrew Warfield


Disaster Tolerance | Failure Detection | Fault Tolerance | VEE 2012 | Virtualization


Post Info

Added	25 Apr 2012
Updated	25 Apr 2012
Type	Journal
Year	2012
Where	VEE
Authors	Shriram Rajagopalan, Brendan Cully, Ryan O'Connor, Andrew Warfield
