failure detection | Sciweavers

VEE
2012
ACM

215 views, Virtualization

SecondSite: disaster tolerance as a service

14 years 1 month ago

This paper describes the design and implementation of SecondSite, a cloud-based service for disaster tolerance. SecondSite extends the Remus virtualization-based high availability...

Shriram Rajagopalan, Brendan Cully, Ryan O'Connor,...

CLUSTER
1999
IEEE

128 views, Distributed And Parallel Com...

Simulative performance analysis of gossip failure detection for scalable distributed systems

15 years 5 months ago

Download www.hcs.ufl.edu

Three protocols for gossip-based failure detection services in large-scale heterogeneous clusters are analyzed and compared. The basic gossip protocol provides a means by which fai...

Mark W. Burns, Alan D. George, Bradley A. Wallace

CACM
1999

92 views

Putting OO Distributed Programming to Work

15 years 5 months ago

Download www.engr.sjsu.edu

...abstractions underlying distributed computing. We attempted to keep our claims at an abstract and general level. In this column, we make those claims more concrete. More precisely, ...

Pascal Felber, Rachid Guerraoui, Mohamed Fayad

SOQUA
2007

115 views, Software Engineering

An approach to detecting failures automatically

15 years 7 months ago

Download www.inf.usi.ch

Failure detection is a difficult and often expensive task. The principle of self-healing addresses this cost issue, but poses new research questions. This work focuses on detectin...

Jochen Wuttke

DSN
2004
IEEE

148 views, Computer Networks

Cluster-Based Failure Detection Service for Large-Scale Ad Hoc Wireless Network Applications

15 years 9 months ago

Download www.ia-tech.com

The growing interest in ad hoc wireless network applications that are made of large and dense populations of lightweight system resources calls for scalable approaches to fault to...

Ann T. Tai, Kam S. Tso, William H. Sanders

WETICE
1999
IEEE

123 views, Emerging Technology

A Hierarchical Proxy Architecture for Internet-Scale Event Services

15 years 10 months ago

Download www.ifs.uni-linz.ac.at

The rapid growth of the Web has made it possible to build collaborative applications on an unprecedented scale. However, the request-reply interaction model of HTTP limits the rang...

Haobo Yu, Deborah Estrin, Ramesh Govindan

DSN
2003
IEEE

116 views, Computer Networks

Node Failure Detection and Membership in CANELy

15 years 11 months ago

Download www.navigators.di.fc.ul.pt

Fault-tolerant distributed systems based on fieldbuses may benefit to a great extent from the availability of semantically rich communication services, such as those provided by g...

José Rufino, Paulo Veríssimo, Guilhe...

KDD
2005
ACM

178 views, Data Mining

Failure detection and localization in component based systems by online tracking

15 years 11 months ago

Download www.nec-labs.com

The increasing complexity of today’s systems makes fast and accurate failure detection essential for their use in mission-critical applications. Various monitoring methods provi...

Haifeng Chen, Guofei Jiang, Cristian Ungureanu, Ke...

SSS
2007
Springer

130 views, Control Systems

Secure Failure Detection in TrustedPals

16 years 4 days ago

Download www.sc.ehu.es

We present a modular redesign of TrustedPals, a smartcard-based security framework for solving secure multiparty computation (SMC). TrustedPals allows to reduce SMC to the probl...

Roberto Cortiñas, Felix C. Freiling, Marjan...

GPC
2007
Springer

146 views, Distributed And Parallel Com...

Fault Management in P2P-MPI

16 years 5 days ago

Download icps.u-strasbg.fr

We present in this paper the recent developments done in P2P-MPI, a grid middleware, concerning the fault management, which covers fault-tolerance for applications and fault detect...

Stéphane Genaud, Choopan Rattanapoka
