Sciweavers

179 search results - page 9 / 36
» A Fault Detection Service for Wide Area Distributed Computat...
IPPS
2010
IEEE
13 years 6 months ago
A general algorithm for detecting faults under the comparison diagnosis model
We develop a widely applicable algorithm to solve the fault diagnosis problem in certain distributed-memory multiprocessor systems in which there are a limited number of faulty pr...
Iain A. Stewart
VEE
2012
ACM
12 years 4 months ago
SecondSite: disaster tolerance as a service
This paper describes the design and implementation of SecondSite, a cloud-based service for disaster tolerance. SecondSite extends the Remus virtualization-based high availability...
Shriram Rajagopalan, Brendan Cully, Ryan O'Connor,...
WWW
2003
ACM
14 years 2 months ago
WS-Membership - Failure Management in a Web-Services World
An important factor in the successful deployment of federated web-services-based business activities will be the ability to guarantee reliable distributed operation and execution....
Werner Vogels, Christopher Ré
GPC
2007
Springer
14 years 3 months ago
Fault Management in P2P-MPI
We present in this paper the recent developments done in P2P-MPI, a grid middleware, concerning the fault management, which covers fault-tolerance for applications and fault detect...
Stéphane Genaud, Choopan Rattanapoka
KDD
2004
ACM
14 years 9 months ago
Eigenspace-based anomaly detection in computer systems
We report on an automated runtime anomaly detection method at the application layer of multi-node computer systems. Although several network management systems are available in th...
Hisashi Kashima, Tsuyoshi Idé