Sciweavers

615 search results - page 4 / 123
» An Architecture for Supporting Network Fault Recovery Manage...
SRDS
1999
IEEE
13 years 11 months ago
Fault-Tolerant Replication Management in Large-Scale Distributed Storage Systems
Failures of all forms happen: from losing single network packets to site-wide disasters. Since businesses rely heavily on their data, it is imperative that failures require minima...
Richard A. Golding, Elizabeth Borowsky
DAC
2011
ACM
12 years 7 months ago
DRAIN: distributed recovery architecture for inaccessible nodes in multi-core chips
As transistor dimensions continue to scale deep into the nanometer regime, silicon reliability is becoming a chief concern. At the same time, transistor counts are scaling up, ena...
Andrew DeOrio, Konstantinos Aisopos, Valeria Berta...
WWW
2005
ACM
14 years 8 months ago
Advanced fault analysis in web service composition
Currently, fault management in Web Services orchestrating multiple suppliers relies on a local analysis that does not span across individual services, thus limiting the effective...
Anna Goy, Claudia Picardi, Daniele Theseider Dupr&...
ICDCS
1999
IEEE
13 years 11 months ago
HiFi: A New Monitoring Architecture for Distributed Systems Management
With the increasing complexity of large-scale distributed (LSD) systems, an efficient monitoring mechanism has become an essential service for improving the performance and reliab...
Ehab S. Al-Shaer, Hussein M. Abdel-Wahab, Kurt Mal...
MMNS
2001
13 years 8 months ago
A Framework for Supporting Intelligent Fault and Performance Management for Communication Networks
In this paper, we present a framework for supporting intelligent fault and performance management for communication networks. Belief networks are taken as the basis for k...
Hongjun Li, John S. Baras