Sciweavers

IWDC
2004
Springer
14 years 29 days ago
Agent-Based Distributed Intrusion Alert System
Intrusion detection for computer systems is a key problem in today’s networked society. Current distributed intrusion detection systems (IDSs) are not fully distributed as most o...
Arjita Ghosh, Sandip Sen
FAST
2007
13 years 9 months ago
Disk Failures in the Real World: What Does an MTTF of 1,000,000 Hours Mean to You?
Component failure in large-scale IT installations is becoming an ever larger problem as the number of components in a single cluster approaches a million. In this paper, we presen...
Bianca Schroeder, Garth A. Gibson
ICPP
2009
IEEE
14 years 2 months ago
Exploring the Cost-Availability Tradeoff in P2P Storage Systems
P2P storage systems use replication to provide a certain level of availability. While the system must generate new replicas to replace replicas lost to permanent failures, it ca...
Zhi Yang, Yafei Dai, Zhen Xiao
ICDCSW
2002
IEEE
14 years 16 days ago
Hermes: A Distributed Event-Based Middleware Architecture
In this paper, we argue that there is a need for an event-based middleware to build large-scale distributed systems. Existing publish/subscribe systems still have limitations comp...
Peter R. Pietzuch, Jean Bacon
TPDS
1998
13 years 7 months ago
On Coordinated Checkpointing in Distributed Systems
Coordinated checkpointing simplifies failure recovery and eliminates domino effects in case of failures by preserving a consistent global checkpoint on stable storage. However, ...
Guohong Cao, Mukesh Singhal