Intrusion detection for computer systems is a key problem in today’s networked society. Current distributed intrusion detection systems (IDSs) are not fully distributed as most o...
Component failure in large-scale IT installations is becoming an ever larger problem as the number of components in a single cluster approaches a million. In this paper, we presen...
—P2P storage systems use replication to provide a certain level of availability. While the system must generate new replicas to replace replicas lost to permanent failures, it ca...
In this paper, we argue that there is a need for an event-based middleware to build large-scale distributed systems. Existing publish/subscribe systems still have limitations comp...
—Coordinated checkpointing simplifies failure recovery and eliminates domino effects in case of failures by preserving a consistent global checkpoint on stable storage. However, ...