A major challenge facing grid applications is the appropriate handling of failures. In this paper we address the problem of making parallel Java applications based on Remote Method...
As software Distributed Shared Memory(DSM) systems become attractive on larger clusters, the focus of attention moves toward improving the reliability of systems. In this paper, w...
An efficient on-chip infrastructure for memory test and repair is crucial to enhance yield and availability of SoCs. Therefore embedded memories are commonly equipped with spare r...
We present a collaborative, self-configuring high availability (HA) approach for stream processing that enables low-latency failure recovery while incurring small run-time overhea...
This paper presents a theoretical and experimental study on the limitations of copy-on-write snapshots and incremental backups in terms of data recoverability. We provide mathemat...