In this paper, we present a new online failure forecast system to achieve predictive failure management for fault-tolerant data stream processing. Different from previous reactive ...
Xiaohui Gu, Spiros Papadimitriou, Philip S. Yu, Sh...
A new fault-tolerant architecture that provides tolerance to a broad scope of hardware, software, and communications faults is being developed. This architecture relies on widely ...
A new hardware developmental model, motivated by embryonic development and a honeycomb structure, is presented; it shows strong, robust tolerance to transient faults. Ca...
Software testing and software fault tolerance are two major techniques for developing reliable software systems, yet limited empirical data are available in the literature to eval...
Michael R. Lyu, Zubin Huang, Sam K. S. Sze, Xia Ca...
We initiate the study of error confinement in distributed applications, where the goal is that only nodes that were directly hit by a fault may deviate from their correct external...