Sciweavers

89 search results - page 7 / 18
» The overhead of consensus failure recovery
ICS
2011
Tsinghua U.
12 years 11 months ago
High performance linpack benchmark: a fault tolerant implementation without checkpointing
The probability that a failure will occur before the end of the computation increases as the number of processors used in a high performance computing application increases. For l...
Teresa Davies, Christer Karlsson, Hui Liu, Chong D...
IPPS
2007
IEEE
14 years 1 month ago
DejaVu: Transparent User-Level Checkpointing, Migration, and Recovery for Distributed Systems
In this paper, we present a new fault tolerance system called DejaVu for transparent and automatic checkpointing, migration, and recovery of parallel and distributed applications....
Joseph F. Ruscio, Michael A. Heffner, Srinidhi Var...
CN
2007
13 years 7 months ago
Persistent detection and recovery of state inconsistencies
Soft-state is a well-established approach to designing robust network protocols and applications. However, it is unclear how to apply the soft-state approach to protocols that must mai...
Lan Wang, Daniel Massey, Lixia Zhang
MICRO
2010
IEEE
13 years 5 months ago
SAFER: Stuck-At-Fault Error Recovery for Memories
As technology scaling poses a threat to DRAM scaling due to physical limitations such as limited charge, alternative memory technologies including several emerging non-volatile me...
Nak Hee Seong, Dong Hyuk Woo, Vijayalakshmi Sriniv...
ICDCS
2008
IEEE
14 years 2 months ago
Can We Really Recover Data if Storage Subsystem Fails?
This paper presents a theoretical and experimental study on the limitations of copy-on-write snapshots and incremental backups in terms of data recoverability. We provide mathemat...
Weijun Xiao, Qing Yang