Sciweavers

535 search results - page 31 / 107
» Fault tolerant high performance computing by a coding approa...
PDP
2008
IEEE
14 years 2 months ago
System-Level Virtualization for High Performance Computing
System-level virtualization has been a research topic since the 70’s but regained popularity during the past few years because of the availability of efficient solutions such as...
Geoffroy Vallée, Thomas Naughton, Christian...
ICDCS
2008
IEEE
14 years 2 months ago
stdchk: A Checkpoint Storage System for Desktop Grid Computing
Checkpointing is an indispensable technique to provide fault tolerance for long-running high-throughput applications like those running on desktop grids. This paper argues that...
Samer Al-Kiswany, Matei Ripeanu, Sudharshan S. Vaz...
ET
2008
13 years 7 months ago
Hardware and Software Transparency in the Protection of Programs Against SEUs and SETs
Processor cores embedded in systems-on-a-chip (SoCs) are often deployed in critical computations, and when affected by faults they may produce dramatic effects. When hardware harde...
Eduardo Luis Rhod, Carlos Arthur Lang Lisbôa...
DSN
2003
IEEE
14 years 28 days ago
Compiler-Directed Program-Fault Coverage for Highly Available Java Internet Services
We present a new approach that uses compiler-directed fault-injection for coverage testing of recovery code in Internet services to evaluate their robustness to operating ...
Chen Fu, Richard P. Martin, Kiran Nagaraja, Thu D....
ICC
2007
IEEE
14 years 2 months ago
Scalable Fault Diagnosis in IP Networks using Graphical Models: A Variational Inference Approach
In this paper we investigate the fault diagnosis problem in IP networks. We provide a lower bound on the average number of probes per edge using variational inference technique pro...
Rajesh Narasimha, Souvik Dihidar, Chuanyi Ji, Stev...