Sciweavers

197 search results - page 20 / 40
» A Framework for Proactive Fault Tolerance
Sort
View
CCGRID
2006
IEEE
13 years 11 months ago
IPMI-based Efficient Notification Framework for Large Scale Cluster Computing
The demand for an efficient fault tolerance system has led to the development of complex monitoring infrastructure, which in turn has created an overwhelming task of data and even...
Chokchai Leangsuksun, Tirumala Rao, Anand Tikoteka...
PVLDB
2008
103views more  PVLDB 2008»
13 years 7 months ago
A request-routing framework for SOA-based enterprise computing
Enterprises may use a service-oriented architecture (SOA) to provide a streamlined interface to their business processes. To scale up the system, each tier in a composite service ...
Thomas Phan, Wen-Syan Li
ISCA
2010
IEEE
170views Hardware» more  ISCA 2010»
14 years 24 days ago
Relax: an architectural framework for software recovery of hardware faults
As technology scales ever further, device unreliability is creating excessive complexity for hardware to maintain the illusion of perfect operation. In this paper, we consider whe...
Marc de Kruijf, Shuou Nomura, Karthikeyan Sankaral...
VLSID
2004
IEEE
117views VLSI» more  VLSID 2004»
14 years 8 months ago
Evaluating the Reliability of Defect-Tolerant Architectures for Nanotechnology with Probabilistic Model Checking
As we move from deep submicron technology to nanotechnology for device manufacture, the need for defect-tolerant architectures is gaining importance. This is because, at the nanos...
Gethin Norman, David Parker, Marta Z. Kwiatkowska,...
PVM
2010
Springer
13 years 6 months ago
Dodging the Cost of Unavoidable Memory Copies in Message Logging Protocols
Abstract. With the number of computing elements spiraling to hundred of thousands in modern HPC systems, failures are common events. Few applications are nevertheless fault toleran...
George Bosilca, Aurelien Bouteiller, Thomas H&eacu...