Sciweavers

535 search results - page 12 / 107
» Fault tolerant high performance computing by a coding approa...
HASE
1997
IEEE
13 years 11 months ago
High-Coverage Fault Tolerance in Real-Time Systems Based on Point-to-Point Communication
The distributed recovery block (DRB) scheme is a widely applicable approach for realizing both hardware and software fault tolerance in real-time distributed and parallel compute...
K. H. Kim, Chittur Subbaraman, Eltefaat Shokri
RTSS
1989
IEEE
13 years 11 months ago
A Distributed Fault Tolerant Architecture for Nuclear Reactor Control and Safety Functions
A new fault tolerant architecture that provides tolerance to a broad scope of hardware, software, and communications faults is being developed. This architecture relies on widely ...
Myron Hecht, J. Agron, S. Hochhauser
TSE
1998
13 years 7 months ago
Xception: A Technique for the Experimental Evaluation of Dependability in Modern Computers
An important step in the development of dependable systems is the validation of their fault tolerance properties. Fault injection has been widely used for this purpose, however wi...
João Carreira, Henrique Madeira, João Gabri...
CLUSTER
2004
IEEE
13 years 7 months ago
MPI/FT: A Model-Based Approach to Low-Overhead Fault Tolerant Message-Passing Middleware
Fault tolerance in parallel systems has traditionally been achieved through a combination of redundancy and checkpointing methods. This notion has also been extended to message-pas...
Rajanikanth Batchu, Yoginder S. Dandass, Anthony S...
PRDC
2008
IEEE
14 years 2 months ago
Conjoined Pipeline: Enhancing Hardware Reliability and Performance through Organized Pipeline Redundancy
Reliability has become a serious concern as systems embrace nanometer technologies. In this paper, we propose a novel approach for organizing redundancy that provides high degree ...
Viswanathan Subramanian, Arun K. Somani