Sciweavers

104 search results - page 14 / 21
» A Framework for Node-Level Fault Tolerance in Distributed Re...
CODES
2007
IEEE
14 years 3 months ago
Scheduling and voltage scaling for energy/reliability trade-offs in fault-tolerant time-triggered embedded systems
In this paper we present an approach to the scheduling and voltage scaling of low-power fault-tolerant hard real-time applications mapped on distributed heterogeneous embedded sys...
Paul Pop, Kåre Harbo Poulsen, Viacheslav Izo...
TASE
2010
IEEE
13 years 3 months ago
Intelligent Component-Based Automation of Baggage Handling Systems With IEC 61499
Airport Baggage Handling is a field of automation systems that is currently dependent on centralised control systems and conventional automation programming techniques. In this and...
Geoff Black, Valeriy Vyatkin
CCGRID
2006
IEEE
14 years 2 months ago
Proposal of MPI Operation Level Checkpoint/Rollback and One Implementation
With the increasing number of processors in modern HPC (High Performance Computing) systems, there are two emergent problems to solve. One is scalability, the other is fault tolera...
Yuan Tang, Graham E. Fagg, Jack Dongarra
PVM
2010
Springer
13 years 7 months ago
Dodging the Cost of Unavoidable Memory Copies in Message Logging Protocols
With the number of computing elements spiraling to hundreds of thousands in modern HPC systems, failures are common events. Few applications are nevertheless fault toleran...
George Bosilca, Aurelien Bouteiller, Thomas Hé...
CCGRID
2006
IEEE
14 years 11 days ago
IPMI-based Efficient Notification Framework for Large Scale Cluster Computing
The demand for an efficient fault tolerance system has led to the development of complex monitoring infrastructure, which in turn has created an overwhelming task of data and even...
Chokchai Leangsuksun, Tirumala Rao, Anand Tikoteka...