Sciweavers

483 search results - page 38 / 97
» Fault Management in P2P-MPI
ICPP
2008
IEEE
14 years 3 months ago
Dynamic Meta-Learning for Failure Prediction in Large-Scale Systems: A Case Study
Despite great efforts on the design of ultra-reliable components, the increase of system size and complexity has outpaced the improvement of component reliability. As a result, fa...
Jiexing Gu, Ziming Zheng, Zhiling Lan, John White,...
PRDC
2007
IEEE
14 years 2 months ago
Implementation of Highly Available OSPF Router on ATCA
This paper proposes a Highly-Available Open Shortest Path First (HA-OSPF) router which consists of two OSPF router modules, active and standby, to support a high-availability network...
Chia-Tai Tsai, Rong-Hong Jan, Chien Chen, Chia-Yua...
ISAS
2007
Springer
14 years 2 months ago
MDDPro: Model-Driven Dependability Provisioning in Enterprise Distributed Real-Time and Embedded Systems
Service oriented architecture (SOA) design principles are increasingly being adopted to develop distributed real-time and embedded (DRE) systems, such as avionics mission computin...
Sumant Tambe, Jaiganesh Balasubramanian, Aniruddha...
COMSUR
2011
12 years 8 months ago
Optical Layer Monitoring Schemes for Fast Link Failure Localization in All-Optical Networks
Optical layer monitoring and fault localization serves as a critical functional module in the control and management of optical networks. An efficient monitoring scheme aims at ...
Bin Wu, Pin-Han Ho, Kwan Lawrence Yeung, Já...
ICCS
2007
Springer
14 years 2 months ago
Providing Fault-Tolerance in Unreliable Grid Systems Through Adaptive Checkpointing and Replication
As grids typically consist of autonomously managed subsystems with strongly varying resources, fault-tolerance forms an important aspect of the scheduling process of appl...
Maria Chtepen, Filip H. A. Claeys, Bart Dhoedt, Fi...