Sciweavers

ICAC
2007
IEEE

Towards Autonomic Fault Recovery in System-S

System-S is a stream-processing infrastructure that enables program fragments to be distributed and connected to form complex applications. There may be tens of thousands of interdependent and heterogeneous program fragments running across thousands of nodes. While the scale and interconnection imply the need for automation to manage the program fragments, the need is intensified because the applications operate on live streaming data and thus need to be highly available. System-S has been designed with components that autonomically manage the program fragments, but the system components themselves are also susceptible to failures that can jeopardize the system and its applications. The work we present addresses the self-healing nature of these management components in System-S. In particular, we show how one key component of System-S, the job management orchestrator, can be abruptly terminated and then recover without interrupting any of the running program fragments b...
Added 02 Jun 2010
Updated 02 Jun 2010
Type Conference
Year 2007
Where ICAC
Authors Gabriela Jacques-Silva, Jim Challenger, Lou Degenaro, James Giles, Rohit Wagle