Abstract. The increasing scale, complexity, heterogeneity and dynamism of networks, systems and applications have made our computational and information infrastructure brittle, unma...
Key issues to address in autonomic job recovery for cluster computing are recognizing job failure; understanding the failure sufficiently to know if and how to restart the job; an...
Charles Earl, Emilio Remolina, Jim Ong, John Brown
Multiple highly autonomous satellite systems are envisioned in the near future because they are capable of higher performance, lower cost, better fault tolerance, reconfigurabil...
Thomas P. Schetter, Mark E. Campbell, Derek M. Sur...
We describe an agent-based situation-aware survivable architecture for the discovery and composition of web services. Our architecture provides for proofs that guarantee the consis...
A distributed software system's deployment architecture can have a significant impact on the system's dependability. Dependability is a function of various system paramet...
Sam Malek, Nels Beckman, Marija Mikic-Rakic, Nenad...