Enterprise systems must guarantee high availability and reliability to provide 24/7 services without interruptions and failures. Mechanisms for handling exceptional cases and impl...
We propose a new paradigm for software availability enhancement. We offer a two-step strategy: Failure prediction followed by maintenance actions with the objective of avoiding imp...
Studies of the fault-tolerance of graphs have tended to largely concentrate on classical graph connectivity. This measure is very basic, and conveys very little information for des...
Vijay Lakamraju, Zahava Koren, Israel Koren, C. Ma...
As grid computation systems become larger and more complex, manually diagnosing failures in jobs becomes impractical. Recently, machine-learning techniques have been proposed to d...