Hardware failures in autonomous and distributed software systems create the need for self-healing activities. This work addresses the problem of redeploying software components af...
In this paper, we argue that the reliability of large-scale storage systems can be significantly improved by using better reliability metrics and more efficient policies for rec...