Abstract Self-healing, i.e. the capability of a system to autonomously detect failures and recover from them, is a very attractive property that may enable large-scale software systems, aimed at delivering services on a 24/7 fashion, to meet their goals with little or no human intervention. Achieving self-healing requires the elicitation and maintenance of domain knowledge in the form of service failure diagnosis, repair plan patterns, a task which can be overwhelming. CaseBased Reasoning (CBR) is a lazy learning paradigm that largely reduces this kind of knowledge acquisition bottleneck. Moreover, the application of CBR for failure diagnosis and remediation in software systems appears to be very suitable, as in this domain most errors are re-occurrences of known problems. In this paper, we describe a CBR approach for providing large-scale, distributed software systems with self-healing capabilities, and demonstrate the practical applicability of our methodology by means of some experi...