We analyse 18 evaluation methods for learning algorithms and classifiers, and show how to categorise these methods using an evaluation method taxonomy based on several criteria. We also define a formal framework that makes it possible to describe all methods using the same terminology, and apply it in a review of the state of the art in learning algorithm and classifier evaluation. The framework enables comparison and deeper understanding of evaluation methods from different fields of research. Moreover, we argue that the framework and taxonomy support the process of finding candidate evaluation methods for a particular problem.