We report on three distinct experiments that provide valuable new insights into learning algorithms and datasets. We first describe two effective metafeatures that significantly impact the predictive accuracy of a broad range of learning algorithms. We then introduce a new, efficient metafeature that measures the degree of hardness (or difficulty) of datasets and show that it is strongly linearly correlated with predictive accuracy. Finally, we use the notion of classifier output difference to cluster learning algorithms and show that learning algorithms from different model classes may exhibit highly similar behaviors.
Jun Won Lee, Christophe G. Giraud-Carrier
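
To make the third experiment concrete, the following is a minimal sketch of how a classifier output difference might be computed for each pair of learning algorithms. It assumes the common definition of classifier output difference as the fraction of instances on which two classifiers predict different labels; the abstract itself does not state the definition. The function names, algorithm names, and prediction vectors are hypothetical and serve only as illustration.

```python
from itertools import combinations


def classifier_output_difference(preds_a, preds_b):
    """Fraction of instances on which two classifiers disagree.

    Assumed definition; the abstract does not spell it out.
    """
    if len(preds_a) != len(preds_b):
        raise ValueError("prediction lists must have the same length")
    if not preds_a:
        raise ValueError("prediction lists must be non-empty")
    disagreements = sum(a != b for a, b in zip(preds_a, preds_b))
    return disagreements / len(preds_a)


def pairwise_cod(predictions):
    """Map each unordered pair of algorithm names to its output difference."""
    return {
        (m, n): classifier_output_difference(predictions[m], predictions[n])
        for m, n in combinations(sorted(predictions), 2)
    }


# Hypothetical predictions of three algorithms on the same eight instances.
predictions = {
    "decision_tree": [0, 1, 1, 0, 1, 0, 0, 1],
    "naive_bayes":   [0, 1, 1, 0, 1, 0, 1, 1],
    "knn":           [1, 0, 1, 1, 0, 0, 0, 1],
}

distances = pairwise_cod(predictions)
closest_pair = min(distances, key=distances.get)
print(distances)
print("most similar pair:", closest_pair)
```

Under these assumptions, the pairwise values can be treated as distances and passed to any standard clustering procedure, such as agglomerative hierarchical clustering. Note that the measure compares predictions with each other rather than with the true labels, so two algorithms of equal accuracy may still be far apart.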