We propose a feature selection method that constructs each new feature by analyzing tight clusters of classification errors. It is a greedy, time-efficient forward-selection algorithm that iteratively constructs one feature at a time until a desired error rate is reached. The algorithm finds error clusters in the current feature space; a tight error cluster indicates that the current features are unable to discriminate those samples. It then projects one tight cluster into the null space of the current feature mapping, where a new feature that helps classify these errors can be discovered. The approach is strongly data-driven and restricted to linear features, but otherwise general. Large-scale experiments show that it can achieve a monotonically decreasing error rate on the feature-discovery set, and a generally decreasing error rate on a distinct test set.
Henry S. Baird, Sui-Yu Wang
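Below is a minimal sketch of the loop the abstract describes, under stated assumptions that are not from the paper: binary labels, a logistic-regression classifier on the current features, k-means to locate the tightest error cluster, and a simple mean-difference direction standing in for whatever feature-discovery step the authors actually use in the null space. All function names (e.g., `tightest_error_cluster`) and the seed feature `w0` are illustrative, not the authors' code.

```python
# Sketch of greedy forward feature selection focused on tight error clusters.
# Assumptions (not from the paper): binary labels y in {0, 1}, logistic
# regression as the classifier, k-means error clustering, and a
# mean-difference direction as the newly discovered linear feature.
import numpy as np
from scipy.linalg import null_space
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

def tightest_error_cluster(X_err, n_clusters=5):
    """Cluster the misclassified samples and return a boolean mask for
    the cluster with the smallest mean distance to its centroid."""
    k = min(n_clusters, len(X_err))
    km = KMeans(n_clusters=k, n_init=10).fit(X_err)
    best_mask, best_spread = None, np.inf
    for c in range(k):
        members = X_err[km.labels_ == c]
        if len(members) < 2:
            continue
        spread = np.mean(np.linalg.norm(members - members.mean(0), axis=1))
        if spread < best_spread:
            best_mask, best_spread = km.labels_ == c, spread
    return best_mask

def new_feature_in_null_space(W, X_c, y_c):
    """Project a tight error cluster into the null space of the current
    feature map W (columns = features) and return a unit direction that
    separates the cluster's two classes there."""
    N = null_space(W.T)          # basis for directions the features ignore
    Z = X_c @ N                  # cluster coordinates in the null space
    d = Z[y_c == 1].mean(0) - Z[y_c == 0].mean(0)
    w = N @ d                    # lift the direction back to input space
    return w / np.linalg.norm(w)

def greedy_feature_selection(X, y, w0, target_error=0.05, max_feats=20):
    W = w0.reshape(-1, 1)        # start from one seed feature direction
    while W.shape[1] < max_feats:
        clf = LogisticRegression().fit(X @ W, y)
        wrong = clf.predict(X @ W) != y
        if wrong.mean() <= target_error or W.shape[1] >= X.shape[1]:
            break                # desired error rate reached, or no null space
        mask = tightest_error_cluster(X[wrong])
        if mask is None:
            break
        X_c, y_c = X[wrong][mask], y[wrong][mask]
        if len(np.unique(y_c)) < 2:
            break                # need both classes present to separate them
        w_new = new_feature_in_null_space(W, X_c, y_c)
        W = np.hstack([W, w_new.reshape(-1, 1)])
    return W
```

Because each new feature lies in the null space of the current mapping, it adds information the existing features cannot express; this is what drives the error rate down on the discovery set, though this simplified sketch does not by itself guarantee the monotonic decrease the experiments report.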