The paper describes a method for automatically extracting informative feature hierarchies for object classification, and shows that hierarchically constructed features have an advantage over previous methods. Extraction proceeds top-down: informative top-level fragments are extracted first, and repeated application of the same feature extraction process then breaks each classification fragment down successively into its own optimal components. The hierarchical decomposition terminates at atomic features that cannot be usefully decomposed into simpler features. The entire hierarchy, including the features, their sub-features, and their optimal parameters, is learned from training examples during a training phase. Experimental comparisons show that these feature hierarchies are significantly more informative and give better classification than similar non-hierarchical features, and than previous methods that use feature hierarchies.
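The top-down decomposition described above can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes that informativeness is measured as the mutual information between a feature's binary detections and the class label, and that a feature is atomic when none of its candidate sub-features reaches a threshold. The names `build_hierarchy`, `min_info`, and `top_k`, the candidate table, and the toy detection data are all hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information in bits between two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    # Sum of p(x, y) * log2(p(x, y) / (p(x) * p(y))) over observed pairs.
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())


def build_hierarchy(name, detections, labels, candidates, min_info=0.05, top_k=2):
    """Recursively decompose `name` into its most informative sub-features.

    detections: feature name -> list of 0/1 detections, one per training image
    candidates: feature name -> candidate sub-feature names (assumed acyclic)
    min_info, top_k: hypothetical selection parameters, not from the paper
    """
    scored = [(mutual_information(detections[c], labels), c)
              for c in candidates.get(name, [])]
    # Keep the top_k candidates whose information reaches the threshold.
    kept = [c for info, c in sorted(scored, reverse=True)[:top_k]
            if info >= min_info]
    # A feature with no kept sub-features is atomic and ends the recursion.
    return {"feature": name,
            "children": [build_hierarchy(c, detections, labels, candidates,
                                         min_info, top_k) for c in kept]}


# Toy training set: 4 class images followed by 4 non-class images.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
detections = {
    "face":  [1, 1, 1, 0, 0, 0, 0, 0],
    "eye":   [1, 1, 1, 1, 0, 0, 0, 1],
    "mouth": [1, 0, 1, 1, 0, 1, 0, 0],
    "pupil": [1, 1, 0, 1, 0, 0, 0, 0],
    "noise": [1, 0, 1, 0, 1, 0, 1, 0],  # independent of the class label
}
candidates = {"face": ["eye", "mouth", "noise"], "eye": ["pupil", "noise"]}

hierarchy = build_hierarchy("face", detections, labels, candidates)
print(hierarchy)
```

In this toy example the uninformative `noise` fragment is never selected, and `pupil` and `mouth` end up as atomic features because no informative sub-feature is available for them. In the method itself, the candidate sub-features and their parameters would be learned from the training images rather than supplied in a table.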