Incorporating feature selection into a classification or regression method often carries a number of advantages. In this paper we formalize feature selection specifically from a discriminative perspective of improving classification/regression accuracy. The feature selection method is developed as an extension to the recently proposed maximum entropy discrimination (MED) framework. We describe MED as a flexible (Bayesian) regularization approach that subsumes, e.g., support vector classification, regression and exponential family models. For brevity, we restrict ourselves primarily to feature selection in the context of linear classification/regression methods and demonstrate that the proposed approach indeed carries substantial improvements in practice. Moreover, we discuss and develop various extensions of feature selection, including the problem of dealing with example-specific but unobserved degrees of freedom (alignments or invariants).