Bayesian approaches to supervised learning use priors on the classifier parameters. However, few priors aim at achieving "sparse" classifiers, where irrelevant/redundant parameters are automatically set to zero. Two well-known approaches for obtaining sparse classifiers are the use of a zero-mean Laplacian prior on the parameters and the "support vector machine" (SVM). Whether one uses a Laplacian prior or an SVM, one still needs to specify/estimate the parameters that control the degree of sparseness of the resulting classifiers. We propose a Bayesian approach to learning sparse classifiers which does not involve any parameters controlling the degree of sparseness. This is achieved by a hierarchical-Bayes interpretation of the Laplacian prior, followed by the adoption of a Jeffreys' non-informative hyper-prior. Implementation is carried out by an EM algorithm. Experimental evaluation of the proposed method shows that it performs competitively with (often better than) the ...
Anil K. Jain, Mário A. T. Figueiredo
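The abstract does not spell out the EM iteration, but its structure follows from the hierarchical view it describes: each parameter gets a Gaussian prior w_i | tau_i ~ N(0, tau_i), the variance tau_i gets the Jeffreys hyper-prior p(tau_i) ∝ 1/tau_i, and the E-step then yields E[1/tau_i | w_i] = 1/w_i^2, making the M-step a reweighted ridge-type solve. As a rough illustration only, here is a minimal sketch for the linear-regression analogue of the model (Gaussian likelihood in place of the probit classifier); the noise-level guess, initialization, and stopping rule are assumptions, not the authors' implementation.

```python
import numpy as np

def sparse_em_regression(H, y, sigma2=None, n_iter=50, tol=1e-8):
    """EM sketch for sparse linear regression with a Jeffreys hyper-prior.

    Assumed hierarchical model (regression analogue of the abstract):
        y = H w + noise,  noise ~ N(0, sigma2 * I)
        w_i | tau_i ~ N(0, tau_i),   p(tau_i) proportional to 1/tau_i
    E-step: E[1/tau_i | w_i] = 1 / w_i**2.
    M-step: w = (sigma2 * U + H^T H)^{-1} H^T y with U = diag(1/w_i**2),
    rewritten via V = diag(|w_i|) so components driven to zero stay at
    zero and no division by zero occurs.
    """
    n, d = H.shape
    if sigma2 is None:
        sigma2 = 0.1 * np.var(y)  # crude noise-level guess (assumption)
    # least-squares initialization (assumption)
    w = np.linalg.lstsq(H, y, rcond=None)[0]
    for _ in range(n_iter):
        V = np.diag(np.abs(w))                 # V = U^{-1/2}
        A = sigma2 * np.eye(d) + V @ H.T @ H @ V
        w_new = V @ np.linalg.solve(A, V @ (H.T @ y))
        if np.max(np.abs(w_new - w)) < tol:
            w = w_new
            break
        w = w_new
    return w
```

Note the design point this sketch is meant to show: because the update multiplies by V = diag(|w_i|) on both sides, a weight that reaches zero remains exactly zero at all later iterations, which is how sparseness emerges without any user-set sparseness parameter.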