We apply a variational method to automatically determine the number of mixtures of independent components in high-dimensional datasets whose sources may be nonsymmetrically distributed. The data are modeled by clusters, where each cluster is described as a linear mixture of independent factors. The variational Bayesian method yields an accurate density model for the observed data without overfitting, which allows the dimensionality of the data to be identified for each cluster. The new method was successfully applied to a difficult real-world medical dataset for diagnosing glaucoma.
Kwokleung Chan, Te-Won Lee, Terrence J. Sejnowski
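
The data model summarized in the abstract (clusters, each a linear mixture of independent, possibly nonsymmetric factors) can be illustrated with a minimal sampling sketch. This is not the variational Bayesian learning procedure itself; it only draws synthetic data from a model of that general shape. The function name `sample_ica_mixture`, the centered exponential source distribution, the Gaussian noise level, and all parameter values are assumptions chosen for illustration and do not come from the paper.

```python
import numpy as np

def sample_ica_mixture(n_samples, weights, mixing, means, noise_std=0.1, seed=0):
    """Draw samples from a hypothetical mixture of linear ICA clusters."""
    rng = np.random.default_rng(seed)
    weights = np.asarray(weights, dtype=float)
    # Pick a cluster label for every sample according to the cluster proportions.
    labels = rng.choice(len(weights), size=n_samples, p=weights)
    dim = means[0].shape[0]
    data = np.empty((n_samples, dim))
    for c, (A, mu) in enumerate(zip(mixing, means)):
        idx = np.flatnonzero(labels == c)
        # Independent, skewed (nonsymmetric) sources: centered exponential draws
        # (an assumed source density, used here only as an example).
        sources = rng.exponential(1.0, size=(idx.size, A.shape[1])) - 1.0
        noise = noise_std * rng.standard_normal((idx.size, dim))
        # Each cluster is a linear mixture of its own factors plus a mean and noise.
        data[idx] = sources @ A.T + mu + noise
    return data, labels

# Two clusters in a 5-dimensional space with 2 and 3 latent factors respectively.
rng = np.random.default_rng(1)
mixing = [rng.standard_normal((5, 2)), rng.standard_normal((5, 3))]
means = [np.zeros(5), 4.0 * np.ones(5)]
X, z = sample_ica_mixture(500, [0.4, 0.6], mixing, means)
```

In this sketch each cluster has its own number of latent factors (2 and 3), which is the per-cluster dimensionality that the abstract says the method identifies from data.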