This paper discusses the unsupervised learning problem. An important part of the unsupervised learning problem is determining the number of constituent groups (components or classes) that best describes some data. We apply the Minimum Message Length (MML) criterion to the unsupervised learning problem, modifying an earlier application of MML to this problem. We give an empirical comparison of criteria prominent in the literature for estimating the number of components in a data set. We conclude that the Minimum Message Length criterion performs better than the alternatives on the data considered here for unsupervised learning tasks.
Jonathan J. Oliver, Rohan A. Baxter, Chris S. Wall