Many clustering algorithms fail when dealing with high dimensional data. Principal component analysis (PCA) is a popular dimensionality reduction algorithm. However, it assumes a single multivariate Gaussian model, which provides a global linear projection of the data. Mixture of probabilistic principal component analyzers (PPCA) provides a better model to the clustering paradigm. It provides a local linear PCA projection for each multivariate Gaussian cluster component. We extend this model to build hierarchical mixtures of PPCA. Hierarchical clustering provides a flexible representation showing relationships among clusters in various perceptual levels. We introduce an automated hierarchical mixture of PPCA algorithm, which utilizes the integrated classification likelihood as a criterion for splitting and stopping the addition of hierarchical levels. An automated approach requires automated methods for initialization, determining the number of principal component dimensions, and dete...
Ting Su, Jennifer G. Dy