Traditional latent class analysis (LCA) uses a mixture model in which the binary responses on each subject are independent conditional on cluster membership. In many practical applications, however, the responses are correlated because they are observed on the same subject; this is known as local dependence. In this paper, we extend the LCA model to allow for local dependence within each cluster in order to improve clustering accuracy. The clustering problem is hard for three reasons: it is unsupervised (the true cluster memberships and even the true number of clusters are unknown), a correlation matrix must be estimated for each cluster, and binary data carry little information. We therefore follow a parametric approach in which we fit a mixture model whose components follow multivariate Bernoulli distributions (one for each cluster). An extension of a family of parametric models by Oman and Zucker [1] is adopted for this purpose and the maximum likelihood estimation metho...
Ajit C. Tamhane, Dingxi Qiu, Bruce E. Ankenman
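
For orientation, the traditional LCA baseline that the abstract starts from can be sketched in a few lines. This is only a minimal illustration of the locally independent mixture likelihood, not the locally dependent model proposed in the paper; the function name `lca_log_likelihood` and its arguments are hypothetical and chosen here for illustration.

```python
import math

def lca_log_likelihood(responses, weights, probs):
    """Log-likelihood of binary responses under the traditional LCA model.

    responses: list of subjects, each a list of 0/1 item responses
    weights:   mixing proportions, one per cluster (assumed to sum to 1)
    probs:     probs[k][j] = P(item j = 1 | cluster k)
    """
    total = 0.0
    for y in responses:
        mix = 0.0
        for w, theta in zip(weights, probs):
            # Local independence: within a cluster, the joint probability
            # is the product of per-item Bernoulli probabilities.
            joint = 1.0
            for y_j, t in zip(y, theta):
                joint *= t if y_j == 1 else 1.0 - t
            mix += w * joint
        total += math.log(mix)
    return total
```

Allowing local dependence would amount to replacing the inner product over items with a multivariate Bernoulli probability for each cluster, which is where the extra correlation parameters enter.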