In many applied pattern recognition problems, the data involve highly asymmetric observations. Normal mixture models tend to overfit when additional components are included to capture the skewness of the data. The increased number of pseudo-components can lead to computational difficulties and inefficiencies, and the contours of the fitted mixture components may be distorted. In this paper, we propose adopting mixtures of multivariate skew t distributions to handle highly asymmetric data. The EM algorithm is used to compute the maximum likelihood estimates of the model parameters. The method is illustrated using a fluorescence-activated cell sorting data set.
Keywords: asymmetric multivariate data; EM algorithm; fluorescence-activated cell sorting; mixture models; skewed t.
Kui Wang, Shu-Kay Ng, Geoffrey J. McLachlan