Linear Discriminant Analysis (LDA) is a popular tool for multiclass discriminative dimensionality reduction. However, LDA suffers from two major problems: (1) it optimizes the Bayes error only for unimodal Gaussian classes with equal covariances (assuming full-rank matrices), and (2) its multiclass extension maximizes the sum of pairwise distances between the classes rather than “simultaneously” maximizing each pairwise distance. As a result, classes that are “close” in the input space typically overlap heavily in the projected space. To solve these two problems, this paper proposes Pareto Discriminant Analysis (PARDA). First, PARDA explicitly models each class as a multidimensional Gaussian with a sample covariance. Second, PARDA decomposes the multiclass problem into a set of pairwise objective functions, each representing the distance between a pair of classes. Unlike existing extensions of Fisher discriminant ana...
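
The second problem can be illustrated with a small numerical sketch. It is not PARDA and not taken from the paper. It assumes three hypothetical class means in 2D with identity within-class covariance and equal priors, so that the one-dimensional Fisher criterion reduces (up to a constant) to the sum of squared pairwise distances between the projected means. For comparison only, it also finds the direction that maximizes the smallest pairwise distance. All names and values below are illustrative assumptions.

```python
import math
from itertools import combinations

# Hypothetical class means in 2D (assumed values): A and B are close, C is far.
MEANS = {"A": (0.0, 0.0), "B": (0.0, 1.0), "C": (10.0, 0.0)}


def pairwise_projected_distances(theta):
    """Distances between class means after projecting onto the unit
    direction (cos(theta), sin(theta))."""
    w = (math.cos(theta), math.sin(theta))
    proj = {name: m[0] * w[0] + m[1] * w[1] for name, m in MEANS.items()}
    return {
        (a, b): abs(proj[a] - proj[b])
        for a, b in combinations(sorted(proj), 2)
    }


def sum_of_squares(theta):
    """Sum of squared pairwise distances: the multiclass LDA-style objective
    under the identity-covariance, equal-prior assumption stated above."""
    return sum(d * d for d in pairwise_projected_distances(theta).values())


def smallest_distance(theta):
    """Smallest pairwise distance between projected class means."""
    return min(pairwise_projected_distances(theta).values())


# Coarse grid search over projection directions in [0, pi).
ANGLES = [math.pi * k / 1800 for k in range(1800)]
theta_sum = max(ANGLES, key=sum_of_squares)     # sum-of-distances criterion
theta_min = max(ANGLES, key=smallest_distance)  # comparison criterion only

for label, theta in (("sum criterion", theta_sum), ("max-min criterion", theta_min)):
    print(label, {pair: round(d, 3) for pair, d in pairwise_projected_distances(theta).items()})
```

In this toy setting the sum criterion is dominated by the two large distances involving C, so the chosen direction nearly collapses A and B onto the same point, while another direction keeps all three means apart. This is the kind of overlap between “close” classes that motivates treating each pairwise distance as its own objective.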