Flow cytometry (FC) is a powerful technology for rapid multivariate analysis and functional discrimination of cells. Current FC platforms generate large, high-dimensional datasets that pose a significant challenge for traditional manual bivariate analysis. Automated multivariate clustering, though highly desirable, is hindered by two requirements: it must identify rare populations that form very small clusters, and it must cope with the computational burden posed by the size and dimensionality of the datasets. In this paper, we address these twin challenges by developing a scalable two-stage multivariate parametric clustering algorithm. In the first stage, we model the data as a mixture of Gaussians and use an iterative weighted sampling technique to estimate the mixture components successively in order of decreasing size. In the second stage, we apply a graph-based hierarchical merging technique to combine Gaussian components with significant overlaps into the final...
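To make the second-stage idea concrete, the following is a minimal sketch of merging overlapping Gaussian components through a graph. It is not the merging procedure of this paper: the overlap measure (the Bhattacharyya coefficient), the fixed `threshold`, and the single-pass connected-components grouping are all assumptions chosen for illustration, and the function names are hypothetical. A hierarchical variant would presumably merge the most-overlapping pair first and re-evaluate overlaps after each merge.

```python
import numpy as np


def bhattacharyya_coefficient(mu1, cov1, mu2, cov2):
    """Overlap in (0, 1] between two Gaussians; 1 means identical.

    Assumed overlap measure for this sketch, not taken from the paper.
    """
    mu1, mu2 = np.asarray(mu1, float), np.asarray(mu2, float)
    cov1, cov2 = np.asarray(cov1, float), np.asarray(cov2, float)
    cov = 0.5 * (cov1 + cov2)
    diff = mu1 - mu2
    # Mahalanobis-like term under the averaged covariance.
    mahal = diff @ np.linalg.solve(cov, diff)
    # Log-determinants are used for numerical stability.
    _, logdet = np.linalg.slogdet(cov)
    _, logdet1 = np.linalg.slogdet(cov1)
    _, logdet2 = np.linalg.slogdet(cov2)
    distance = 0.125 * mahal + 0.5 * (logdet - 0.5 * (logdet1 + logdet2))
    return float(np.exp(-distance))


def merge_components(means, covs, threshold=0.5):
    """Assign a cluster label to every Gaussian component.

    Components are graph nodes; an edge joins two components whose
    overlap exceeds `threshold` (a hypothetical cut-off). Each connected
    component of that graph becomes one final cluster.
    """
    n = len(means)
    adjacency = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            overlap = bhattacharyya_coefficient(
                means[i], covs[i], means[j], covs[j]
            )
            if overlap > threshold:
                adjacency[i].append(j)
                adjacency[j].append(i)

    labels = [-1] * n
    cluster = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        # Depth-first traversal labels everything reachable from `start`.
        labels[start] = cluster
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbour in adjacency[node]:
                if labels[neighbour] == -1:
                    labels[neighbour] = cluster
                    stack.append(neighbour)
        cluster += 1
    return labels


if __name__ == "__main__":
    # Two nearly coincident components and one distant component.
    means = [[0.0, 0.0], [0.5, 0.0], [10.0, 10.0]]
    covs = [np.eye(2), np.eye(2), np.eye(2)]
    print(merge_components(means, covs))  # prints [0, 0, 1]
```

In this toy input the first two components overlap heavily and are merged into one cluster, while the distant third component remains its own cluster. Operating on component parameters rather than on individual events keeps the merging cost dependent on the number of components, not on the number of cells.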