In this paper we present a family of models and learning algorithms that can simultaneously align and cluster sets of multidimensional curves measured on a discrete time grid. Our approach is based on a generative mixture model that allows both local nonlinear time warping and global linear shifts of the observed curves in both time and measurement spaces relative to the mean curves within the clusters. The resulting model can be viewed as a form of Bayesian network with a special temporal structure. The Expectation-Maximization (EM) algorithm is used to simultaneously recover both the curve models for each cluster, and the most likely alignments and cluster membership for each curve. We evaluate the methodology on two real-world data sets, and show that the Bayesian network models provide systematic improvements in predictive power over more conventional clustering approaches.