Virtually all methods of learning dynamic systems from data start from the same basic assumption: the learning algorithm will be given a sequence of data generated from the dynamic system. We consider the case where the training data comes from the system's operation but with no temporal ordering. The data are simply drawn as individual, disconnected points. While this assumption may seem absurd at first glance, many scientific modeling tasks have exactly this property. Previous work proposed methods for learning linear, discrete-time models under these assumptions by optimizing approximate likelihood functions. We extend those methods to nonlinear models using kernel methods. We go on to propose a new approach that focuses on achieving temporal smoothness in the learned dynamics. The result is a convex criterion that can be easily optimized and often outperforms the earlier methods. We test these methods on several synthetic data sets, including one generated from the Lorenz attractor.
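
As a minimal sketch of the data-generation setting described above (not a method from the paper), the following illustrates how non-sequential training data can arise: a simple two-dimensional linear, discrete-time system is simulated, and the resulting states are shuffled with their time indices discarded. The matrix A, noise level, and trajectory length are arbitrary illustrative choices:

    # Illustrative sketch only: the learner receives a bag of individual,
    # disconnected points drawn from the system's operation, with no
    # temporal ordering attached.
    import numpy as np

    rng = np.random.default_rng(0)

    # A stable linear, discrete-time system x_{t+1} = A x_t + noise
    # (parameters chosen arbitrarily for illustration).
    A = np.array([[0.9, -0.2],
                  [0.1,  0.8]])

    def simulate(A, x0, steps, noise_std=0.05):
        """Run the system forward and return the full state sequence."""
        states = [x0]
        for _ in range(steps):
            x_next = A @ states[-1] + noise_std * rng.standard_normal(2)
            states.append(x_next)
        return np.array(states)

    trajectory = simulate(A, x0=np.array([1.0, 0.0]), steps=500)

    # The learner never sees the trajectory in order: the samples are
    # shuffled and the time indices dropped, leaving only unordered points.
    unordered_data = rng.permutation(trajectory)
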