Previous work has shown that Artificial Neural Networks (ANNs) can be usefully incorporated into probabilistic models. In this paper we review some of the approaches that have been proposed to incorporate them into probabilistic models of sequential data, such as Hidden Markov Models (HMMs). We also discuss new developments and ideas in this area, in particular how ANNs can be used to model high-dimensional discrete and continuous data in order to deal with the curse of dimensionality. Finally, we discuss how the ideas proposed in these models could be applied to statistical language modeling, representing longer-term context than trigram models allow while preserving word-order information.
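To make the language-modeling idea concrete, the sketch below illustrates one way a feed-forward ANN with learned word feature vectors can assign probabilities to the next word given a context longer than a trigram's two words. This is only an illustration under assumed settings: the vocabulary size, context length, layer dimensions, and the Python/NumPy formulation are choices made here and do not come from the paper.

```python
# Illustrative sketch (not the paper's exact architecture): a feed-forward
# neural language model that maps each context word to a learned feature
# vector, concatenates the vectors in order, and predicts the next word
# with a softmax. All sizes below are arbitrary, for illustration only.
import numpy as np

rng = np.random.default_rng(0)

V = 1000      # vocabulary size (illustrative)
n = 4         # context length in words -- longer than a trigram's 2
d = 30        # dimension of each word feature vector
h = 50        # hidden layer size

# Parameters: word feature matrix and network weights (randomly initialised
# here; in practice they would be learned by maximising the data likelihood).
C = rng.normal(scale=0.1, size=(V, d))        # word feature vectors
H = rng.normal(scale=0.1, size=(n * d, h))    # input-to-hidden weights
b_h = np.zeros(h)
U = rng.normal(scale=0.1, size=(h, V))        # hidden-to-output weights
b_o = np.zeros(V)

def next_word_distribution(context):
    """P(w_t | w_{t-n}, ..., w_{t-1}) for a list of n word indices."""
    x = np.concatenate([C[w] for w in context])   # concatenation keeps word order
    a = np.tanh(x @ H + b_h)                      # hidden representation
    scores = a @ U + b_o
    scores -= scores.max()                        # numerical stability
    p = np.exp(scores)
    return p / p.sum()

# Example: distribution over the next word given 4 context word indices.
p = next_word_distribution([12, 7, 301, 45])
print(p.shape, p.sum())   # (1000,) 1.0
```

Because words with similar usage can be given nearby feature vectors, probability mass generalises to word sequences never observed in training, which is how such models can cope with the curse of dimensionality that afflicts count-based n-gram estimates.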