Predictive State Representations (PSRs) have shown considerable promise as an alternative to Markov models. However, learning a PSR from a single stream of data generated by an environment remains a challenge. In this work, we present a formalization of PSRs and the domains they model. This formalization suggests an algorithm for learning PSRs that (almost surely) converges to a globally optimal model given sufficient training data.