We discuss the problem of finding a good state representation in stochastic systems with observations. We develop a duality theory that generalizes existing work in predictive state representations as well as automata theory. We discuss how this theoretical framework can be used to build learning algorithms, approximate planning algorithms as well as to deal with continuous observations.