For a stationary stochastic process $\{X_n\}$ with values in some set $A$, a finite word $w \in A^K$ is called a memory word if the conditional probability of $X_0$ given the past is constant on the cylinder set defined by $X_{-K}^{-1} = w$. It is called a minimal memory word if no proper suffix of $w$ is also a memory word. For example, in a $K$-step Markov process all words of length $K$ are memory words, but not necessarily minimal. We consider the problem of determining the lengths of the longest minimal memory words and the shortest memory words of an unknown process $\{X_n\}$ based on sequentially observing the outputs of a single sample $\{X_1, X_2, \ldots, X_n\}$. We will give a universal estimator which converges almost surely to the length of the longest minimal memory word and show that no such universal estimator exists for the length of the shortest memory word. The alphabet $A$ may be finite or countable.
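
To make the two definitions concrete, the following is a minimal sketch that checks them on a toy process whose law is already known. It is not the estimator discussed here, which works from observed samples of an unknown process. Everything in it is an illustrative assumption: a binary alphabet, a hypothetical 2-step Markov kernel with strictly positive transition probabilities (so every context has positive probability), and the helper names `is_memory_word` and `is_minimal_memory_word`.

```python
from itertools import product

ALPHABET = (0, 1)
K = 2  # assumed order of the toy Markov process

# Hypothetical kernel: context (x_{-2}, x_{-1}) -> (P(X_0 = 0), P(X_0 = 1)).
KERNEL = {
    (0, 0): (0.9, 0.1),
    (1, 0): (0.5, 0.5),
    (0, 1): (0.3, 0.7),
    (1, 1): (0.3, 0.7),
}


def is_memory_word(w):
    """True if the law of X_0 is the same for every past ending in w."""
    if len(w) >= K:
        return True  # in a K-step Markov process the last K symbols fix the law
    # Collect the laws of all length-K contexts that end with w.
    laws = {KERNEL[prefix + w] for prefix in product(ALPHABET, repeat=K - len(w))}
    return len(laws) == 1


def is_minimal_memory_word(w):
    """True if w is a memory word and no proper suffix of w is one."""
    if not is_memory_word(w):
        return False
    # Proper suffixes w[1:], w[2:], ..., down to the empty word.
    return not any(is_memory_word(w[i:]) for i in range(1, len(w) + 1))


# Enumerate all words of length 0..K and classify them.
words = [w for k in range(K + 1) for w in product(ALPHABET, repeat=k)]
memory_words = [w for w in words if is_memory_word(w)]
minimal_words = [w for w in words if is_minimal_memory_word(w)]

shortest_memory_length = min(len(w) for w in memory_words)
longest_minimal_length = max(len(w) for w in minimal_words)
```

In this toy kernel the word `(1,)` is a memory word because the contexts `(0, 1)` and `(1, 1)` give the same law, so `(0, 1)` and `(1, 1)` are memory words of length $K$ that are not minimal. The words `(0, 0)` and `(1, 0)` are minimal, so the two lengths in question differ: the shortest memory word has length 1 and the longest minimal memory word has length 2.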