Optimal component analysis (OCA) uses stochastic gradient optimization to find optimal representations for general criteria and performs well in object recognition applications. However, OCA often requires extensive computation to estimate gradients and update the linear representation. To significantly reduce this computation, this paper proposes a multi-stage learning process that decomposes the original optimization problem into several levels. Because the learning process at each level starts from a good initial point obtained from the next level, the multi-stage OCA algorithm speeds up the original algorithm significantly and makes OCA learning feasible for many applications. We illustrate the effectiveness of the proposed method on face classification.