Active Appearance Models (AAMs) have been used extensively for face alignment over the last 20 years. While AAMs have numerous advantages relative to alternative approaches, they suffer from two major drawbacks: (i) AAMs are especially prone to local minima in the fitting process; (ii) few, if any, of the local minima of the cost function correspond to acceptable solutions. To address these problems, this paper proposes a method that learns the fitting cost function by explicitly optimizing for local minima that occur at, and only at, the locations corresponding to the correct fitting parameters. The paper explores two ways to parameterize the cost function: pixel weighting and subspace learning. Experiments on synthetic and real data show the effectiveness of our approach for face alignment.
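To make the notion of a pixel-weighted cost function concrete, the following is a minimal sketch on a 1-D toy signal. It is not the paper's learning procedure: the function names (`weighted_ssd`, `cost_curve`, `count_local_minima`), the use of a weighted sum of squared differences, and the integer-shift "alignment" are all assumptions made purely for illustration. It shows only how per-pixel weights enter a fitting cost and how the local minima of the resulting cost curve can be counted.

```python
import numpy as np


def weighted_ssd(residual, weights):
    """Pixel-weighted sum of squared differences: sum_i w_i * r_i**2.

    Hypothetical cost form, assumed here for illustration.
    """
    residual = np.asarray(residual, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return float(np.sum(weights * residual ** 2))


def cost_curve(signal, template, weights, shifts):
    """Evaluate the weighted cost for each integer (circular) shift of a 1-D signal."""
    return np.array([
        weighted_ssd(np.roll(signal, -s) - template, weights)
        for s in shifts
    ])


def count_local_minima(costs):
    """Count strict interior local minima in a 1-D array of cost values."""
    c = np.asarray(costs, dtype=float)
    interior = (c[1:-1] < c[:-2]) & (c[1:-1] < c[2:])
    return int(np.sum(interior))


if __name__ == "__main__":
    # Toy data: the template equals the unshifted signal, so shift 0 is the correct fit.
    signal = np.array([0.0, 1.0, 4.0, 1.0, 0.0])
    template = signal.copy()
    uniform = np.ones_like(signal)  # uniform weights reduce to plain SSD
    curve = cost_curve(signal, template, uniform, shifts=[-1, 0, 1])
    print(curve, count_local_minima(curve))
```

In a learned variant, the weights would be chosen so that the cost curve has a minimum at the correct parameters and as few minima as possible elsewhere; how that optimization is set up is outside the scope of this sketch.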