This paper presents a novel method for reducing the dimensionality of kernel spaces. Recently, to maintain the convexity of training, log-linear models without mixtures have been used as emission probability density functions in hidden Markov models for automatic speech recognition. In that framework, nonlinearly transformed high-dimensional features are used to achieve nonlinear classification of the original observation vectors without resorting to mixtures. In this paper, with the goal of using high-dimensional features in kernel spaces, the cutting plane subspace pursuit method originally proposed for support vector machines is generalized and applied to log-linear models. Experimental results show that the proposed method achieves an efficient approximation of the feature space using a limited number of basis vectors.
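As a sketch of the underlying idea (the symbols $\phi$, $k$, $b_m$, $w_m$, and $M$ are introduced here for illustration and are not taken verbatim from the paper): the cutting plane subspace pursuit approach approximates the implicit feature map $\phi(x)$ of a kernel $k$ by its projection onto the span of a small set of basis vectors $b_1, \dots, b_M$,
\[
\phi(x) \;\approx\; \sum_{m=1}^{M} w_m \, \phi(b_m),
\qquad
\langle \phi(b_m), \phi(x) \rangle = k(b_m, x),
\]
so that inner products with the approximated map reduce to $M$ kernel evaluations, effectively replacing the high-dimensional (possibly infinite-dimensional) feature space by an explicit $M$-dimensional representation built from $k(b_1, x), \dots, k(b_M, x)$.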