We discuss sparse support vector machines (sparse SVMs) trained in the reduced empirical feature space. Specifically, we select linearly independent training data by Cholesky factorization of the kernel matrix and train the SVM in its dual form in the reduced empirical feature space. Because the mapped linearly independent training data span the empirical feature space, the linearly independent training data become support vectors. Thus, if the number of linearly independent data is smaller than the number of support vectors of the SVM trained in the feature space, sparsity is increased. Computer experiments show that in most cases the number of support vectors can be reduced without deteriorating the generalization ability.
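The selection step can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `select_independent` and `reduced_features`, the threshold `tol`, and the use of the kernel columns of the selected data as the reduced feature map are all assumptions made for illustration. The sketch runs an incremental Cholesky factorization over the kernel matrix and keeps a training sample only when its residual diagonal element exceeds the threshold, that is, when its mapped image is not (numerically) a linear combination of the images already selected.

```python
import numpy as np

def select_independent(K, tol=1e-8):
    """Return indices of (numerically) linearly independent training data.

    K   : symmetric positive semidefinite kernel matrix, shape (N, N).
    tol : hypothetical threshold on the residual diagonal element.
    """
    selected = []
    L = np.zeros((0, 0))  # Cholesky factor of K restricted to the selected indices
    for i in range(K.shape[0]):
        if selected:
            # Solve L v = K[selected, i]; v would be the new row of the factor.
            v = np.linalg.solve(L, K[selected, i])
        else:
            v = np.zeros(0)
        d = K[i, i] - v @ v  # residual diagonal element for sample i
        if d > tol:
            # Sample i adds a new direction: extend the factor by one row and column.
            r = len(selected)
            L_new = np.zeros((r + 1, r + 1))
            L_new[:r, :r] = L
            L_new[r, :r] = v
            L_new[r, r] = np.sqrt(d)
            L = L_new
            selected.append(i)
    return selected

def reduced_features(K, selected):
    """Assumed reduced feature map: kernel values against the selected data."""
    return K[:, selected]

# Small example with a linear kernel on 2-D inputs, so at most two samples
# can be linearly independent.
X = np.array([[1.0, 0.0], [2.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 3.0]])
K = X @ X.T
idx = select_independent(K)
H = reduced_features(K, idx)
```

Under these assumptions, a linear SVM would then be trained on the rows of `H`, and evaluating the decision function for a new input requires kernel values only against the selected samples. A larger `tol` discards more samples and so trades accuracy for sparsity.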