Recent study has shown that canonical algorithms such as Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) can be obtained from graph based dimensionality reduction framework. However, these algorithms yield projective maps which are linear combination of all the original features. The results are difficult to be interpreted psychologically and physiologically. This paper presents a novel technique for learning a sparse projection over graphs. The data in the reduced subspace is represented as a linear combination of a subset of the most relevant features. Comparing to PCA and LDA, the results obtained by sparse projection are often easier to be interpreted. Our algorithm is based on a graph embedding model, which encodes the discriminating and geometrical structure in terms of the data affinity. Once the embedding results are obtained, we then apply regularized regression for learning a set of sparse basis functions. Specifically, by using a L1-norm regularize...