When classifying high-dimensional sequence data, traditional methods (e.g., HMMs, CRFs) may require large amounts of training data to avoid overfitting. In such cases dimensionality reduction can be employed to find a low-dimensional representation on which classification can be done more efficiently. Existing methods for supervised dimensionality reduction often presume that the data is densely sampled so that a neighborhood graph structure can be formed, or that the data arises from a known distribution. Sufficient dimension reduction techniques aim to find a low dimensional representation such that the remaining degrees of freedom become conditionally independent of the output values. In this paper we develop a novel sequence kernel dimension reduction approach (S-KDR). Our approach does not make strong assumptions on the distribution of the input data. Spatial, temporal and periodic information is combined in a principled manner, and an optimal manifold is learned for the en...