Linear Support Vector Machines (SVMs) have become one of the most prominent machine learning techniques for highdimensional sparse data commonly encountered in applications like text classification, word-sense disambiguation, and drug design. These applications involve a large number of examples n as well as a large number of features N, while each example has only s << N non-zero features. This paper presents a Cutting-Plane Algorithm for training linear SVMs that provably has training time O(sn) for classification problems and O(sn log(n)) for ordinal regression problems. The algorithm is based on an alternative, but equivalent formulation of the SVM optimization problem. Empirically, the Cutting-Plane Algorithm is several orders of magnitude faster than decomposition methods like SVMLight for large datasets. Categories and Subject Descriptors I.2.6 [Artificial Intelligence]: Learning General Terms Algorithms, Performance, Experimentation Keywords Support Vector Machines (SVM)...