We propose a classification method based on a decision tree whose nodes are linear Support Vector Machines (SVMs). Each node defines a decision hyperplane that classifies part of the feature space. For large classification problems with many Support Vectors (SVs), the method has the advantage that the classification time does not depend on the number of SVs: a new sample is classified by computing its dot product with the normal vector of each hyperplane. The number of nodes in the tree has been shown to be much smaller than the number of SVs in a non-linear SVM, so a significant speedup in classification time can be achieved. For problems that are not linearly separable, the trivial solution (zero vector) of a linear SVM is analyzed, and a new formulation of the optimization problem is given to avoid it.
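
To illustrate the classification step, the following Python sketch evaluates a small hand-built tree of linear nodes on an XOR-like problem, which no single hyperplane can separate. The `Node` layout, the sign convention (`w.x + b >= 0` selects the `pos` branch), and the hand-chosen weights are assumptions made for this example only; they are not the training procedure or the exact tree structure of the proposed method. The sketch shows only that classification costs one dot product per visited node.

```python
from dataclasses import dataclass
from typing import Sequence, Union


@dataclass
class Node:
    """One tree node holding a linear decision hyperplane (hypothetical layout)."""
    w: Sequence[float]        # normal vector of the hyperplane
    b: float                  # offset of the hyperplane
    pos: Union["Node", int]   # subtree or class label used when w.x + b >= 0
    neg: Union["Node", int]   # subtree or class label used otherwise


def classify(tree, x):
    """Return (label, number of dot products evaluated) for sample x."""
    node = tree
    evaluations = 0
    while isinstance(node, Node):
        score = sum(wi * xi for wi, xi in zip(node.w, x)) + node.b
        evaluations += 1
        node = node.pos if score >= 0 else node.neg
    return node, evaluations


# Hand-chosen weights for an XOR-like layout (illustrative, not trained).
# The root splits off the region around (1, 1); the second node separates
# (0, 0) from the two remaining corners.
second = Node(w=(1.0, 1.0), b=-0.5, pos=+1, neg=-1)
root = Node(w=(1.0, 1.0), b=-1.5, pos=-1, neg=second)

for sample in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    label, cost = classify(root, sample)
    print(sample, label, cost)
```

In this sketch the cost per sample is bounded by the depth of the tree (here at most two dot products), whereas a kernel SVM evaluates one kernel term per support vector.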