We consider the linear classification method that separates two sets of points in d-dimensional space by a hyperplane. We wish to determine the hyperplane that minimises the sum of distances from all misclassified points to the hyperplane. To this end, two local descent methods are developed, one grid-based and the other based on optimisation theory, and these are embedded in several ways into a Variable Neighbourhood Search (VNS) metaheuristic scheme. Computational results show the two approaches to be complementary, which leads to a single hybrid VNS strategy that combines them to exploit the strong points of each. Extensive computational tests show that the resulting method performs well.

Keywords. Data Mining, Classification, Linear Classification, Heuristic Minimisation, Normdistance, Variable Neighbourhood Search, Variable Neighborhood Search, VNS, Local Search, Grid Search, Cell Search.
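To make the objective concrete, the following minimal sketch evaluates the sum of distances from misclassified points to a candidate hyperplane. It is an illustration under stated assumptions, not the method developed in the paper: the function name `misclassification_distance`, the parametrisation of the hyperplane as w·x = b, the convention that the first set belongs on the side w·x <= b and the second on the side w·x >= b, and the use of the Euclidean norm are all choices made here for the example.

```python
import math


def misclassification_distance(w, b, set_a, set_b):
    """Sum of Euclidean distances from misclassified points to the hyperplane w.x = b.

    Assumed convention (hypothetical, chosen for this sketch):
    points of set_a are correctly classified when w.x <= b,
    points of set_b are correctly classified when w.x >= b.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    if norm == 0.0:
        raise ValueError("w must be a non-zero normal vector")

    def signed_distance(x):
        # Positive on the side the normal vector w points to.
        return (sum(wi * xi for wi, xi in zip(w, x)) - b) / norm

    total = 0.0
    for x in set_a:
        s = signed_distance(x)
        if s > 0:  # a point of set_a on the wrong (positive) side
            total += s
    for x in set_b:
        s = signed_distance(x)
        if s < 0:  # a point of set_b on the wrong (negative) side
            total += -s
    return total


if __name__ == "__main__":
    a_points = [(-1.0, 0.0), (2.0, 0.0)]
    b_points = [(1.0, 1.0), (-3.0, 5.0)]
    print(misclassification_distance((1.0, 0.0), 0.0, a_points, b_points))
```

A heuristic such as a local descent or a VNS scheme would search over (w, b) to reduce this value; correctly classified points, and points lying exactly on the hyperplane, contribute nothing to the sum.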