In many machine learning applications, such as Brain-Computer Interfaces (BCI), only high-dimensional, noisy data are available, which makes the discrimination task non-trivial. In this work, we focus on feature selection, and more precisely on optimal electrode selection and weighting, as an efficient tool to improve the BCI classification procedure. The proposed framework closely integrates spatial feature selection and weighting within the classification task itself: spatial weights are treated as hyper-parameters to be learned by a Support Vector Machine (SVM). The resulting spatially weighted SVM (sw-SVM) is designed to maximize the margin between classes whilst minimizing the generalization error. Experimental studies on eight Error-Related Potential (ErrP) data sets illustrate the efficiency of the sw-SVM from both a physiological and a machine learning point of view.