Abstract. A convolutional network architecture termed sparse convolutional neural network (SCNN) is proposed and tested on a real-world classification task (car classification). In addition to minimizing an error function based on the mean squared error (MSE), training enforces approximate decorrelation between hidden-layer neurons through a weight orthogonalization mechanism. The aim is to obtain a sparse coding of the objects' visual appearance, thus removing the need for a dedicated feature selection stage. Working on unprocessed image data only, the proposed method is shown to improve classification accuracy compared to purely MSE-trained SCNNs and fully connected multilayer perceptron architectures.
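
To make the idea of combining an MSE error term with weight orthogonalization concrete, the following is a minimal sketch in plain Python. The abstract does not specify how orthogonalization is carried out, so the soft penalty on pairwise dot products of hidden-unit weight vectors used here is an assumption, as are the function names and the weighting factor `lam`. It should be read as one possible formulation, not as the method of this paper.

```python
# Illustrative only: the abstract does not state the orthogonalization
# mechanism, so a soft penalty on pairwise weight-vector overlap is assumed.

def mse(outputs, targets):
    """Mean squared error over flat lists of outputs and targets."""
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(outputs)


def orthogonality_penalty(weights):
    """Sum of squared dot products between distinct weight vectors.

    `weights` holds one row per hidden neuron. The penalty is zero
    when all rows are mutually orthogonal and grows with their overlap.
    """
    penalty = 0.0
    for i in range(len(weights)):
        for j in range(i + 1, len(weights)):
            dot = sum(a * b for a, b in zip(weights[i], weights[j]))
            penalty += dot ** 2
    return penalty


def combined_loss(outputs, targets, weights, lam=0.1):
    """MSE plus a weighted orthogonality penalty (lam is a hypothetical knob)."""
    return mse(outputs, targets) + lam * orthogonality_penalty(weights)


# Two toy weight matrices: mutually orthogonal rows vs. identical rows.
orthogonal = [[1.0, 0.0], [0.0, 1.0]]
correlated = [[1.0, 0.0], [1.0, 0.0]]
```

With identical outputs and targets, `combined_loss` is larger for `correlated` than for `orthogonal`, so minimizing it pushes hidden units toward non-overlapping weight vectors while still fitting the targets.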