The margin criterion has recently been used successfully for parameter optimization in graphical models. We introduce maximum-margin-based structure learning for Bayesian network classifiers and demonstrate its advantages in classification performance over traditionally used discriminative structure learning methods. In particular, we provide empirical results for generative structure learning and two discriminative structure learning approaches on handwritten digit recognition tasks. In these experiments, maximum margin structure learning outperforms the other structure learning methods. Furthermore, we present classification results achieved with different bit-widths for representing the parameters of the classifiers.