We propose a formulation of the Decision Tree learning algorithm in the Compression settings and derive tight generalization error bounds. In particular, we propose Sample Compression and Occam's Razor bounds. We show how such bounds, unlike the VC dimension or Rademacher complexities based bounds, are more general and can also perform a margin-sparsity trade-off to obtain better classifiers. Potentially, these risk bounds can also guide the model selection process and replace traditional pruning strategies.