We present a new, statistical approach to rule learning. Doing so, we address two of the problems inherent in traditional rule learning: The computational hardness of finding rule sets with low training error and the need for capacity control to avoid overfitting. The chosen representation involves weights attached to rules. Instead of optimizing the error rate directly, we optimize for rule sets that have large margin and low variance. This can be formulated as a convex optimization problem allowing for efficient computation. Given the representation and the optimization procedure, we effectively yield weighted clauses in a CNFlike representation. To avoid overfitting, we propose a model selection strategy that utilizes a novel concentration inequality. Empirical tests show that the system is competitive with existing rule learning algorithms and that its flexible learning bias can be adjusted to improve predictive accuracy considerably.