We study the problem of learning with combinations of machines. In particular, we present new theoretical bounds on the generalization performance of voting ensembles of kernel machines, with bagging and support vector machines considered as special cases. We present experimental results that support the theoretical bounds and describe characteristics of kernel machine ensembles suggested by the experimental findings. We also show how such ensembles can be used for fast training on very large datasets.
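
To make the notion of a voting ensemble of kernel machines concrete, the following is a minimal sketch, not the method analyzed here. It assumes a kernel perceptron with a Gaussian kernel as a simple stand-in for a support vector machine, bootstrap subsamples in the style of bagging, and an unweighted majority vote. All function names, the toy data, and the parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def rbf_kernel(a, b, gamma=0.5):
    """Gaussian (RBF) kernel between two points given as tuples."""
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def train_kernel_perceptron(X, y, epochs=10):
    """Train a kernel perceptron (illustrative stand-in for an SVM).

    Returns a function mapping a point to a label in {-1, +1}.
    """
    alpha = [0] * len(X)  # number of mistakes made on each training point

    def score(x):
        return sum(a * yj * rbf_kernel(xj, x)
                   for a, xj, yj in zip(alpha, X, y) if a)

    for _ in range(epochs):
        mistakes = 0
        for i, (xi, yi) in enumerate(zip(X, y)):
            if yi * score(xi) <= 0:  # misclassified, or no support yet
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:
            break

    return lambda x: 1 if score(x) >= 0 else -1


def train_voting_ensemble(X, y, n_machines=5, subset_size=20, seed=0):
    """Train each machine on a bootstrap subsample and combine by majority vote."""
    rng = random.Random(seed)
    machines = []
    for _ in range(n_machines):
        # Sample indices with replacement, as in bagging.
        idx = [rng.randrange(len(X)) for _ in range(subset_size)]
        machines.append(
            train_kernel_perceptron([X[i] for i in idx], [y[i] for i in idx])
        )
    # An odd number of machines avoids ties in the unweighted vote.
    return lambda x: 1 if sum(m(x) for m in machines) >= 0 else -1


# Toy data (assumed for illustration): two well-separated clusters.
positives = [(2.0 + 0.1 * (i % 5), 2.0 + 0.1 * (i // 5)) for i in range(20)]
negatives = [(-px, -py) for px, py in positives]
X = positives + negatives
y = [1] * len(positives) + [-1] * len(negatives)

ensemble = train_voting_ensemble(X, y)
print(ensemble((2.2, 2.1)), ensemble((-2.2, -2.1)))
```

Each machine in this sketch sees only `subset_size` points, so the cost of training any single kernel machine depends on the subset size rather than on the full dataset. This is one plausible reading of how such ensembles could help with very large datasets.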