Often the best-performing supervised learning models are ensembles of hundreds or thousands of base-level classifiers. Unfortunately, the space required to store this many classifiers, and the time required to execute them at run-time, prohibit their use in applications where test sets are large (e.g. Google), where storage space is at a premium (e.g. PDAs), and where computational power is limited (e.g. hearing aids). We present a method for "compressing" large, complex ensembles into smaller, faster models, usually without significant loss in performance.

Categories and Subject Descriptors: I.5.1 [Pattern Recognition]: Models - Neural nets.

General Terms: Algorithms, Experimentation, Measurement, Performance, Reliability.
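The compression method itself is detailed in the body of the paper; as a rough illustration of the general idea, the sketch below trains a large ensemble, labels a transfer set with the ensemble's predictions, and fits a single small model to mimic those predictions. All names, data, and parameters here (scikit-learn, the synthetic dataset, the ensemble and student sizes) are illustrative assumptions, not the paper's experimental setup.

```python
# Minimal sketch of ensemble compression via mimic training.
# Assumptions: scikit-learn API, synthetic data; hyperparameters are arbitrary.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier

# Train a large, slow ensemble on the labeled training data.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
ensemble = RandomForestClassifier(n_estimators=500, random_state=0).fit(X, y)

# Build a transfer set (here, perturbed copies of the training points)
# and label it with the ensemble's predictions.
X_transfer = X + np.random.default_rng(0).normal(scale=0.1, size=X.shape)
y_transfer = ensemble.predict(X_transfer)

# Train one small, fast model to reproduce the ensemble's behavior.
student = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                        random_state=0).fit(X_transfer, y_transfer)
```

At run-time only the small student model needs to be stored and executed, which is what makes the approach attractive for the storage- and compute-limited settings named above.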