act

We describe an ensemble approach to learning from arbitrarily partitioned data. The partitioning comes from the distributed processing requirements of a large scale simulation. The volume of the data is such that classifiers can train only on data local to a given partition. As a result of the partition reflecting the needs of the simulation, the class statistics can vary from partition to partition. Some classes will likely be missing from some partitions. We combine a fast ensemble learning algorithm with probabilistic majority voting in order to learn an accurate classifier from such data. Results from simulations of an impactor bar crushing a storage canister and from facial feature recognition show that regions of interest are successfully identified in spite of the class imbalance in the individual training sets.
Larry Shoemaker, Robert E. Banfield, Lawrence O. H