Advances in data collection technologies allow accumulation of large and high dimensional datasets and provide opportunities for learning high quality classification and regression models. However, supervised learning from such data raises significant computational challenges including inability to preserve the data in computer main memory and the need for optimizing model parameters within given time constraints. For certain types of prediction models techniques have been developed for learning from large datasets, but few of them address efficient learning of neural networks. Towards this objective, in this study we proposed a procedure that automatically learns a series of neural networks of different complexities on smaller data chunks and then properly combines them into an ensemble predictor through averaging. Based on the idea of progressive sampling the proposed approach starts with a very simple network trained on a very small sample and then progressively increases the model ...