The knowledge discovery process has difficulty analyzing large amounts of data. Theoretical problems related to high-dimensional spaces arise and degrade the predictive capacity of algorithms. In this paper, we propose a new methodology for better representation and prediction of huge datasets. To that end, an ensemble approach is used to overcome the problems related to high-dimensional spaces. The Self-Organized Map, which allows both fast learning and navigation through the data, is used as the base classifier to learn several feature subspaces. A genetic algorithm optimizes the diversity of the ensemble by means of an adapted error measure. The experiments show that this measure helps to construct a concise ensemble that retains its representation capabilities. Furthermore, this approach is competitive in prediction with Boosting and Random Forests.

I. MOTIVATIONS

Because storage was no longer subject to significant cost constraints, the information syste...