We consider multivariate density estimation with identically distributed observations. We study a density estimator that is a convex combination of functions from a dictionary, where the combination is chosen by minimizing the L2 empirical risk in a stagewise manner. We derive the convergence rates of the estimator when the density being estimated belongs to the L2 closure of the convex hull of a class of functions that satisfies entropy conditions. The L2 closure of a convex hull is a large nonparametric class, but under suitable entropy conditions the convergence rates of the estimator do not depend on the dimension, so density estimation remains feasible in high-dimensional cases. The variance of the estimator does not increase as the number of components of the estimator increases. Instead, we control the bias-variance trade-off through the choice of the dictionary from which the components are drawn.

Mathematics Subject Classifications: 62G07

Key Words: Boosting, empirical risk m...
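
As an illustration of stagewise minimization of the L2 empirical risk over a convex hull, the following is a minimal sketch under stated assumptions, not the procedure analyzed in the paper. It assumes the usual L2 empirical risk for densities, γ_n(f) = ||f||² − (2/n) Σ f(X_i), a univariate dictionary of Gaussian densities (the multivariate case is omitted for brevity), and step weights π_k = 2/(k+1). The dictionary, the step weights, and the names `gaussian_pdf`, `stagewise_l2_density`, and `mixture_density` are all hypothetical choices made for this example. For Gaussian atoms the inner products ∫ φ_a φ_b have a closed form, so the risk of every candidate update can be evaluated exactly.

```python
import numpy as np


def gaussian_pdf(x, mean, sigma):
    """Univariate Gaussian density, evaluated elementwise with broadcasting."""
    z = (x - mean) / sigma
    return np.exp(-0.5 * z * z) / (sigma * np.sqrt(2.0 * np.pi))


def stagewise_l2_density(x, means, sigmas, n_steps):
    """Greedy convex-combination density estimate (illustrative sketch).

    Dictionary atoms are all Gaussian densities with a mean from `means`
    and a standard deviation from `sigmas`. Returns the atom means, atom
    standard deviations, and the mixture weights after `n_steps` steps.
    """
    mu, sg = np.meshgrid(means, sigmas, indexing="ij")
    mu, sg = mu.ravel(), sg.ravel()

    # Gram matrix of the dictionary in L2:
    # integral of phi_a * phi_b = N(mu_a - mu_b; 0, sg_a^2 + sg_b^2).
    gram = gaussian_pdf(
        mu[:, None], mu[None, :], np.sqrt(sg[:, None] ** 2 + sg[None, :] ** 2)
    )
    # Empirical means m_j = (1/n) * sum_i phi_j(X_i).
    m = gaussian_pdf(x[:, None], mu[None, :], sg[None, :]).mean(axis=0)

    w = np.zeros(mu.size)
    for k in range(1, n_steps + 1):
        pi = 2.0 / (k + 1)  # assumed step weight; equals 1 at the first step
        # Part of the risk of (1 - pi) * f_{k-1} + pi * phi_j that depends on j.
        scores = (
            2.0 * pi * (1.0 - pi) * (gram @ w)
            + pi ** 2 * np.diag(gram)
            - 2.0 * pi * m
        )
        j = int(np.argmin(scores))
        w *= 1.0 - pi
        w[j] += pi
    return mu, sg, w


def mixture_density(t, mu, sg, w):
    """Evaluate the fitted convex combination at the points `t`."""
    return gaussian_pdf(np.asarray(t)[:, None], mu[None, :], sg[None, :]) @ w


# Usage on simulated data (seeded so the run is reproducible).
rng = np.random.default_rng(0)
sample = rng.normal(size=200)
atom_mu, atom_sg, weights = stagewise_l2_density(
    sample, np.linspace(-3.0, 3.0, 13), np.array([0.5, 1.0]), n_steps=10
)
```

In this sketch the weights stay nonnegative and sum to one at every step, so the estimate is itself a density. After k steps it has at most k components. The only tuning choice beyond the number of steps is the dictionary, here the grid of means and the set of standard deviations.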