Abstract. The Gram matrix plays a central role in many kernel methods. Knowledge about the distribution of eigenvalues of the Gram matrix is useful for developing appropriate model selection methods for kernel PCA. We use methods adapted from the statistical physics of classical fluids in order to study the averaged spectrum of the Gram matrix. We focus in particular on a variational mean-field theory and a related diagrammatic approach. We show that the mean-field theory correctly reproduces previously obtained asymptotic results for standard PCA. Comparison with simulations for data distributed uniformly on the sphere shows that the method provides a good qualitative approximation to the averaged spectrum for kernel PCA with a Gaussian Radial Basis Function kernel. We also develop an analytical approximation to the spectral density that agrees closely with the numerical solution and provides insight into the number of samples required to resolve the corresponding process eigenvalues...
David C. Hoyle, Magnus Rattray
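
The simulation setting mentioned in the abstract (data uniform on the sphere, Gaussian RBF kernel, spectrum averaged over data sets) can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names, the kernel width parameterisation exp(-|x - y|^2 / (2 width^2)), the 1/N normalisation of the Gram matrix, the absence of feature-space centering, and all parameter values are assumptions made for illustration only.

```python
import numpy as np


def sample_sphere(n_points, dim, rng):
    """Draw n_points uniformly from the unit sphere embedded in R^dim."""
    # Normalising isotropic Gaussian vectors gives a uniform distribution on the sphere.
    x = rng.standard_normal((n_points, dim))
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def rbf_gram(x, width):
    """Gaussian RBF Gram matrix K_ij = exp(-|x_i - x_j|^2 / (2 * width^2)).

    The width convention is an assumption; the source does not specify one.
    """
    sq_norms = np.sum(x * x, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    np.maximum(sq_dists, 0.0, out=sq_dists)  # clip tiny negative round-off
    return np.exp(-sq_dists / (2.0 * width**2))


def averaged_spectrum(n_points, dim, width, n_trials, seed=0):
    """Average the sorted eigenvalues of K / N over independent data sets.

    Returns (mean_eigs, eigs): the trial-averaged eigenvalues in descending
    order, and the per-trial eigenvalues with shape (n_trials, n_points).
    """
    rng = np.random.default_rng(seed)
    eigs = np.empty((n_trials, n_points))
    for t in range(n_trials):
        k = rbf_gram(sample_sphere(n_points, dim, rng), width)
        # eigvalsh returns ascending eigenvalues of a symmetric matrix; reverse them.
        eigs[t] = np.linalg.eigvalsh(k / n_points)[::-1]
    return eigs.mean(axis=0), eigs


if __name__ == "__main__":
    # Illustrative values only, not taken from the paper.
    mean_eigs, _ = averaged_spectrum(n_points=50, dim=3, width=1.0, n_trials=20)
    print(mean_eigs[:5])
```

Because every diagonal entry of the RBF Gram matrix equals 1, the eigenvalues of K / N sum to 1 in each trial, which gives a quick sanity check on the output. A histogram of the pooled per-trial eigenvalues gives an empirical estimate of the averaged spectral density against which an approximation could be compared.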