Many settings of unsupervised learning can be viewed as quantization problems, that is, the minimization of the expected quantization error subject to some restrictions. This allows the use of tools such as regularization from the theory of (supervised) risk minimization for unsupervised settings. Moreover, this setting is very closely related to both principal curves and the generative topographic map. We explore this connection in two ways: (1) We propose an algorithm for finding principal manifolds that can be regularized in a variety of ways. Experimental results demonstrate the feasibility of the approach. (2) We derive uniform convergence bounds and hence bounds on the learning rates of the algorithm. In particular, we give good bounds on the covering numbers, which allow us to obtain a nearly optimal learning rate of order $O(m^{-1/2+\alpha})$ for certain types of regularization operators, where $m$ is the sample size and $\alpha$ is an arbitrary positive constant.
Alex J. Smola, Robert C. Williamson, Sebastian Mik
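
The quantization view described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a finite codebook, squared Euclidean loss, and a simple ridge penalty `lam * sum_j ||f_j||^2` as a stand-in for the regularization operators the abstract mentions. The function names, `lam`, and the alternating update are all hypothetical choices made for illustration.

```python
import numpy as np


def quantization_error(X, codebook):
    """Mean squared distance from each sample to its nearest code vector.

    Returns the empirical quantization error and the nearest-code index
    for every sample.
    """
    # Pairwise squared distances, shape (num_samples, num_codes).
    d2 = ((X[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).mean(), d2.argmin(axis=1)


def regularized_objective(X, codebook, lam):
    """Empirical quantization error plus an assumed ridge penalty."""
    err, _ = quantization_error(X, codebook)
    return err + lam * (codebook ** 2).sum()


def regularized_step(X, codebook, lam=0.1):
    """One alternating step: assign samples, then re-estimate code vectors.

    For fixed assignments, each code vector f_j minimizes
    (1/m) * sum_{i in cluster j} ||x_i - f_j||^2 + lam * ||f_j||^2,
    which gives f_j = sum_i x_i / (n_j + lam * m).
    """
    m = len(X)
    _, assign = quantization_error(X, codebook)
    new_codebook = codebook.copy()
    for j in range(len(codebook)):
        members = X[assign == j]
        if len(members) > 0:  # empty clusters keep their old code vector
            new_codebook[j] = members.sum(axis=0) / (len(members) + lam * m)
    return new_codebook


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2))
    codebook = X[:3].copy()
    for _ in range(5):
        codebook = regularized_step(X, codebook, lam=0.05)
        print(regularized_objective(X, codebook, lam=0.05))
```

With `lam = 0` the update reduces to a plain k-means (Lloyd) step; a positive `lam` shrinks the code vectors toward the origin, which is one simple way a penalty term can restrict the admissible quantizers.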