Non-linear dimensionality reduction methods are powerful techniques to deal with
high-dimensional datasets. However, they are often susceptible to local minima
and perform poorly when initialized far from the global optimum, even when the
intrinsic dimensionality is known a priori. In this work we introduce a prior over
the dimensionality of the latent space, and simultaneously optimize both the latent
space and its intrinsic dimensionality. Ad-hoc initialization schemes are unnecessary
with our approach; we initialize the latent space to the observation space and
automatically infer the latent dimensionality using an optimization scheme that
drops dimensions in a continuous fashion. We report results applying our prior
to various tasks involving probabilistic non-linear dimensionality reduction, and
show that our method can outperform graph-based dimensionality reduction techniques
as well as previously suggested ad-hoc initialization strategies.