Latent Semantic Analysis (LSA) has shown encouraging performance on the problem of unsupervised automatic image annotation. LSA performs annotation by keyword propagation over a linear latent space, which accounts for the underlying semantic structure of word and image features. In this paper, we formulate a more general nonlinear model, called the Nonlinear Latent Space model, to capture the latent variables underlying word and visual features more precisely. In place of the basic propagation strategy, we present a novel inference strategy for image annotation via Image-Word Embedding (IWE). IWE simultaneously embeds images and words and captures the dependencies between them from a probabilistic viewpoint. Experiments show that IWE-based annotation on the nonlinear latent space outperforms previous unsupervised annotation methods.

Categories and Subject Descriptors
H.3.1 [Information Storage and Retrieval]: Content Analysis and Indexing—Indexing methods

General Terms
Algorithms, Theory

Keywords
...
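
As background for the linear LSA baseline described above, the following is a minimal sketch (our illustration, not the authors' implementation) of keyword propagation over an SVD-derived latent space. The joint matrix layout, latent dimensionality, neighbourhood size, and voting rule are assumptions made for exposition.

```python
import numpy as np

def build_latent_space(X, k=10):
    """X: (n_images, n_words + n_visual_features) joint occurrence matrix.
    Returns image coordinates in a k-dimensional linear latent space."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :k] * s[:k]

def propagate_keywords(latent, word_counts, query_idx, n_neighbors=5, n_keywords=3):
    """Propagate keywords to the query image by voting among its
    nearest neighbours in the latent space."""
    q = latent[query_idx]
    dists = np.linalg.norm(latent - q, axis=1)
    neighbors = np.argsort(dists)[1:n_neighbors + 1]   # skip the query itself
    votes = word_counts[neighbors].sum(axis=0)
    return np.argsort(votes)[::-1][:n_keywords]        # indices of top-voted words

# Toy usage: 6 images, 4 annotation words, 3 visual features
rng = np.random.default_rng(0)
words = rng.integers(0, 3, size=(6, 4)).astype(float)
visual = rng.random((6, 3))
latent = build_latent_space(np.hstack([words, visual]), k=2)
print(propagate_keywords(latent, words, query_idx=0))
```

The nonlinear latent space and the probabilistic IWE inference proposed in the paper replace, respectively, the linear SVD projection and the neighbour-voting rule sketched here.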