The state of the art in visual object retrieval from large
databases allows searching millions of images at the object
level. Recently, complementary works have proposed systems
to crawl large object databases from community photo
collections on the Internet. We combine these two lines of
work into a large-scale system for auto-annotation of holiday
snaps. The resulting method labels objects such as landmark
buildings, scenes, and pieces of art at the object level in
a fully automatic manner. The
labeling is multi-modal and consists of textual tags, geographic
location, and related content on the Internet. Furthermore,
we improve the efficiency of the retrieval process by
creating more compact and precise indices for visual vocabularies
using background information obtained in the crawling stage
of the system. We demonstrate the scalability
and precision of the proposed method by conducting
experiments on millions of images downloaded from commun...