This paper presents a novel search paradigm that uses multiple images as input to perform semantic image search. While earlier work focused on using single or multiple query images to retrieve images depicting views of the same instance, the proposed paradigm uses the query images to discover concepts that are implicitly shared by all of them, and retrieves images based on these discovered concepts. Our implementation uses high-level visual features extracted from a deep convolutional network to retrieve images similar to each query image. These images have associated text previously generated through implicit crowdsourcing. A Bag of Words (BoW) textual representation of each query image is built from the text associated with its retrieved similar images. A learned vector space representation of English words, extracted from a corpus of 100 billion words, allows computing the conceptual similarity of words. The words that represent the input images are used to find new words that shar...
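As a minimal sketch of the concept-discovery step, assuming the word vectors are the publicly available Google News word2vec embeddings loaded with gensim; the file path, the per-query BoW terms, and the vector-averaging strategy are illustrative assumptions, not the paper's exact procedure:

```python
# Minimal sketch: find words conceptually shared by the BoW terms
# gathered for each query image, by ranking the vocabulary against
# the average vector of all terms. Assumes a pre-trained word2vec
# model (e.g., Google News vectors trained on ~100B words).
from gensim.models import KeyedVectors

# Load pre-trained 300-d word vectors (hypothetical local path).
vectors = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True
)

# BoW terms mined from the text of images similar to each query image
# (illustrative example data).
bow_per_query = [
    ["beach", "sand", "ocean"],   # terms for query image 1
    ["surf", "wave", "ocean"],    # terms for query image 2
]

# Pool every in-vocabulary term; most_similar() with a list of
# positive words averages their vectors, so the result approximates
# the concept shared across all query images.
all_terms = [w for bow in bow_per_query for w in bow if w in vectors]
shared_concepts = vectors.most_similar(positive=all_terms, topn=5)

for word, score in shared_concepts:
    print(f"{word}: {score:.3f}")
```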