Abstract. This paper presents an automatic approach to mining collections of maps from the Web. Our method harvests images from the Web and classifies each one as a map or non-map by comparing it to previously classified map and non-map images, using methods from Content-Based Image Retrieval (CBIR). Our approach improves on the previous approach by 20% in F1-measure. Further, our method is more scalable and less costly than previous approaches that rely on more traditional machine learning techniques.
Matthew Michelson, Aman Goel, Craig A. Knoblock
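
To make the classification idea in the abstract concrete, the following is a minimal sketch of labeling an image by retrieving its most similar previously classified images and taking a majority vote. It is an illustration under stated assumptions, not the authors' implementation: the three-value feature vectors, the Euclidean distance, the value k=3, the majority-vote rule, and the names `euclidean`, `classify_by_retrieval`, and `labeled_examples` are all hypothetical. A real CBIR system would use richer image features and its own similarity measure.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify_by_retrieval(query, labeled, k=3):
    """Label `query` by majority vote over its k nearest labeled images.

    `labeled` is a list of (feature_vector, label) pairs. The distance
    function and the voting rule are assumptions made for illustration.
    """
    # Retrieval step: rank the labeled images by similarity to the query.
    ranked = sorted(labeled, key=lambda item: euclidean(query, item[0]))
    # Classification step: the most common label among the top k wins.
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy features (for example, a coarse 3-bin color histogram).
labeled_examples = [
    ([0.9, 0.1, 0.0], "map"),
    ([0.8, 0.2, 0.0], "map"),
    ([0.7, 0.2, 0.1], "map"),
    ([0.1, 0.3, 0.6], "non-map"),
    ([0.2, 0.2, 0.6], "non-map"),
    ([0.0, 0.4, 0.6], "non-map"),
]

print(classify_by_retrieval([0.85, 0.15, 0.0], labeled_examples))
```

One property of this style of classifier is that adding newly labeled images only extends the `labeled_examples` collection; no separate model has to be retrained, which is one plausible reading of the scalability claim in the abstract.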