In this paper, we present a novel framework for machine learning-based cross-media knowledge extraction. The framework is specifically designed to handle documents composed of th...
Effective indexing is crucial for providing convenient access to scanned versions of large collections of handwritten historical manuscripts. Since traditional handwritin...
An algorithm is presented that automatically matches images of presentation slides to the symbolic source file (e.g., PowerPoint™ or Acrobat™) from which they were generated. T...
As the largest online marketplace, eBay strives to promote its inventory throughout the Web via different types of online advertisement. Contextually relevant links to eBay assets...
The performance of document clustering systems depends on employing optimal text representations, which are not only difficult to determine beforehand but may also vary from one ...