We propose a method for text retrieval from document images without the use of OCR. Documents are segmented into character objects. Image features, namely the Vertical Traverse D...
The primary function of current Web search engines is essentially relevance ranking at the document level. However, myriad structured information about real-world objects is embed...
Zaiqing Nie, Yunxiao Ma, Shuming Shi, Ji-Rong Wen,...
Taxonomic case-based reasoning is a conversational case-based reasoning methodology that employs feature subsumption taxonomies for incremental case retrieval. Although this approa...
We investigate the novel problem of event recognition from news webpages. "Events" are basic text units containing news elements. We observe that a news article is always...
In this paper, we propose a measure of visual similarity for comparing different pages in a corpus. This measure is based on the analysis of the visual layout saliency of th...