Search engines use content and link information to crawl, index, retrieve, and rank Web pages. The correlations between similarity measures based on these cues and on semantic ass...
This paper describes a new approach to document classification based on visual features alone. Text-based retrieval systems perform poorly on noisy text. We have conducted serie...
This paper describes the development of a new document ranking system based on layout similarity. The user has a need represented by a set of "wanted" documents, and the syste...
May Huang, Daniel DeMenthon, David S. Doermann, Ly...
The design of efficient textual similarities is an important issue in the domain of textual data exploration. Textual similarities are, for example, central in document collection s...
Images are being produced and made available in ever increasing numbers; but how can we find images "like this one" that are of interest to us? Many different systems hav...