Near-duplicate web documents are abundant. Two such documents differ from each other only in a very small portion, for example the part that displays advertisements. Such differences are irrele...
Similarity measures for text have historically been an important tool for solving information retrieval problems. In many interesting settings, however, documents are often closel...
For many readers, handling a physical book is an enjoyably exquisite part of the information seeking process. Many physical characteristics of a book—its size, heft, the patina...
Yi-Chun Chu, David Bainbridge, Matt Jones, Ian H. ...
This paper presents the results of an adaptive region-growing segmentation technique for color document images using an irregular pyramid structure. The emphasis is on the segmentat...
The popularity of XML has motivated the development of novel XML processing tools, many of which embed the XPath language for XML querying, transformation, constraint specification...
Mariano P. Consens, John W. S. Liu, Flavio Rizzolo