We describe a segmentation method and associated file format for storing images of color documents. We separate each page of the document into three layers, containing the backgro...
Daniel P. Huttenlocher, Pedro F. Felzenszwalb, Wil...
XML is poised to take the World-Wide-Web to the next level of innovation. XML data, large or small, with or without associated schema, will be exchanged between an increasing number ...
Management and retrieval of large volumes of text can be expensive in both space and time. Moreover, the range of document sizes in a large collection such as TREC presents difficu...
Alistair Moffat, Ron Sacks-Davis, Ross Wilkinson, ...
This article describes a multilayer model-based approach for text compression. It uses linguistic information to develop a multilayer decomposition model of the text in order to ...
Modern search engines are expected to make documents searchable shortly after they appear on the ever-changing Web. To satisfy this requirement, the Web is frequently crawled. Due...