This work aims to provide a page segmentation algorithm which uses both visual and content information to extract the semantic structure of a web page. The visual information is u...
Developing personalized applications for the ubiquitous Web requires providing different user interfaces that address the heterogeneous capabilities of device classes. Major problems are...
In this paper, we present a novel near-duplicate document detection method that can easily be tuned for a particular domain. Our method represents each document as a real-valued s...
Hannaneh Hajishirzi, Wen-tau Yih, Aleksander Kolcz
This paper presents Prospector, an adaptive meta-search layer, which performs personalized re-ordering of search results. Prospector combines elements from two approaches to adapti...
Accessing and integrating data from heterogeneous sources has become a significant challenge. So-called adapters provide the functionality for translating SQL queries into querie...