We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...
Web search engines are facing formidable performance challenges due to data sizes and query loads. The major engines have to process tens of thousands of queries per second over t...
The primary business model behind Web search is based on textual advertising, where contextually relevant ads are displayed alongside search results. We address the problem of sel...
Filip Radlinski, Andrei Z. Broder, Peter Ciccolo, ...
Abstract. This work presents an experimental comparison of intersection algorithms for sorted sequences, including the recent algorithm of Baeza-Yates. This algorithm performs on a...
Due to the reliance on the textual information associated with an image, image search engines on the Web lack the discriminative power to deliver visually diverse search results. ...
Reinier H. van Leuken, Lluis Garcia Pueyo, Ximena ...