We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...
Query suggestion has been an effective approach to help users narrow down to the information they need. However, most of existing studies focused on only popular/head queries. Si...
The next wave in search technology will be driven by the identification, extraction, and exploitation of real-world entities represented in unstructured textual sources. Search sy...
Abstract. Advertising on the World Wide Web is based around automatically matching web pages with appropriate advertisements, in the form of banner ads, interactive adverts, or tex...
Edward Thomas, Jeff Z. Pan, Stuart Taylor, Yuan Re...
Focused Web browsing activities such as periodically looking up headline news, weather reports, etc., which require only selective fragments of particular Web pages, can be made m...