Several Web sites deliver a large number of pages, each publishing data about one instance of some real world entity, such as an athlete, a stock quote, a book. Even though it is ...
Lorenzo Blanco, Valter Crescenzi, Paolo Merialdo, ...
First generation Web-content encodes information in handwritten (HTML) Web pages. Second generation Web content generates HTML pages on demand, e.g. by filling in templates with c...
Jacco van Ossenbruggen, Joost Geurts, Frank Cornel...
There is an increasing amount of structure on the Web as a result of modern Web languages, user tagging and annotation, and emerging robust NLP tools. These meaningful, semantic, ...
In a selective information dissemination (SDI) system, users submit profiles consisting of a number of long-standing queries to represent their information needs. The system then c...
Multiple-topic and varying-length of web pages are two negative factors significantly affecting the performance of web search. In this paper, we explore the use of page segmentati...