We propose a novel approach that identifies web page templates and extracts the unstructured data. Extracting only the body of the page and eliminating the template increases the ...
We consider the problem of combining ranking results from various sources. In the context of the Web, the main applications include building meta-search engines, combining ranking...
Cynthia Dwork, Ravi Kumar, Moni Naor, D. Sivakumar
A heterogeneous community of practice spans many disciplines, industries and professions. Members of these communities are united by common research, products and experiences but ...
The Web is a typical example of a social network. One of the most intriguing features of the Web is its self-organizing behavior, which is usually studied through the existence of ...
The research reported in this paper is the first phase of a larger project on the automatic classification of Web pages by their genres. The long-term goal is the incorporation of...