We present in this paper a method to discover the set of webpages contained in a logical website, based on the link structure of the Web graph. Such a method is useful in the conte...
We describe an adaptive method for extracting records from web pages. Our algorithm combines a weighted tree matching metric with clustering for obtaining data extraction patterns...
Abstract. Nowadays, the most dominant and noteworthy web information sources are developed according to the collaborative-web paradigm, also known as Web 2.0. In particular, it rep...
Given the increasing traffic on the World Wide Web (Web), it is difficult for a single popular Web server to handle the demand from its many clients. By clustering a group of Web ...
Finding information about people on the Web using a search engine is difficult because there is a many-to-many mapping between person names and specific persons (i.e. referents). ...