In recent years, there has been a move toward supporting the human element of Web search beyond a simple query box and a ranked list of search results. In this paper, we present...
In this paper we address the problem of unsupervised Web data extraction. We show that unsupervised Web data extraction becomes feasible when supposing pages that are made up of r...
Web forums have become an important data resource for many web applications, but extracting structured data from unstructured web forum pages is still a challenging task due to bo...
Jiang-Ming Yang, Rui Cai, Yida Wang, Jun Zhu, Lei ...
The KNOWITALL system aims to automate the tedious process of extracting large collections of facts (e.g., names of scientists or politicians) from the Web in an unsupervised, doma...
Oren Etzioni, Michael J. Cafarella, Doug Downey, A...
The search results clustering problem is defined as the automatic, on-line grouping of similar documents in a search results list returned by a search engine. In this paper we present...