While searching the web, we commonly come across pages that are not of interest. This is partly due to a word or words in the search query having different contexts, the user obvio...
In this paper, we introduce the concept of a QA-Pagelet to refer to the content region in a dynamic page that contains query matches. We present THOR, a scalable and efficient min...
In this paper we study how to build an effective incremental crawler. The crawler selectively and incrementally updates its index and/or local collection of web pages, instead of ...
This paper presents a method for retrieving and summarizing changes in topics from online resources. Users often want to know what the major changes are in their areas of intere...
Tables are widely used in web pages. Unfortunately, most web tables can be accessed only passively, not interactively; that is, users can view information display...