Abstract. Currently a large number of Web sites are driven by Content Management Systems (CMS) which manage textual and multimedia content but also inherently - carry valuable info...
Stephane Corlosquet, Renaud Delbru, Tim Clark, Axe...
The Web continues to grow at a tremendous rate. Search engines find it increasingly difficult to provide useful results. To manage this explosively large number of Web documents,...
Sandip Debnath, Tracy Mullen, Arun Upneja, C. Lee ...
Web spam is a widely-recognized threat to the quality and security of the Web. Web spam pages pollute search engine indexes, burden Web crawlers and Web mining services, and expos...
We study in this paper the Web forum crawling problem, which is a very fundamental step in many Web applications, such as search engine and Web data mining. As a typical user-crea...
Rui Cai, Jiang-Ming Yang, Wei Lai, Yida Wang, Lei ...
An ever-increasing amount of information on the Web today is available only through search interfaces: the users have to type in a set of keywords in a search form in order to acc...