Sciweavers

472 search results - page 16 / 95
» Crawling the Hidden Web
HT
2006
ACM
14 years 1 month ago
Evaluation of crawling policies for a web-repository crawler
We have developed a web-repository crawler that is used for reconstructing websites when backups are unavailable. Our crawler retrieves web resources from the Internet Archive, Go...
Frank McCown, Michael L. Nelson
ADAPTIVE
2007
Springer
14 years 1 month ago
Adaptive Focused Crawling
The large amount of available information on the Web makes it hard for users to locate resources about particular topics of interest. Traditional search tools, e.g., search engines...
Alessandro Micarelli, Fabio Gasparetti
SIGKDD
2008
13 years 7 months ago
Web data mining: exploring hyperlinks, contents, and usage data
This paper presents a review of the book "Web Data Mining - Exploring Hyperlinks, Contents, and Usage Data" by Bing Liu. The review concludes that the breadth and depth ...
Olfa Nasraoui
ICAPR
2005
Springer
14 years 1 month ago
Combining Text and Link Analysis for Focused Crawling
The number of vertical search engines and portals has rapidly increased over the last years, making the importance of a topic-driven (focused) crawler evident. In this paper, we de...
George Almpanidis, Constantine Kotropoulos
WIDM
2006
ACM
14 years 1 month ago
Lazy preservation: reconstructing websites by crawling the crawlers
Backup of websites is often not considered until after a catastrophic event has occurred to either the website or its webmaster. We introduce “lazy preservation” – digital p...
Frank McCown, Joan A. Smith, Michael L. Nelson