Sciweavers

178 search results for "Scheduling Algorithms for Web Crawling" (page 12 of 36)
ESWS
2008
Springer
13 years 9 months ago
Instance Based Clustering of Semantic Web Resources
The original Semantic Web vision was explicit in the need for intelligent autonomous agents that would represent users and help them navigate the Semantic Web. We argue t...
Gunnar Aastrand Grimnes, Peter Edwards, Alun D. Pr...
WWW
2011
ACM
13 years 2 months ago
Inverted index compression via online document routing
Modern search engines are expected to make documents searchable shortly after they appear on the ever changing Web. To satisfy this requirement, the Web is frequently crawled. Due...
Gal Lavee, Ronny Lempel, Edo Liberty, Oren Somekh
WWW
2003
ACM
14 years 8 months ago
Dynamic maintenance of web indexes using landmarks
Recent work on incremental crawling has enabled the indexed document collection of a search engine to be more synchronized with the changing World Wide Web. However, this synchron...
Lipyeow Lim, Min Wang, Sriram Padmanabhan, Jeffrey...
WWW
2005
ACM
14 years 8 months ago
Predictive ranking: a novel page ranking approach by estimating the web structure
PageRank (PR) is one of the most popular ways to rank web pages. However, as the Web continues to grow in volume, it is becoming more and more difficult to crawl all the available...
Haixuan Yang, Irwin King, Michael R. Lyu
PAKDD
2009
ACM
14 years 2 months ago
Scalable Web Mining with Newistic
Newistic is a web mining platform that collects and analyses documents crawled from the Internet. Although it currently processes news articles, it can be easily adapted ...
Ovidiu Dan, Horatiu Mocian