Sciweavers

PREMI
2011
Springer

Finding Potential Seeds through Rank Aggregation of Web Searches

This paper presents a potential seed selection algorithm for web crawlers based on a gain-share scoring approach. We start from a set of arbitrarily chosen tourism queries. Each query is submitted to N selected commercial Search Engines (SEs); the top m search results from each SE are retrieved, and each of these m results is manually evaluated and assigned a relevance score. For each of the m results, a gain-share score is computed from its hyperlink structure across the N ranked lists. A gain score is computed for each link present in each of the m results, and a portion of that gain score is propagated to the share score of each of the m results. The updated share scores of the m results determine the potential set of seed URLs for web crawling. Experimental results on tourism-related web data illustrate the effectiveness of the proposed seed selection algorithm.
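The abstract does not give the scoring formulas, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes that a result's share score starts as the sum of its manual relevance scores across the N ranked lists, that the gain of a link is the number of ranked lists in which its target appears, and that a fixed fraction `alpha` of the gain is propagated to the share score. The function name `gain_share_seeds` and the parameters `alpha` and `top_k` are hypothetical.

```python
from collections import defaultdict


def gain_share_seeds(ranked_lists, outlinks, alpha=0.5, top_k=3):
    """Rank candidate seed URLs with an assumed gain-share scheme.

    ranked_lists: one list per search engine of (url, relevance) pairs.
    outlinks: dict mapping a result URL to the URLs it links to.
    alpha: assumed fraction of the gain propagated to the share score.
    """
    share = defaultdict(float)      # share score per result URL
    appearances = defaultdict(int)  # number of ranked lists containing the URL

    # Assumption: the initial share is the summed manual relevance.
    for results in ranked_lists:
        for url, relevance in results:
            share[url] += relevance
            appearances[url] += 1

    # Assumption: the gain of a link is how many ranked lists contain
    # its target; a portion (alpha) is added to the linking result.
    for url in list(share):
        gain = sum(appearances.get(target, 0) for target in outlinks.get(url, []))
        share[url] += alpha * gain

    # Highest share first; ties are broken by URL for determinism.
    ranked = sorted(share.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


if __name__ == "__main__":
    lists = [
        [("a", 1.0), ("b", 1.0)],  # results from a first engine
        [("b", 2.0), ("c", 1.0)],  # results from a second engine
    ]
    links = {"a": ["b", "c"], "b": ["c"], "c": []}
    for url, score in gain_share_seeds(lists, links, alpha=0.5, top_k=2):
        print(url, score)
```

In this toy input, the URLs returned first are the ones that are both judged relevant and linked to pages that several engines also return; any real use would need the paper's actual gain and share definitions.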
Rajendra Prasath, Pinar Öztürk
Added 17 Sep 2011
Updated 17 Sep 2011
Type Journal
Year 2011
Where PREMI
Authors Rajendra Prasath, Pinar Öztürk