Search Sciweavers | Sciweavers

85 search results - page 6 / 17

» The impact of crawl policy on web search effectiveness

KDD
2008
ACM

183 views, Data Mining

De-duping URLs via rewrite rules

14 years 9 months ago

Download research.yahoo.com

A large fraction of the URLs on the web contain duplicate (or near-duplicate) content. De-duping URLs is an extremely important problem for search engines, since all the principal...

Anirban Dasgupta, Ravi Kumar, Amit Sasturkar

JIS
2008

119 views

A three-year study on the freshness of web search engine databases

13 years 8 months ago

Download www.bui.haw-hamburg.de

This paper deals with one aspect of the index quality of search engines: index freshness. The purpose is to analyse the update strategies of the major Web search engines Google, Y...

Dirk Lewandowski

MM
2006
ACM

167 views, Multimedia

Image annotation by large-scale content-based image retrieval

14 years 2 months ago

Download staff.science.uva.nl

Image annotation has been an active research topic in recent years due to its potentially large impact on both image understanding and Web image search. In this paper, we target a...

Xirong Li, Le Chen, Lei Zhang, Fuzong Lin, Wei-Yin...

WWW
2007
ACM

114 views, Internet Technology

Towards Deeper Understanding of the Search Interfaces of the Deep Web

14 years 9 months ago

Download www.cs.binghamton.edu

Many databases have become Web-accessible through form-based search interfaces (i.e., HTML forms) that allow users to specify complex and precise queries to access the underlying ...

Hai He, Weiyi Meng, Yiyao Lu, Clement T. Yu, Zongh...

WWW
2005
ACM

122 views, Internet Technology

Exploiting the deep web with DynaBot: matching, probing, and ranking

14 years 9 months ago

Download www.westga.edu

We present the design of Dynabot, a guided Deep Web discovery system. Dynabot's modular architecture supports focused crawling of the Deep Web with an emphasis on matching, p...

Daniel Rocco, James Caverlee, Ling Liu, Terence Cr...
