Search Sciweavers | Sciweavers

PDP 2008, IEEE

Bulk-Synchronous On-Line Crawling on Clusters of Computers

16 years 1 month ago

This paper describes the design of a crawler devised to perform the periodic retrieval of Web documents for a search engine able to accept on-line updates in a concurrent manner. ...

Mauricio Marín, Carolina Bonacic

CIKM 2011, Springer

Focusing on novelty: a crawling strategy to build diverse language models

14 years 6 months ago

Word prediction performed by language models plays an important role in many tasks, e.g. word sense disambiguation, speech recognition, handwriting recognition, query spelling an...

Luciano Barbosa, Srinivas Bangalore

CIKM 2005, Springer

Focused crawling for both topical relevance and quality of medical information

16 years 7 days ago

Subject-specific search facilities on health sites are usually built using manual inclusion and exclusion rules. These can be expensive to maintain and often provide incomplete c...

Thanh Tin Tang, David Hawking, Nick Craswell, Kath...

WEBI 2007, Springer

Question Answering over Implicitly Structured Web Content

16 years 24 days ago

Implicitly structured content on the Web such as HTML tables and lists can be extremely valuable for web search, question answering, and information retrieval, as the implicit str...

Eugene Agichtein, Chris Burges, Eric Brill

GCC 2005, Springer

Parallel Web Spiders for Cooperative Information Gathering

16 years 7 days ago

The Web spider is a widely used approach to obtaining information for search engines. As the size of the Web grows, it becomes a natural choice to parallelize the spider’s crawling proc...

Jiewen Luo, Zhongzhi Shi, Maoguang Wang, Wei Wang
