Search Sciweavers | Sciweavers

38 search results - page 6 / 8

The indexable web is more than 11.5 billion pages

CN 1999, 242 views

Focused Crawling: A New Approach to Topic-Specific Web Resource Discovery

13 years 10 months ago

Download www.cse.iitb.ac.in

The rapid growth of the World-Wide Web poses unprecedented scaling challenges for general-purpose crawlers and search engines. In this paper we describe a new hypertext resource d...

Soumen Chakrabarti, Martin van den Berg, Byron Dom

SIGIR 2008 (ACM), Information Technology, 176 views

SpotSigs: robust and efficient near duplicate detection in large web collections

13 years 10 months ago

Download ilpubs.stanford.edu

Motivated by our work with political scientists who need to manually analyze large Web archives of news sites, we present SpotSigs, a new algorithm for extracting and matching sig...

Martin Theobald, Jonathan Siddharth, Andreas Paepc...

CLEF 2005 (Springer), Information Technology, 115 views

EuroGOV: Engineering a Multilingual Web Corpus

14 years 3 months ago

Download www.clef-campaign.org

EuroGOV is a multilingual web corpus that was created to serve as the document collection for WebCLEF, the CLEF 2005 web retrieval task. EuroGOV is a collection of web pages crawl...

Börkur Sigurbjörnsson, Jaap Kamps, Maart...

TREC 2004, Information Technology, 129 views

Experiments with Web QA System and TREC 2004 Questions

13 years 11 months ago

Download trec.nist.gov

We describe our first participation in TREC. We only competed in the Question Answering (QA) category and limited our runs to factoids. Our approach was to use our open domain QA ...

Dmitri Roussinov, Yin Ding, Jose Antonio Robles-Fl...

WWW 2005 (ACM), Internet Technology, 114 views

The semantic webscape: a view of the semantic web

14 years 11 months ago

Download www.www2005.org

It has been a few years since the semantic Web was initiated by W3C, but its status has not been quantitatively measured. It is crucial to understand the status at this early stag...

Juhnyoung Lee, Richard Goodwin
