Search Sciweavers | Sciweavers

874 search results - page 2 / 175

» Evaluation Methods for Focused Crawling

CN 1999

Focused Crawling: A New Approach to Topic-Specific Web Resource Discovery

15 years 6 months ago

Download www.cse.iitb.ac.in

The rapid growth of the World-Wide Web poses unprecedented scaling challenges for general-purpose crawlers and search engines. In this paper we describe a new hypertext resource d...

Soumen Chakrabarti, Martin van den Berg, Byron Dom

IR 2008 (Natural Language Processing)

Focused web crawling in the acquisition of comparable corpora

15 years 6 months ago

Download www.info.uta.fi

CLIR resources, such as dictionaries and parallel corpora, are scarce for special domains. Obtaining comparable corpora automatically for such domains could be an answer to this p...

Tuomas Talvensaari, Ari Pirkola, Kalervo Järv...

SAC 2003, ACM (Applied Computing)

Ontology-Focused Crawling of Web Documents

15 years 12 months ago

Download dspc11.cs.ccu.edu.tw

The Web, the largest unstructured database of the world, has greatly improved access to documents. However, documents on the Web are largely disorganized. Due to the distributed n...

Marc Ehrig, Alexander Maedche

WWW 2005, ACM (Internet Technology)

Focused crawling by exploiting anchor text using decision tree

16 years 7 months ago

Download www.www2005.org

Focused crawlers are considered as a promising way to tackle the scalability problem of topic-oriented or personalized search engines. To design a focused crawler, the choice of s...

Jun Li, Kazutaka Furuse, Kazunori Yamaguchi

WIDM 2004, ACM (Internet Technology)

Probabilistic models for focused web crawling

16 years 4 days ago

Download users.cs.dal.ca

A focused crawler must use information gleaned from previously crawled page sequences to estimate the relevance of a newly seen URL. Therefore, good performance depends on powerfu...

Hongyu Liu, Evangelos E. Milios, Jeannette Janssen
