Search Sciweavers | Sciweavers

MAICS
2004

Creation of a Style Independent Intelligent Autonomous Citation Indexer to Support Academic Research

This paper describes the current state of RUgle, a system for classifying and indexing papers made available on the World Wide Web, in a domain-independent and universal manner. B...

Eric G. Berkowitz, Mohamed Reda Elkhadiri

SIGMOD
2006
ACM

To search or to crawl?: towards a query optimizer for text-centric tasks

Download pages.stern.nyu.edu

Text is ubiquitous and, not surprisingly, many important applications rely on textual data for a variety of tasks. As a notable example, information extraction applications derive...

Panagiotis G. Ipeirotis, Eugene Agichtein, Pranay ...

SIGMOD
2000
ACM

Finding Replicated Web Collections

Download ilpubs.stanford.edu

Many web documents (such as JAVA FAQs) are being replicated on the Internet. Often entire document collections (such as hyperlinked Linux manuals) are being replicated many times....

Junghoo Cho, Narayanan Shivakumar, Hector Garcia-M...

SPIRE
1999
Springer

CoBWeb - A Crawler for the Brazilian Web

Download homepages.dcc.ufmg.br

One of the key components of current Web search engines is the document collector. This paper describes CoBWeb, an automatic document collector, whose architecture is distributed ...

Altigran Soares da Silva, Eveline A. Veloso, Paulo...

NSDI
2010

The Architecture and Implementation of an Extensible Web Crawler

Download www.usenix.org

Many Web services operate their own Web crawlers to discover data of interest, despite the fact that large-scale, timely crawling is complex, operationally intensive, and expensive...

Jonathan M. Hsieh, Steven D. Gribble, Henry M. Lev...
