Search Sciweavers | Sciweavers

WWW
2008
ACM

103views Internet Technology» more WWW 2008»

Low-load server crawler: design and evaluation

16 years 7 months ago

Download www2008.org

This paper proposes a method of crawling Web servers connected to the Internet without imposing a high processing load. We are using the crawler for a field survey of the digital ...

Katsuko T. Nakahira, Tetsuya Hoshino, Yoshiki Mika...

WWW
2007
ACM

126views Internet Technology» more WWW 2007»

Crawling multiple UDDI business registries

16 years 7 months ago

Download www2007.org

As Web services proliferate, the size and magnitude of UDDI Business Registries (UBRs) are likely to increase. The ability to discover Web services of interest across multiple UB...

Eyhab Al-Masri, Qusay H. Mahmoud

WWW
2006
ACM

139views Internet Technology» more WWW 2006»

Do not crawl in the DUST: different URLs with similar text

16 years 28 days ago

Download www2007.org

We consider the problem of DUST: Different URLs with Similar Text. Such duplicate URLs are prevalent in web sites, as web server software often uses aliases and redirections, and...

Uri Schonfeld, Ziv Bar-Yossef, Idit Keidar

SIGIR
2003
ACM

159views Information Technology» more SIGIR 2003»

Apoidea: A Decentralized Peer-to-Peer Architecture for Crawling the World Wide Web

16 years 7 days ago

Download www.aameeksingh.com

This paper describes a decentralized peer-to-peer model for building a Web crawler. Most of the current systems use a centralized client-server model, in which the crawl is done by...

Aameek Singh, Mudhakar Srivatsa, Ling Liu, Todd Mi...

SIGIR
2008
ACM

116views Information Technology» more SIGIR 2008»

Exploring traversal strategy for web forum crawling

15 years 7 months ago

Download research.microsoft.com

In this paper, we study the problem of Web forum crawling. Web forums have now become an important data source for many Web applications, while forum crawling is still a challenging ...

Yida Wang, Jiang-Ming Yang, Wei Lai, Rui Cai, Lei ...
