Search Sciweavers | Sciweavers

» Crawling the Hidden Web

KDD
2008
ACM

183 views, Data Mining

De-duping URLs via rewrite rules

16 years 4 months ago

Download research.yahoo.com

A large fraction of the URLs on the web contain duplicate (or near-duplicate) content. De-duping URLs is an extremely important problem for search engines, since all the principal...

Anirban Dasgupta, Ravi Kumar, Amit Sasturkar

ISW
2009
Springer

106 views, Information Technology

Automated Spyware Collection and Analysis

15 years 10 months ago

Download www.cs.ucsb.edu

Various online studies on the prevalence of spyware attest overwhelming numbers (up to 80%) of infected home computers. However, the term spyware is ambiguous and can refer to anyt...

Andreas Stamminger, Christopher Kruegel, Giovanni ...

SIGIR
2005
ACM

150 views, Information Technology

Server selection methods in hybrid portal search

15 years 9 months ago

Download es.csiro.au

The TREC .GOV collection makes a valuable web testbed for distributed information retrieval methods because it is naturally partitioned and includes 725 web-oriented queries with ...

David Hawking, Paul Thomas

WWW
2008
ACM

163 views, Internet Technology

Efficiently finding web services using a clustering semantic approach

16 years 4 months ago

Download www.cs.adelaide.edu.au

Efficiently finding Web services on the Web is a challenging issue in service-oriented computing. Currently, UDDI is a standard for publishing and discovery of Web services, and U...

Jiangang Ma, Yanchun Zhang, Jing He

WWW
2001
ACM

150 views, Internet Technology

Effective Web data extraction with standard XML technologies

16 years 4 months ago

Download www10.org

We discuss the problem of Web data extraction and describe an XML-based methodology whose goal extends far beyond simple "screen scraping." An ideal data extraction proc...

Jussi Myllymaki
