Search Sciweavers | Sciweavers

WWW
2005
ACM

200 views, Internet Technology

The Infocious web search engine: improving web searching through linguistic analysis

16 years 4 months ago

In this paper we present the Infocious Web search engine [23]. Our goal in creating Infocious is to improve the way people find information on the Web by resolving ambiguities pre...

Alexandros Ntoulas, Gerald Chao, Junghoo Cho

PVLDB
2008

141 views

WebTables: exploring the power of tables on the web

15 years 3 months ago

The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...

Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...

JCDL
2006
ACM

128 views, Education

Building a research library for the history of the web

15 years 9 months ago

This paper describes the building of a research library for studying the Web, especially research on how the structure and content of the Web change over time. The library is part...

William Y. Arms, Selcuk Aya, Pavel Dmitriev, Blaze...

MM
2004
ACM

112 views, Multimedia

Multi-model similarity propagation and its application for web image retrieval

15 years 9 months ago

In this paper, we propose an iterative similarity propagation approach to explore the inter-relationships between Web images and their textual annotations for image retrieval. By ...

Xin-Jing Wang, Wei-Ying Ma, Gui-Rong Xue, Xing Li

LAWEB
2003
IEEE

96 views, Internet Technology

On the Evolution of Clusters of Near-Duplicate Web Pages

15 years 8 months ago

This paper expands on a 1997 study of the amount and distribution of near-duplicate pages on the World Wide Web. We downloaded a set of 150 million web pages on a weekly basis ove...

Dennis Fetterly, Mark Manasse, Marc Najork
