Sciweavers

472 search results - page 46 / 95
» Crawling the Hidden Web
Sort
View
WWW
2005
ACM
14 years 8 months ago
The infocious web search engine: improving web searching through linguistic analysis
In this paper we present the Infocious Web search engine [23]. Our goal in creating Infocious is to improve the way people find information on the Web by resolving ambiguities pre...
Alexandros Ntoulas, Gerald Chao, Junghoo Cho
PVLDB
2008
141views more  PVLDB 2008»
13 years 7 months ago
WebTables: exploring the power of tables on the web
The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...
Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...
JCDL
2006
ACM
128views Education» more  JCDL 2006»
14 years 1 months ago
Building a research library for the history of the web
This paper describes the building of a research library for studying the Web, especially research on how the structure and content of the Web change over time. The library is part...
William Y. Arms, Selcuk Aya, Pavel Dmitriev, Blaze...
MM
2004
ACM
112views Multimedia» more  MM 2004»
14 years 1 months ago
Multi-model similarity propagation and its application for web image retrieval
In this paper, we propose an iterative similarity propagation approach to explore the inter-relationships between Web images and their textual annotations for image retrieval. By ...
Xin-Jing Wang, Wei-Ying Ma, Gui-Rong Xue, Xing Li
LAWEB
2003
IEEE
14 years 29 days ago
On the Evolution of Clusters of Near-Duplicate Web Pages
This paper expands on a 1997 study of the amount and distribution of near-duplicate pages on the World Wide Web. We downloaded a set of 150 million web pages on a weekly basis ove...
Dennis Fetterly, Mark Manasse, Marc Najork