Sciweavers

472 search results - page 60 / 95
» Crawling the Hidden Web
Sort
View
AAAI
2007
13 years 10 months ago
Mining Web Query Hierarchies from Clickthrough Data
In this paper, we propose to mine query hierarchies from clickthrough data, which is within the larger area of automatic acquisition of knowledge from the Web. When a user submits...
Dou Shen, Min Qin, Weizhu Chen, Qiang Yang, Zheng ...
CIDR
2009
129views Algorithms» more  CIDR 2009»
13 years 8 months ago
Extracting and Querying a Comprehensive Web Database
Recent research in domain-independent information extraction holds the promise of an automatically-constructed structured database derived from the Web. A query system based on th...
Michael J. Cafarella
SIGIR
2008
ACM
13 years 7 months ago
SpotSigs: robust and efficient near duplicate detection in large web collections
Motivated by our work with political scientists who need to manually analyze large Web archives of news sites, we present SpotSigs, a new algorithm for extracting and matching sig...
Martin Theobald, Jonathan Siddharth, Andreas Paepc...
WWW
2007
ACM
14 years 8 months ago
Combining classifiers to identify online databases
We address the problem of identifying the domain of online databases. More precisely, given a set F of Web forms automatically gathered by a focused crawler and an online database...
Luciano Barbosa, Juliana Freire
ICIP
2000
IEEE
14 years 9 months ago
Efficient Video Similarity Measurement and Search
We consider the use of meta-data and/or video-domain methods to detect similar videos on the web. Meta-data is extracted from the textual and hyperlink information associated with...
Sen-Ching S. Cheung, Avideh Zakhor