Sciweavers

» An Approach to Identify Duplicated Web Pages
CIKM
2009
Springer
13 years 8 months ago
Improving search engines using human computation games
Work on evaluating and improving the relevance of web search engines typically uses human relevance judgments or clickthrough data. Both these methods look at the problem of learni...
Hao Ma, Raman Chandrasekar, Chris Quirk, Abhishek ...
WIDM
2003
ACM
14 years 29 days ago
Schema-guided wrapper maintenance for web-data extraction
Extracting data from Web pages using wrappers is a fundamental problem arising in a large variety of applications of vast practical interests. There are two main issues relevant t...
Xiaofeng Meng, Dongdong Hu, Chen Li
NIPS
2000
13 years 9 months ago
The Missing Link - A Probabilistic Model of Document Content and Hypertext Connectivity
We describe a joint probabilistic model for modeling the contents and inter-connectivity of document collections such as sets of web pages or research paper archives. The model is...
David A. Cohn, Thomas Hofmann
WWW
2003
ACM
14 years 8 months ago
Monitoring the dynamic web to respond to continuous queries
Continuous queries are queries for which responses given to users must be continuously updated, as the sources of interest get updated. Such queries occur, for instance, during on...
Sandeep Pandey, Krithi Ramamritham, Soumen Chakrab...
WEBDB
2010
Springer
14 years 23 days ago
Using Latent-Structure to Detect Objects on the Web
An important requirement for emerging applications which aim to locate and integrate content distributed over the Web is to identify pages that are relevant for a given domain or ...
Luciano Barbosa, Juliana Freire